Toby Ord on the perils of maximising the good that you do

By Robert Wiblin and Keiran Harris · Published September 8th, 2023

One thing that you can say in general with moral philosophy is that the more extreme theories, which are less in keeping with all of our current moral beliefs, are also less likely to encode the prejudices of our times. We say in the philosophy business that they’ve got more “reformative power” … But that comes with the risk that we will end up doing things that are … intuitively bad or wrong, and that they might actually be bad or wrong. So it’s a double-edged sword… and one would have to be very careful when following theories like that.
Toby Ord

Effective altruism is associated with the slogan “do the most good.” On one level, this has to be unobjectionable: What could be bad about helping people more and more?

But in today’s interview, Toby Ord — moral philosopher at the University of Oxford and one of the founding figures of effective altruism — lays out three reasons to be cautious about the idea of maximising the good that you do. He suggests that rather than “doing the most good that we can,” perhaps we should be happy with a more modest and manageable goal: “doing most of the good that we can.”

Toby was inspired to revisit these ideas by the possibility that Sam Bankman-Fried, who stands accused of committing severe fraud as CEO of the cryptocurrency exchange FTX, was motivated to break the law by a desire to give away as much money as possible to worthy causes.

Toby’s top reason not to fully maximise is the following: if the goal you’re aiming at is subtly wrong or incomplete, then going all the way towards maximising it will usually cause you to start doing some very harmful things.

This result can be shown mathematically, but can also be made intuitive, and may explain why we feel instinctively wary of going “all-in” on any idea, or goal, or way of living — even something as benign as helping other people as much as possible.

Toby gives the example of someone pursuing a career as a professional swimmer. Initially, as our swimmer takes their training and performance more seriously, they adjust their diet, hire a better trainer, and pay more attention to their technique. While swimming is the main focus of their life, they feel fit and healthy and still enjoy other aspects of their life: family, friends, and personal projects.

But if they decide to increase their commitment further and really go all-in on their swimming career, holding nothing back, then this picture can radically change. Their effort was already substantial, so how can they shave those final few seconds off their racing time? The only remaining options are those which were so costly they were loath to consider them before.

To eke out those final gains — and go from 80% effort to 100% — our swimmer must sacrifice other hobbies, deprioritise their relationships, neglect their career, ignore food preferences, accept a higher risk of injury, and maybe even consider using steroids.

Now, if maximising one’s speed at swimming really were the only goal they ought to be pursuing, there’d be no problem with this. But if it’s the wrong goal, or only one of many things they should be aiming for, then the outcome is disastrous. In going from 80% to 100% effort, their swimming speed was only increased by a tiny amount, while everything else they were accomplishing dropped off a cliff.

The bottom line is simple: a dash of moderation makes you much more robust to uncertainty and error.

As Toby notes, this is similar to the observation that a sufficiently capable superintelligent AI, given any one goal, would ruin the world if it maximised it to the exclusion of everything else. And it follows a similar pattern to performance falling off a cliff when a statistical model is ‘overfit’ to its data.

In the full interview, Toby also explains the “moral trade” argument against pursuing narrow goals at the expense of everything else, and how consequentialism changes if you judge not just outcomes or acts, but everything according to its impacts on the world.

Toby and Rob also discuss:

The rise and fall of FTX and some of its impacts
What Toby hoped effective altruism would and wouldn’t become when he helped to get it off the ground
What utilitarianism has going for it, and what’s wrong with it in Toby’s view
How to mathematically model the importance of personal integrity
Which AI labs Toby thinks have been acting more responsibly than others
How having a young child affects Toby’s feelings about AI risk
Whether infinities present a fundamental problem for any theory of ethics that aspires to be fully impartial
How Toby ended up being the source of the highest quality images of the Earth from space

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript.

Producer and editor: Keiran Harris
Audio Engineering Lead: Ben Cordell
Technical editing: Simon Monsour
Transcriptions: Katy Moore

Highlights

Maximisation is perilous

Rob Wiblin: Yeah. So what goes wrong when you try to go from doing most of the good that you can, to trying to do the absolute maximum?
Toby Ord: Here’s how I think of it. Even on, let’s say, utilitarianism, if you try to do that, you generally get diminishing returns. So you could imagine trying to ramp up the amount of optimising that you’re doing from 0% to 100%. And as you do so, the value that you can create starts going up pretty steeply at the start, but then it starts tapering off as you’ve used up a lot of the best opportunities, and there’s fewer things that you’re actually able to bring to bear in order to help improve the situation. As you get towards the end, you’ve already used up the good opportunities.
But then it gets even worse when you consider other moral theories — if you’ve got moral uncertainty, as I think you should — and you also have some credence that maybe there are some other things that fundamentally matter apart from happiness or whatever theory that you like most says. There are these tradeoffs as you optimise for the main thing; there can be these tradeoffs to these other components that get steeper and steeper as you get further along.
So maybe, suppose as well as happiness, it also matters how much you achieve in your life or something like that. Then it may be that many of the ways that you can improve happiness, let’s say in this case, involve achievements — perhaps achievements in terms of charity, and achievements in terms of going out in the world and accomplishing stuff. But as you get further, you can start to get these tradeoffs between the two, and it can be the case for this other thing that it starts going down. If instead we were comparing happiness first and then freedom, maybe the ways that you could create the most happiness involve, when you try to crank up that optimisation right to 100%, just giving up everything else if need be. So maybe there could be massive sacrifices in terms of freedom or other things right at the end there.
And perhaps a real-world example to make that concrete is if you think about, say, trying to become a good athlete: maybe you’ve taken up running, and you want to get faster and faster times, and achieve well in that. As you start doing more running, your fitness goes up and you’re also feeling pretty good about it. You’ve got a new exciting mission in your life. You can see your times going down, and it makes you happy and excited. And so a lot of metrics are going up at the start, but then if you keep pushing it and you make running faster times the only thing you care about, and you’re willing to give up anything in order to get that faster time, then you may well get the absolute optimum. Of all the lives that you could live, if you only care about the life that has the best running time, it may be that you end up making massive sacrifices in relationships, or career, or in that case, helping people.
So you can see that it’s a generic concept. I think that the reason it comes up is that we’ve got all of these different opportunities for improving this metric that we care about, and we sort them in some kind of order from the ones that give you the biggest bang for their buck through to the ones that give you the least. And in doing so, at the end of that list, there are some ones that just give you a very marginal benefit but absolutely trash a whole lot of other metrics of your life. So if you’re only tracking that one thing, if you go all the way to those very final options, while it does make your primary metric go up, it can make these other ones that you weren’t tracking go down steeply.
Rob Wiblin: Yeah, so the basic idea is if there’s multiple different things that you care about… So we’ll talk about happiness in life versus everything else that you care about — having good relationships, achieving things, helping others, say. Early on, when you think, “How can I be happier?,” you take the low-hanging fruit: you do things that make you happier in some sensible way that don’t come at massive cost to the rest of your life. And why is it that when you go from trying to achieve 90% of the happiness that you could possibly have to 100%, it comes at this massive cost to everything else? It’s because those are the things that you were most loath to do: to just give up your job and start taking heroin all the time. That was extremely unappealing, and you wouldn’t do it unless you were absolutely only focused on happiness, because you’re giving up such an incredible amount.
Toby Ord: Exactly. And this is closely related to the problem with targets in government, where you pick a couple of things, like hospital waiting times, and you target that. And at first, the target does a pretty good job. But when you’re really just sacrificing everything else, such as quality of care, in order to get those people through the waiting room as quickly as possible, then actually you’re shooting yourself in the foot with this target.
And the same kind of issue is one of the arguments for risk from AI, if we try to include a lot of things into what the AI would want to optimise. And maybe we hope we’ve got everything that matters in there. We better be right, because if we’re not, and there’s something that mattered that we left out, or that we’ve got the balance between those things wrong, then as it completely optimises, things could move from, “The system’s working well; everything’s getting better and better” to “Things have gone catastrophically badly.”
I think Holden Karnofsky used this term “maximization is perilous.” I like that. I think that captures both what’s one of these big problems if you have an AI agent that is maximising something, and if you have a human agent — perhaps a friend or you yourself — who is just maximising one thing. Whereas if you just ease off a little bit on the maximising, then you’ve got a strategy that’s much more robust.
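
The pattern described in this section can be illustrated with a toy model. What follows is a minimal Python sketch and not something presented in the interview: the list of opportunities, every number in it, and the assumption that overall value is simply the tracked metric plus the effect on everything else are all invented for illustration.

```python
# Hypothetical opportunities: (gain to the tracked metric, effect on everything else).
# All numbers are invented for illustration.
OPPORTUNITIES = [
    (10.0, 2.0),    # big gain that also helps the rest of life
    (6.0, 1.0),
    (3.0, 0.0),
    (1.0, -2.0),
    (0.5, -15.0),   # last resort: tiny gain, huge cost elsewhere
]


def outcomes_by_effort(opportunities):
    """Take opportunities in order of gain to the tracked metric.

    Returns a list of (fraction of opportunities taken, tracked metric,
    overall value), where overall value is assumed to be the tracked
    metric plus the cumulative effect on everything else.
    """
    ranked = sorted(opportunities, key=lambda o: o[0], reverse=True)
    primary, other = 0.0, 0.0
    rows = []
    for taken, (gain, side_effect) in enumerate(ranked, start=1):
        primary += gain
        other += side_effect
        rows.append((taken / len(ranked), primary, primary + other))
    return rows


if __name__ == "__main__":
    for effort, primary, total in outcomes_by_effort(OPPORTUNITIES):
        print(f"effort {effort:.0%}: tracked metric={primary:.1f}, overall={total:.1f}")
```

With these made-up numbers, the tracked metric keeps rising all the way to 100% effort, while overall value peaks partway along and then falls sharply once the final, most costly options are taken.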

How moral uncertainty protects against the perils of utilitarianism

Toby Ord: One thing that you can say in general with moral philosophy is that the more extreme theories — which are, say, less in keeping with all of our current moral beliefs — are also less likely to encode the prejudices of our times. So what we say in the philosophy business is that they’ve got more “reformative power”: they’ve got more ability to actually take us somewhere new and better than where we currently are. Like if we’ve currently got moral blinkers on, and there’s some group who we’re not paying proper attention to and their plight, then a theory with reformative power might be able to help us actually make moral progress. But it comes with the risk that by having more clashes with our intuitions, we will end up perhaps doing things that are more often intuitively bad or wrong — and that they might actually be bad or wrong. So it’s a double-edged sword in this area, and one would have to be very careful when following theories like that.
Rob Wiblin: Yeah. So you say in your talk that for these reasons, among others, you couldn’t embrace utilitarianism, but you nonetheless thought that there were some valuable parts of it. Basically, there are some parts of utilitarianism that are appealing and good, and other parts about which you are extremely wary. And I guess in your vision, effective altruism was meant to take the good and leave the bad, more or less. Can you explain that?
Toby Ord: Yeah, I certainly wouldn’t call myself a utilitarian, and I don’t think that I am. But I think there’s a lot to admire in it as a moral theory. And I think that a bunch of utilitarians, such as John Stuart Mill and Jeremy Bentham, had a lot of great ideas that really helped move society forwards. But in part of my studies — in fact, what I did after all of this — was to start looking at something called moral uncertainty, where you take seriously that we don’t know which of these moral theories, if any, is the right way to act.
And that in some of these cases, if you’ve got a bit of doubt about it… you know, it might tell you to do something: a classic example is if it tells the surgeon to kill one patient in order to transplant their organs into five other patients. In practice, the utilitarians tend to argue that actually the negative consequences of doing that would actually make it not worth doing. But in any event, let’s suppose there was some situation like that, where it suggested that you do it and you couldn’t see a good reason not to. If you’re wrong about utilitarianism, then you’re probably doing something really badly wrong. Or another example would be, say, killing a million people to save a million and one people. Utilitarianism might say, well, it’s just plus one. That’s just like saving a life. Whereas every other theory would say this is absolutely terrible.
The idea with moral uncertainty is that you hedge against that, and in some manner — up for debate as to how you do it — you consider a bunch of different moral theories or moral principles, and then you think about how convinced you are by each of them, and then you try to look at how they each apply to the situation at hand and work out some kind of best compromise between them. And the simplest view is just pick the theory that you’ve got the highest credence in and just do whatever it says. But most people who’ve thought about this don’t endorse that, and they think you’ve got to do something more complicated where you have to, in some ways, mix them together in the case at hand.
And so while I think that there is a lot going for utilitarianism, I think that on some of these most unintuitive cases, they’re the cases where I trust it least, and they’re also the cases where I think that the other theories that I have some confidence in would say that it’s going deeply wrong. And so I would actually just never be tempted in doing those things.
It’s interesting, actually. Before I thought about moral uncertainty, I thought, if I think utilitarianism is a pretty good theory, even if I feel like I shouldn’t do those things, my theory is telling me I have to. Something along those lines, and there’s this weird conflict. Whereas it’s actually quite a relief to have this additional humility of, well, hang on a second, I don’t know which theory is right. No one does. And so if the theory would tell you to really go out on a limb and do something that could well be terrible, actually, a more sober analysis suggests don’t do that.
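
The contrast between picking the single most credible theory and mixing theories by credence can be sketched in a few lines of Python. This is an illustrative assumption, not Toby's own method: the credences, the scores, and the idea that the theories' verdicts can be placed on one common scale are all hypothetical, and how to compare theories in this way is itself contested. The credence-weighted rule shown is one candidate approach, sometimes called maximising expected choiceworthiness.

```python
# Hypothetical credences in each moral theory (they sum to 1).
CREDENCES = {"utilitarian": 0.5, "deontological": 0.3, "virtue": 0.2}

# Hypothetical scores: how good each theory rates each option,
# on an assumed common scale. None of these numbers are from the interview.
SCORES = {
    "harvest organs": {"utilitarian": 4.0, "deontological": -100.0, "virtue": -50.0},
    "do not harvest": {"utilitarian": 0.0, "deontological": 0.0, "virtue": 0.0},
}


def my_favourite_theory(credences, scores):
    """Follow only the theory with the highest credence."""
    top = max(credences, key=credences.get)
    return max(scores, key=lambda option: scores[option][top])


def credence_weighted(credences, scores):
    """Pick the option with the best credence-weighted score across theories."""
    def expected(option):
        return sum(credences[t] * scores[option][t] for t in credences)
    return max(scores, key=expected)


if __name__ == "__main__":
    print("favourite theory says:", my_favourite_theory(CREDENCES, SCORES))
    print("credence-weighted says:", credence_weighted(CREDENCES, SCORES))
```

In this toy setup the favoured theory narrowly recommends the drastic act, but because the other theories rate it as deeply wrong, the credence-weighted rule recommends against it.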

The right decision process for doing the most good

Toby Ord: Many people think that utilitarianism tells us, when we’re making decisions, to sit there and calculate, for each of the possible options available to you, how much happiness it’s going to create — and then to pick the one that leads to the best outcome. Now, if you haven’t encountered this before, you may think that’s exactly what I said earlier that utilitarianism is, but I hope I didn’t make this mistake back then, and I think I probably got it right.
So naive utilitarianism is treating the standard of what leads to the best happiness as a decision procedure: it’s saying that the way we should make our decisions is in virtue of that. Whereas what utilitarianism says is that it’s a criterion of rightness for different actions — so it’s kind of the gold standard, the ultimate arbiter of whether you did act rightly or wrongly — but it may be that in attempting to do it, you systematically fail.
And this can be made clear through something called the “paradox of hedonism”: where, even just in your own life, suppose you think that having more happiness makes your life go better, and so you’re always trying to have more happiness. And so every day when you get up you’re like, “What would make me happy today?” And then you think, “Which of these breakfast cereals would make me happiest?” And then you’re having it and you’re like, “Would chewing it slower make me happier?” And so on. Well, you’re probably going to end up with less happiness than if you were just doing things a bit more normally. And it’s not really a paradox; it’s just that constantly thinking about some particular standard is not always the best way to achieve it.
And that was known to the early utilitarians. In fact, they wrote about this quite eloquently. They suggested that there could be other decision procedures which are better ways of making our decisions. So it could be that even on utilitarian standards, more happiness would be created if we made our decisions in some other way. Perhaps if we are trying this naive approach of always calculating what would be best, our biases will creep in, and so we’ll tend to distribute benefits to people like us instead of to those perhaps who actually would need it more. Indeed, there is a lot of opportunity for that, including your self-serving biases. You might think, “Actually, that nice thing that my friend has would create more happiness if I had it, and so I’m just going to swipe it on the way out the door.”
The concern is that actually there is quite a lot of this self-regarding and in-group bias with people, and so if they were all trying to directly apply this criterion and to treat it as a decision procedure, they probably would do worse than they would do under some other methods. And for a thoroughgoing utilitarian, the best decision procedure is whichever one would lead to the most happiness. If that turns out to be to make my decisions like a Kantian would, if that really would lead to more of what I value, then fine, I don’t have a problem with it.
And so one thing that’s quite interesting is that utilitarianism, in some sense, is in less conflict than people might think with other moral theories, because the other moral theories are normally trying to provide a way of making the decisions. Whereas utilitarianism is potentially open to agreeing with them about their way of making decisions, if that could be grounded in the idea that it produces more happiness.

Moral trade

Toby Ord: Suppose you’re completely certain, and you think only happiness matters. So you’re not worried about the moral uncertainty case, and you’re not worried about this idea that other things might go down in that last 1% of optimisation, because you think this really is the only thing that matters.
Well, at least if you’re interested in effective altruism, then you’re part of a movement that involves people who care about other things, and you’re trying to work with them towards helping the world. And so this last bit of optimisation that you’re doing would be very uncooperative with the other people who are part of that movement.
So this can be connected to a broader idea that I’ve written about called moral trade, where the idea is that, just as people often exchange goods or services in order to make both of them better off — this is the idea that Adam Smith talked about: if you pay the baker for some bread, you’re making this exchange because you both think that you’re better off with the thing the other person had — you could do that not just about your self-interested preferences, but with your moral preferences. And in fact, the theory of trade works equally well in that context.
For example, suppose there were two friends, one of whom used to be a vegetarian but had stopped doing it because maybe they got disillusioned with some of the arguments about it. But they’d kind of gone off meat to some degree anyway, and so it wouldn’t be too much of a burden if they went back to being a vegetarian. That person cares a lot about global poverty, and their friend cares about factory farming and vegetarianism. Well, they could potentially make a deal and say, “If you go back to being a vegetarian, I will donate to this charity that you keep telling me about.” They might each not be quite willing to do that on their own moral views, but to think that if the other person changed their behaviour as well, that the world really would be better off.
And you can even get cases where they’ve got diametrically opposed views. Perhaps there’s some big issue — such as abortion or gun rights or something — where people have diametrically opposed positions, and there are charities which are diametrically opposed. And they’re both thinking of donating to a pair of charities which are opposed with each other. And then maybe they catch up for dinner and notice that this is going to happen. And they say, “Hang on a second. How about if instead of both donating $1,000 to this thing, we instead donate our $2,000 to a charity that, while not as high on our list of priorities for charities, is one that we actually both care about? And then instead of these effects basically cancelling out, we’ll be able to produce good in the world.”
So that’s the general idea of moral trade. And you can see why the moral trade would be a good thing if it’s the case that even though people have different ideas about what’s right, and these ideas can’t all be correct, if they’re generally, more often than not, pointing in a similar direction or something — such that when we better satisfy the overall moral preferences of the people in the world — I think we’ve got some reason to expect the world to be getting better in that process. In which case, moral trade would be a good thing. And it’s an idea that can also lead to that kind of behaviour where you don’t do that last little bit of maximising.
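
The dinner example with the two opposed charities can be worked through as simple arithmetic. The Python sketch below is a hypothetical illustration: the names ALICE and BOB, the charity labels, and the value-per-dollar figures each person assigns are assumptions invented here, with only the $1,000 and $2,000 amounts taken from the example.

```python
def value_by_view(donations, valuations):
    """Total value of a set of donations according to one person's moral view."""
    return sum(amount * valuations[charity] for charity, amount in donations.items())


# Hypothetical value per dollar that each friend assigns to each charity.
# Each sees the other's preferred charity as exactly undoing their own.
ALICE = {"pro": 1.0, "anti": -1.0, "shared": 0.4}
BOB = {"pro": -1.0, "anti": 1.0, "shared": 0.4}

# Without a trade, each gives $1,000 to their own side.
NO_TRADE = {"pro": 1000, "anti": 1000}
# With a trade, the combined $2,000 goes to a charity both care about.
TRADE = {"shared": 2000}

if __name__ == "__main__":
    for name, view in (("Alice", ALICE), ("Bob", BOB)):
        print(name, "no trade:", value_by_view(NO_TRADE, view),
              "trade:", value_by_view(TRADE, view))
```

Under these assumed valuations, the untraded donations cancel out by each person's own lights, while the traded donation is a gain according to both views, even though the shared charity ranks lower for each of them.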

The value of personal character and integrity

Toby Ord: So the reason that effective altruism focuses so much on impact and doing good — for example, through donation — is that we’re aware that there’s this extremely wide variation in different ways of doing good, whether that be perhaps the good that’s done by different careers or how much good is done by donating $1,000 to different charities.
And it’s not as clear that one can get these kinds of improvements in terms of character. So if you imagine an undergraduate, just finishing their degree, about to go off and start a career. If you do get them to give 10 times more than the average person, and to give it 10 times more effectively, they may be able to do 100 times as much good with their giving, and that may be more value than they produce in all other aspects of their life. But if you told them to be a really good character in their life, and that was the only advice, and you didn’t change their career or anything else, it’s not clear that you could get them to produce outcomes like that. And then there’s a question about how much goodness does the virtue create or something, but it doesn’t seem like it comes from the same kind of distribution. It’s unlikely that there’s a version of me out there with some table calculating log-normal distributions of virtue or something like that.
And I think that’s right. But how I think about it is that, ultimately, in terms of the impact we end up having in the world, you could think of virtue as being a multiplier — not by some number between 1 and 10,000 or something with this huge variation, but maybe as a number between -1 and +1 or something like that, or maybe most of the values in that range. Maybe if you’re really, really virtuous, you’re a 3 or something.
But the fact that there is this negative bit is really relevant: it’s very much possible to actually just produce bad outcomes. Clearly, Sam Bankman-Fried seems to be an example of this. And if you’ve scaled up your impact, then you could end up with massive negative effects through having a bad character. Maybe by taking too many risks, maybe by trampling over people on your way to trying to achieve those good outcomes, or various other aspects.
Rob Wiblin: Yeah. So the point here is that even though virtue in practice doesn’t seem to vary in these enormous ways — in the same way that, say, the cost effectiveness of different health treatments might, or some problems being far more important or neglected than others — all of the other stuff that you do ends up getting multiplied by this number between -1 and 1, which represents the kind of character that you have, and therefore the sort of effects that you have on the project that you’re a part of and the people around you.
And maybe we’ll say a typical level of virtue might be 0.3 or 0.4, but some meaningful fraction of people have a kind of character that’s below zero. Which means that usually, when those people get involved in a project, they’re actually causing harm, even though people might not appreciate it, because they’re just inclined to act like jerks, or they lie too much, or when push comes to shove they’re just going to do something disgraceful that basically sets back their entire enterprise. Or there might be various other mechanisms as well. And then obviously it’s very clear that going from -0.2 to 0.2 is extremely important, because it determines whether you have a positive or negative impact at all.
Toby Ord: Yeah. And another way to see some of that is when you’re scaling up on the raw impact. For example, suppose you’ve noticed that when founders set up their companies, some of these companies end up making a million dollars for the founders, some make a billion dollars: 1,000 times as much. This is one of these heavy-tailed distributions. And then if you’ve got a person with bad character, the amount of damage they could do with a billion-dollar company is like 1,000 times higher, just as the amount of good they could do with it is 1,000 times higher.
So it’s especially important if someone is going to go and try to just do generically high-impact things that they have a positive sign on that overall equation and not a negative one. Another way to look at that is when you have something like earning to give: because there’s an intermediate step where it turns into dollars (and dollars are kind of morally neutral depending on what you do with them, or at least morally ambiguous, as opposed to it directly helping people), there’s more need to vet those people for having a good character before joining their project or something like that.
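
The multiplier picture in this section can be written down directly. The Python sketch below is a hedged illustration under the assumption that net impact is just raw scale times a character factor. The function name net_impact and the specific character values are hypothetical, while the 10 times 10 giving example and the 1,000-fold difference between founders come from the discussion above.

```python
def net_impact(raw_scale, character):
    """Raw scale of impact multiplied by a character factor.

    The character factor is assumed to lie mostly between -1 and +1,
    so it mainly sets the sign and only modestly changes the size.
    """
    return raw_scale * character


# Giving 10 times more, 10 times more effectively: 100 times the raw good.
GIVING_SCALE = 10 * 10

# A billion-dollar company versus a million-dollar one: 1,000 times the scale.
SMALL_COMPANY, LARGE_COMPANY = 1, 1000

if __name__ == "__main__":
    for character in (-0.2, 0.3, 1.0):  # hypothetical character factors
        print(f"character {character:+.1f}: "
              f"giving={net_impact(GIVING_SCALE, character):+.1f}, "
              f"small company={net_impact(SMALL_COMPANY, character):+.1f}, "
              f"large company={net_impact(LARGE_COMPANY, character):+.1f}")
```

The point the sketch makes is that scaling up raw impact magnifies whatever sign the character factor has: a slightly negative factor at 1,000 times the scale does 1,000 times the harm.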

How Toby released some of the highest quality photos of the Earth

Toby Ord: I’d been looking at some beautiful pictures of Saturn by the Cassini spacecraft. Amazing. Just incredible, awe-inspiring photographs. And I thought, this is great. And just as I’d finished my collection of them and we had a slideshow, I thought, I’ve got to go and find a whole lot of the best pictures of the Earth. The equivalent, right? Like fill a folder with amazing pictures of the Earth.
And the pictures I found were nowhere near as good. Often much lower resolution, but also often JPEG-y with compression artefacts or burnt-out highlights where you couldn’t see any details in the bright areas. All kinds of problems. The colours were off. And I thought, this is crazy. And the more I looked into it, I got a bit obsessed in my evenings downloading these pictures of the Earth from space.
I eventually had a pretty good idea of all of the photographs that have been taken of the Earth from space, and it turns out that there aren’t that many spacecraft that have taken good photos. Very few, actually.
If you think about a portrait of a human, the best distance to take a photo of someone is from a couple of metres away. Maybe one metre away would be OK, but any closer than that, they look distorted. And if you go much farther, then you won’t get a good photo; they’ll be too small in the shot. But the equivalent is partway from the Earth to the moon. Low Earth orbit, where the International Space Station is, is too close in: it’s the equivalent to being about a centimetre away from someone’s face. And the moon is a bit too far out, although you can get an OK photograph.
And so it turned out that it was mainly the Apollo programme, where they sent humans with extremely good cameras, with these Hasselblads, up into space, and they trained them in photography. Their photos are just way better than anything else that’s been done, and it’s just this very short period, a small number of years. And I ended up going through all — more than 15,000 — photographs from the Apollo programme and finding the best ones of the Earth from space.
And then I found that there were these archives where people had scanned the negatives, and even then some of the scans were messed up. Some of them were compressed too badly, some of them had blown-out highlights, some of them were out of focus. And for every one of my favourite images, I went and found the very best version that’s been scanned.
And then I found that, surprisingly, using Aperture, a program for fixing up photographs, that I could actually restore them better than had been done before. I was very shocked that all of a sudden my photograph of the blue marble was as good or a little bit better than the one on Wikipedia or the NASA website. And for other photographs that were less well known, I could do much better than had been done before.
And I eventually went through and put in a lot of hours into creating this really nice collection, and made a website for them called Earth Restored, which you can easily find, where you can just go and browse through them all.

Articles, books, and other media discussed in the show

Toby’s work:

The precipice: Existential risk and the future of humanity
Opening session at EA Global Bay Area 2023
Opening session + Fireside chat at EA Global San Francisco 2022 (with Will MacAskill)
The moral imperative towards cost-effectiveness in global health
Global poverty and the demands of morality
Moral uncertainty (with Will MacAskill and Krister Bykvist)
Moral trade
How to be a consequentialist about everything
Beyond action: Applying consequentialism to decision making and motivation (Toby’s PhD thesis)
Earth Restored — Toby’s restored images of Earth from the Apollo mission
See all of Toby’s work on his website

Moral philosophy and effective altruism:

How rich am I? — a tool for comparing global income distribution and the effects of our donations
How much do solutions to social problems differ in their effectiveness? A collection of all the studies we could find. by Benjamin Todd
Famine, affluence and morality by Peter Singer
Consequentialism on the Internet Encyclopedia of Philosophy
Expected value: how can we make a difference when we’re uncertain what’s true? by Benjamin Todd
On what matters by Derek Parfit
Integrity for consequentialists by Paul Christiano
Uneasy virtue by Julia Driver

Developments and challenges in the effective altruism community:

Bankruptcy of FTX — Wikipedia page
How 80,000 Hours has changed some of our advice after the collapse of FTX
Going infinite: The rise and fall of a new tycoon by Michael Lewis (forthcoming book on Sam Bankman-Fried)
EA is about maximization, and maximization is perilous by Holden Karnofsky
EA and the current funding situation by Will MacAskill (May 2022)
Free-spending EA might be a big problem for optics and epistemics by George Rosenfeld on the EA Forum

Developments and challenges in artificial intelligence:

What could an AI-caused existential catastrophe actually look like? by Benjamin Hilton
What failure looks like by Paul Christiano
Statement on AI risk signed by AI scientists and notable figures
Secretary-General’s remarks to the Security Council on Artificial Intelligence
The Elders urge global co-operation to manage risks and share benefits of AI
UK to host first global summit on artificial intelligence
Tech entrepreneur Ian Hogarth to lead UK’s AI Foundation Model Taskforce — and see also Ian Hogarth’s article in the Financial Times: We must slow down the race to God-like AI
Initial £100 million for expert taskforce to help UK build and adopt next generation of safe AI
The book I wish every policymaker would read — an interview with Jennifer Pahlka on The Ezra Klein Show about how policy implementation goes awry in the US government
Introducing Superalignment from OpenAI (also discussed in detail in our podcast episode with Jan Leike)
A conversation with Bing’s chatbot left me deeply unsettled by Kevin Roose
“Meaningful harm” from AI necessary before regulation, says Microsoft exec by Ashley Belanger
Watch as Sydney/Bing threatens me then deletes its message by Seth Lazar
Bing’s chatbot compared an Associated Press journalist to Hitler, and said they were short, ugly, and had bad teeth by Pete Syme
Man dies by suicide after talking with AI chatbot, widow says by Chloe Xiang

Transcript

Table of Contents

1 Cold open [00:00:00]
2 Rob’s intro [00:00:55]
3 The interview begins [00:04:03]
4 The fall of FTX [00:10:59]
5 The history of effective altruism [00:30:54]
6 The pros and cons of utilitarianism [00:39:58]
7 The original vision for effective altruism [00:53:46]
8 The harm that comes from going all-in on one theory [01:07:38]
9 Moral trade [01:21:56]
10 Global consequentialism [01:32:44]
11 Figuring out the value of personal character and integrity [01:53:56]
12 How Toby now feels about effective altruism [02:12:54]
13 Toby’s thoughts on AI progress [02:17:14]
14 Responsibility among AI labs [02:30:33]
15 Problems in infinite ethics [02:50:06]
16 How Toby released some of the highest quality photos of the Earth [02:59:21]
17 Rob’s outro [03:04:47]

Cold open [00:00:00]

Toby Ord: Ultimately, in terms of the impact we end up having in the world, you could think of virtue as being a multiplier — not by some number between 1 and 10,000 or something with this huge variation, but maybe as a number between -1 and +1 or something like that, or maybe most of the values in that range. Maybe if you’re really, really virtuous, you’re a 3 or something.

But the fact that there is this negative bit is really relevant: it’s very much possible to actually just produce bad outcomes. Clearly, Sam Bankman-Fried seems to be an example of this. And if you’ve scaled up your impact, then you could end up with massive negative effects through having a bad character. Maybe by taking too many risks, maybe by trampling over people on your way to trying to achieve those good outcomes, or various other aspects.

Rob’s intro [00:00:55]

Rob Wiblin: Hey listeners, Rob here, head of research at 80,000 Hours.

Interviews with philosopher and mathematician Toby Ord are always super popular, and to be frank, he makes my job as a host really easy, so I’m always looking for an excuse to bring him back on.

And I got one when on YouTube I saw his keynote presentation to an Effective Altruism Global conference that happened back in February.

Toby goes over three little-discussed reasons from philosophy why it’s a bad idea to go all-in on pursuing any particular goal at the expense of everything else. They seem obvious to me in retrospect, but I hadn’t really thought enough about them.

As he explains in the conversation, Toby had looked at this topic before, but was motivated to dig into it and write about it again because of the possibility that Sam Bankman-Fried, who stands accused of committing serious fraud while CEO of the cryptocurrency exchange FTX, was motivated to break the law by his desire to give away as much money as possible to worthy causes.

In the conversation we go over:

The rise and fall of FTX and some of its impacts
What Toby hoped effective altruism would and wouldn’t become when he helped to get it off the ground
What utilitarianism has going for it, and what’s wrong with it in Toby’s view
The over-optimisation argument against being fanatical about any particular goal
The moral trade argument against going all-in on any particular moral theory
How so-called global consequentialism, which Toby happened to write his thesis on, can help explain why even utilitarianism doesn’t recommend doing radical and crazy stuff
And a mathematical model of how personal integrity can be insanely important even though it doesn’t vary nearly as widely as, say, the neglectedness of global problems

Toby has also been visiting the AI lab DeepMind for the better part of a decade, so at the end I get his thoughts on which AI labs he thinks are acting responsibly, how he rates the behaviour of each of the main actors, and how having a young child affects his feelings about AI risk.

I then couldn’t resist getting a quick reaction from Toby to the argument we’ve heard on the show multiple times earlier in the year: that problems in infinite ethics present a fundamental and inescapable problem for any theory of ethics that aspires to be fully impartial.

And finally, we hear how it could possibly be that Toby ended up being the source of the highest quality images of the Earth from space.

All right, the email for the show if you’d like to send us feedback is always [email protected].

And now I bring you Toby Ord.

The interview begins [00:04:03]

Rob Wiblin: Today, I’m speaking with Toby Ord, a mathematician turned moral philosopher at Oxford University. His work focuses on the big-picture questions facing humanity. His early work explored the ethics of global health and global poverty, and this led him to create an international society called Giving What We Can, whose members have pledged now over $3 billion to the most effective charities that they can find.

In those early years of Giving What We Can, from 2007 through 2013, he was perhaps the prime mover in the emergence of effective altruism as an intellectual and social movement — that people understood it by that term, and viewed it as a coherent thing.

In 2020, he published the book The Precipice: Existential Risk and the Future of Humanity, which was very well received and read by policymakers around the world — ultimately being referenced extensively in UN Secretary-General António Guterres’s plan for his second term, titled Our Common Agenda. We talked about that book back when it was being conceived for episode #6: Toby Ord on why the long-term future of humanity matters more than anything else, and then again when it came out for episode #72: Toby Ord on the precipice and humanity’s potential futures.

Over the years, Toby has advised many groups, including the World Health Organization; the World Bank; the World Economic Forum; the US National Intelligence Council; and the UK Prime Minister’s Office, Cabinet Office, and Government Office for Science.

Thanks for coming back on the show, Toby.

Toby Ord: It’s wonderful to be back.

Rob Wiblin: I hope to talk about the importance of personal integrity and what key things we’ve learned about AI since The Precipice came out three years ago. But first, what are you working on at the moment and why do you think it’s important?

Toby Ord: Well, I don’t know if you’ve noticed, but quite a lot of stuff’s been going on with AI recently.

Rob Wiblin: I think we’ve noticed.

Toby Ord: Yeah. Maybe some large fraction of my time is even just keeping up with what’s going on. In the first part of this year, there’d been a lot of action on improvements in AI capabilities, where we saw Bing and then GPT-4 and a bunch of other things, the open source movement in AI really taking off. And then more recently, there’s been real developments in the policy side. The Overton window on what’s acceptable to think about AI progress and the risks that it might bring has really shifted in the last month or so.

So I’ve been doing a lot of thinking about what would be good policy proposals going forward, and how in general the landscape has shifted and how that should change our strategy.

Rob Wiblin: What sort of things would people have looked askance at you for saying a year ago, that now are very much on the table?

Toby Ord: Well, I think the big one is that AI poses an existential risk to humanity. That’s something where there’s still a bunch of debate about it. But a couple of weeks ago, there was a statement signed by the heads of all three of the AGI labs and many other people on their teams, as well as a real who’s who of scientists working on AI and many other people from other walks of life. And the statement was just very short. It was just saying that we believe that AI poses a risk of human extinction and should be a global priority. And so that was a very clear statement.

And actually, not many people noticed, but a couple of days later, The Elders — this group of former heads of state, and leaders of the UN and the World Health Organization, and other things — they put out their own statement, which said that they thought that AI posed an existential threat to humanity. And that was signed by such non-tech-bros as Mary Robinson, former president of Ireland, former president of Oxfam; Ban Ki-moon, former Secretary-General of the United Nations; Gro Harlem Brundtland of the WHO. And then after that, we had the current Secretary-General, who didn’t exactly say it was an existential risk; he said people are suggesting that it is an existential threat to humanity and we need to take that seriously — which is probably about as close as we’ll hear from him on this topic.

So things have really shifted. And then we’ve even had the prime minister here in the UK, Rishi Sunak, meeting with AI leaders and also proposing a summit to talk about national and international governance of AI because of these risks that it could produce.

Rob Wiblin: Has the response across the board exceeded your expectations, or would you wish it had gone even better?

Toby Ord: Well, I mean, it’s hard to get more groups of senior people — both within AI, the scientists and the executives, and then also in terms of prominent world figures — to actually make explicit statements. So that definitely has exceeded expectations. And it’s moved very quickly.

That said, partly because it’s moved so quickly, I think it’s left a bunch of people behind. There are people who are saying, “Wait, what?” The conversation is a bit chaotic. There’s especially a strand of people saying, “Well, they would say that, wouldn’t they?” which particularly puzzles me. I would think that in general, executives and CEOs of organisations wouldn’t say that their product could kill you and your family and that we need to be careful about that and regulate ourselves. And even if it was some kind of seven-dimensional chess, it wouldn’t be at the same time that all of these other world leaders and scientists make the same statement about their potential dangers.

I think that we would have loved it if, when the first evidence started coming out that fossil fuels could cause serious environmental problems or that tobacco could cause serious health problems, the leaders of those companies had said, “OK, this is an issue, and we’re going to need to regulate and deal with it and be regulated.” That’s exactly what we hope that these people would say. So I think that’s a much more sensible take.

Rob Wiblin: Yes. Overall, I’ve been impressed with the number of people who have very quickly cottoned on to what a serious issue this is, and how many different material risks there are that we don’t yet have a good grasp of, and that we haven’t yet taken any measures to mitigate. But there definitely have been a lot of very odd reactions. I think that’s one that definitely stands out as completely baffling: people saying, “Well, of course Sam Altman would say that his product might kill everyone on Earth, because that’s a good way to build hype and raise more money.” It’s hard for me to understand what is going on inside the heads of people who would say something so batty from my point of view.

Toby Ord: Yeah, I think that this will be a brief window where that argument seems sensible to some people.

The fall of FTX [00:10:59]

Rob Wiblin: Yeah. Well, we’re going to come back to AI later in the conversation. But first off, I wanted to talk about effective altruism as a social movement, and some insights from moral philosophy that might be particularly useful for people who are trying to do a lot of good with their career.

So, years ago, you were really involved in the emergence of effective altruism as both a stream of intellectual thought and as a group of people trying to take action on those ideas. I guess you’ve taken more of a backseat on it in recent years, in order to free up time for other priorities like working on artificial intelligence issues.

But you were asked to give the opening address to the main conference of the effective altruism movement called Effective Altruism Global when it was held in the Bay Area back in February, in order to address the extraordinary ups and downs that that group had experienced over the course of 2022. I really loved that talk. I wanted to go through quite a lot of the things that you said in it here today, and maybe dig into a few of them. Basically, you used the events of 2022 to go through a series of important lessons from the philosophy of doing good that actually transcend any specific case and are useful to keep in mind at all times, which were made particularly salient by things that had happened.

But we should do a little bit to set the scene. Can you give a short summary of the ups and downs that you were reacting to in your talk?

Toby Ord: Yeah, I was especially referring to a series of events connected to FTX, this company set up by Sam Bankman-Fried, and his attempt using this vehicle to make a lot of money in order to give it away and thereby do a lot of good — something that he certainly stated that he saw as part of effective altruism and its mission.

Even in the first half of the year though, before all of these scandals broke out, it was a wild ride. The amount of funding going into effective altruism and the types of projects that we care about was extreme. There was an attempt through the associated FTX Future Fund to really scale up the giving very quickly. And I understand the motivation behind that: ultimately, if you do these things slowly, then you’re doing less good than you could have perhaps during a really critical period in humanity’s development. And so getting to scale quickly could really matter.

But it was also a pretty white-knuckle ride, where just every week or month, amounts of money that you hadn’t seen or heard of before were being awarded for the best blog post on something. And all of these prizes and rewards, they caused a number of issues. I mean, they created distortions where I could have probably made more money by switching to becoming a blogger instead of a philosopher in order to try to scoop up some of these awards. I didn’t do that, but some people might have actually moved towards whichever particular area had announced a prize first.

And then there was also this strange problem, where in the early days of effective altruism there was no money in it. If you could get a job in an EA organisation, you would earn substantially less than market rates. And so you knew that the people who were working on these issues with you weren’t in it for the money, because there wasn’t any. Whereas all of a sudden there was so much money that it was starting to become hard to know, if you met new people in the area, what had brought them in, and whether they had the same moral motivations or not. So there were a bunch of these problems.

Another one was politics: all of a sudden, there was a whole lot of money being invested in politics and associated with effective altruism. I thought that was a terrible idea. We’d tried so hard with effective altruism to not associate it with party politics, because it should be the kind of idea that everyone can get behind. No side has a monopoly on the idea of using your career or your money to help others. And then all of a sudden there was a risk that it would be perceived as politically biassed from this.

So that upwards trajectory was really hard to deal with, I think, for a lot of us.

Rob Wiblin: Yeah, it was a very strange time. I guess I hadn’t really been paying almost any attention to FTX or Sam Bankman-Fried until quite late in 2021, when it suddenly burst onto the scene. I guess the valuation must have gone up enormously. And then there was also a decision by some people to try to start deploying a whole bunch of money really very rapidly. So there was a big increase in the amount of funding that philanthropists had effectively earmarked to work on particular longtermist projects, or particular interventions associated with effective altruism. That meant that it seemed like the amount of money that people were trying to deploy, maybe per month, had tripled or something like that. So it was a “vertiginous rise,” as you say in your talk, and it had this very freewheeling sense to it from January through November of 2022.

Yeah. Do you want to talk a little bit about the fall as well, for people who haven’t been paying attention?

Toby Ord: Yeah, I was going to say it really put it into the news, although even before the fall, that’s another one of the aspects: the rise was FTX being everywhere in the news. And then also people hearing about effective altruism through this FTX idea: one particular person’s attempt at earning to give, and a very unscrupulous attempt being the thing. And even before the fall, it was associated with crypto, which was an area that while some people, I’m sure, have entirely noble intentions, there were many scammers. And so it was a problematic area to be intimately associated with.

Then we started to hear news that FTX had gone bankrupt. And then as more information came out, it seemed that it wasn’t just that it had fallen to 10% of its valuation — it had gone to negative. And it was a bit mysterious as to how this was supposed to have happened with a crypto exchange. It shouldn’t really be able to lose all of its money like this.

And then as more information came out, it seemed — and it currently does still seem — that there was a problem involving these two branches of the company. I guess they’re technically two separate companies, Alameda and FTX, both started by Sam Bankman-Fried — where Alameda was a trading company, trading on crypto, and FTX was an exchange. And they had a kind of special secret relationship together, where FTX gave Alameda better access to the exchange than it gave to the other people on it. But worse than that: in some complex, messy way, effectively the trades that Alameda was making appear to have been backed, as security, by the investments that all of the people who were using FTX had deposited. So even though these deposits were meant to be completely safe, ultimately it looks like customers will lose a very large fraction of their deposits.

Although that’s not fully known yet. In fact, there’s a lot of features of this that still aren’t fully known, including details of what exactly happened and also details of the actual motivations. Was it just Sam or did other people know about this? I’m sure much more of this will come out. There’s still going to be a major court case on this, and a number of books investigating it and so on, but it’s a massive financial scandal and catastrophe for the depositors who had invested in this.

So that is a pretty shocking thing, actually, to be reading about in the news, even if it wasn’t connected to a social movement that you helped to found — where it’s rather more shocking and devastating, really, that someone would do these things which appear to have been both illegal and immoral. And even if somehow it had worked, and their raiding the funds of the depositors helped them pay for one more trade which managed to get them out of the red and save the company, it still would have been illegal, and I think immoral, to do this.

But because, on top of that, it failed, it also caused vast amounts of damage to the depositors, and then ultimately a whole lot of collateral damage in other areas — including the idea of effective altruism, and also EAs: people who try to take these ideas seriously in their life. All of a sudden, people were concerned about them, even though most of them had nothing to do with this case. And then all of the causes that these people were devoting their lives to try to help with all of a sudden lost out on money that was earmarked for them and so forth.

So massive amounts of destruction, both to the general public and also to this movement that aimed at trying to actually just do good in the world.

Rob Wiblin: Yeah, I interviewed Sam Bankman-Fried, the CEO of FTX, back in early 2022, and I also put out some comments about the collapse of FTX in late November last year. So I expect quite a few subscribers to the show will at least have a passing knowledge of what went on. For people who are interested to get an update on what has been learned since then, if anything, we’ll stick up a link to the Wikipedia entry on the collapse of FTX so people can do a little bit of digging, I guess.

This interview isn’t about trying to piece together all of the facts of that case, because we have no particular expertise, and I’m sure other people will do a much better job. But back in your talk in February at Effective Altruism Global, you said that we don’t really know what fundamentally motivated Sam and some of his colleagues to do what they did; although we don’t know the specifics, we know that probably illegal and certainly immoral actions were taken. That’s one reason why today we’re going to try to learn lessons that aren’t specific to any particular claims about precisely what went wrong in the FTX case.

But do you think it’s still the case that today, when we’re recording in June, we don’t really know the motives? Or I guess what psychologically was going on that caused people to take these actions?

Toby Ord: No, I certainly don’t know, and I think very few people actually do. But I can speculate a little bit.

Rob Wiblin: Yeah. What are some of the options?

Toby Ord: Yeah, I mean, I suppose I should have added the word “allegedly” to every sentence I said before.

Here are some things that we do know: Sam, even before he’d ever heard of effective altruism, had been brought up thinking about the moral philosophy of utilitarianism by his parents, and was very dedicated to this. We can get into a bit more about exactly what utilitarianism is later, but it’s a moral philosophy that certainly has some similarities with EA. And so if he’d never heard of EA, he may well have just still followed this view. And so to the extent to which part of the concern is that he took maybe unnecessarily risky actions, or that he treated people merely as a means, or that he was prepared to break the law if it meant that he could achieve the greater good — all of these kinds of things that people think might be connected to EA — he was already committed to a theory that actually held stronger versions of those things than effective altruism does.

And there’s also concerns of connections to longtermism. And I think there is actually an even weaker case, really, that it’s very connected to longtermism. Ultimately, Sam was already very concerned about factory farming of animals and the horrible suffering and injustice involved in that. And that’s already such a big area that could take so much funding trying to actually remedy those problems that he already had reason to take big risks, and to try to be almost insatiable for money in order to try to fix those problems.

So I think that doesn’t mean that EA is off the hook or something, that it had no connection to this. One connection is that the EA community was a supportive community for someone who had non-standard views about doing good in the world. So maybe if everyone he’d talked to had just had a kind of blank stare and thought he was crazy, then maybe he actually wouldn’t have gone through with these things. Whereas if there was a community that was more receptive and thought, “If you can make 10 times as much money, maybe that’s 10 times as helpful and a really big deal,” then he saw some support.

And then there’s also the idea that he traded on this good reputation being vouched for by people in this community or something like that. I think that there’s something to that as well. That’s an aspect where I think EAs had been kind of sucked in on this. At least to the extent to which it was premeditated, or that he always had this idea he might do these things. If he was just trying to walk the right path right up until the end, and then did bad stuff, then maybe the sucked-in story doesn’t make sense either.

But I do feel that I always felt a little bit uneasy about the things that I would hear with Sam Bankman-Fried. Not uneasy enough to think that he would do something like this, just to be clear. It’s more that I thought he was the kind of person who would cut corners when he needed to, quite possibly after thinking it through and so on. But a “move fast and break things” kind of person, let’s say, as opposed to a “rob lots of people of their life savings” kind of person. But even a “move fast and break things” kind of person who seems a bit cavalier about all of these things, perhaps that was enough evidence to be substantially more cautious, regardless.

Rob Wiblin: I guess one thread on this question of the motivation thinks about it from a strategic philosophy point of view, perhaps. Another angle you could take is more around personality, I suppose — where some people are just more risk-taking than others, and just might find it unacceptable, the possibility of their business going bankrupt, and would be willing to take very extreme risks in order to try to prevent that happening, by disposition.

To me, that has always seemed at least like it’s going to be a very important part of the story. Because you can imagine so many people in a similar position, who might have similar philosophical commitments about doing good, who, just because of their personalities, would never contemplate appropriating money illegally in the way that it’s alleged that Sam did, or just taking on the kinds of crazy risks that it seemed were occurring at FTX, even potentially within the law, just in terms of, like you say, the “move fast and break things” approach to business.

Also, we’ve seen similar scandals in other financial organisations before, and there it seems like the key issues were pride, shame, recklessness, perhaps ill-judged rapid decisions — rather than anything particular about the moral philosophy of the people involved.

Toby Ord: Yeah, I think that’s quite plausible. There’s evidence that there was a culture of taking stimulants there — which I think is actually quite common in hedge funds, but perhaps to a higher degree at FTX — and these can interfere with people’s normal risk attitudes, making people more risk-loving. And yeah, I think that one could explain a lot of this just with appeal to these regular emotions of pride and shame. Ultimately, he was riding high and had built a company, becoming a billionaire at a very young age and so on. And then feeling that he had so much money to be able to give to these good causes, and that a bunch of people trying to do good were looking up to him as someone who had really succeeded in earning so that they could then do a lot of good with that money, and that if he went back down to zero, then people wouldn’t be looking up to him anymore.

So maybe that’s the kind of thing that led him to think that he could try one last double or nothing on all of this. And then perhaps when it seemed like they’d actually gone into the red, some feeling of shame at being caught at that or ending at that stage, and pride, might have been why they didn’t just wrap it up at that point. I think this is especially true for the Alameda side of things, which, as far as I can understand, could have and should have been allowed to fail at that point, rather than taking additional bets and bringing FTX down with it.

So yeah, I do feel that it’s actually quite hard to pull this all apart and to see exactly what’s gone on. And I should add that with the risk-taking behaviour, it’s also not clear that even utilitarianism would endorse these kinds of risk. He seems to have been making just bad bets. Bets that it’s not just that they’re positive expected value, but with huge variance — but actually that they’re bets with negative expected value. Ultimately, effective altruism was in a situation where it had surprisingly high amounts of funding already, such that additional funding has this diminishing marginal value at helping with the causes that we care about. And so then taking big risks with the entire future of that movement in order to try to increase that amount of money a bit more actually just seemed crazy to me.
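
[Editorial illustration, not part of the conversation.] The point about diminishing marginal value can be made concrete with a small sketch. Everything in it is assumed for illustration: the logarithmic "good done" curve, the funding level, and the odds of the bet are made-up numbers, not figures from FTX or from Toby's talk. It shows only the weaker claim that a bet can look attractive in expected money and still be a bad bet in expected good; Toby's stronger claim is that the actual bets were negative even in expected value.

```python
import math


def expected(outcomes):
    """Probability-weighted average over (probability, value) pairs."""
    return sum(p * v for p, v in outcomes)


def value_of_funding(amount):
    """Hypothetical concave curve: each extra unit of funding does a bit less good."""
    return math.log1p(amount)


# Assumed starting point: funding already in hand, in arbitrary units.
current = 100.0

# Assumed all-or-nothing bet: 60% chance to triple the funding, 40% chance to lose it all.
bet = [(0.6, 3 * current), (0.4, 0.0)]

# Measured in money, the bet beats standing still.
ev_money = expected(bet)

# Measured in good done, with diminishing returns, the comparison flips.
ev_good = expected([(p, value_of_funding(x)) for p, x in bet])
good_if_declined = value_of_funding(current)

print("bet wins on expected money:", ev_money > current)
print("bet wins on expected good:", ev_good > good_if_declined)
```

With these assumed numbers the bet comes out ahead on expected money but behind on expected good, because the chance of losing everything costs far more good than the chance of tripling the funding adds. A different curve or different odds would change the details, so this should be read as a sketch of the shape of the argument only.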

Rob Wiblin: Yeah, that’s one reason why instinctively I’m more inclined to reach for explanations that centre on personality or perhaps error, or actually just lacking information or making judgements very quickly, perhaps when someone’s sleep deprived. Because it’s so hard to explain how the actions that were taken could be justified from the point of view of trying to do as much good as possible. It seems like unless I’m really misunderstanding how the decision looked to them, no sane person who shared the goal of trying to, say, reduce extinction [risk] would have recommended it, no external party ever would have said, “Yes, this is the step that you would take.” I mean, it’s almost funny to say it because it’s so obvious, but yeah.

Toby Ord: Anyway, yeah. I think you could well be right. And to some extent it might be galaxy-braining this whole thing to think he is very unusual in having these moral beliefs and he’s unusual in having committed this large financial fraud and so they must be connected. It could be indeed that the simplest story involves these emotions and also just error. Just he really screwed up.

Rob Wiblin: Yeah, it’ll be interesting to see what further evidence comes out about this later in the year. I think, as you say, there are some investigative journalists — is it Michael Lewis who’s producing a book about this? I think he was actually following Sam around at the time. So he might have as good a shot as any at piecing together what on Earth was going on inside people’s heads.

The history of effective altruism [00:30:54]

Rob Wiblin: OK, let’s wind back a bit now, setting aside 2022, and turn back to the period around 2008, when you were really instrumental in getting effective altruism off the ground. How did you find yourself helping to make EA into a thing?

Toby Ord: I guess I’ve always had a lot of empathy for people who are in worse situations than myself. I’ve had a fairly privileged upbringing in a middle-class family in Australia, but I don’t think any of that is unusual. I don’t think I had unusually high empathy for others or something. Maybe a somewhat unusual desire to just be a good person, but again, not totally out of the ordinary.

And ultimately, I think that where things started looking a bit different and more distinctively EA was after I’d left Australia and come to Oxford to study. And one of the essays that we were made to write for my master’s degree was this question of: Ought we to forgo a luxury whenever we can thereby enable someone’s life to be saved? And this was, at first thought, “Yes, obviously.” And then the gears start to turn and then you think, “Oh, hang on…” If that were true, then maybe you could never have a luxury — because each luxury, if you think about how much it might cost to save someone’s life in poor countries, maybe one would keep having to exchange all of these luxuries for saving more lives.

And I realised it was connected to Peter Singer’s famous “Famine, Affluence, and Morality” essay, where he has this idea of the drowning child in the shallow pond and the analogy to international aid. So that all connected with me. And I got to read or reread these famous papers and spend a couple of weeks thinking a lot about that.

Rob Wiblin: Is this when you were doing a PhD in moral philosophy?

Toby Ord: I was doing this inappropriately named BPhil degree, the Bachelor of Philosophy, which is Oxford’s Master’s in Philosophy. And yeah, it made me think about these things again, and really come face to face with it, actually. And I was impressed by Peter Singer, by the fact that he came face to face with these things too, and that while he didn’t quite live up to exactly the standard that he recommends in the piece — where he says you should give everything above a certain minimum bar of income — he was giving a substantial fraction of his income. And he was doing this because of the moral philosophy that he was advocating, and that he had discovered by thinking deeply about it.

And this made me realise that it really was possible, that this wasn’t just a game that academics were playing about, “What obligations do we have? I can prove this obligation. That sounds super strong, that’s really impressive as a paper.” But instead, we might actually have these obligations that we’re discovering, and that this is meaningful and should actually motivate us. And so I found that to be actually quite inspiring.

Rob Wiblin: So that’s the moral philosophy side. But as I understand it, I think a pivotal issue was also starting to engage with the empirical evidence out there about what sort of impacts can we have on the world? Because all of this is good in theory, but if you actually can’t help people in a really big way, then maybe we don’t have these obligations. But this motivated you to look into that, right?

Toby Ord: Yeah, so I’d been keeping track of these things for a while — you know, when you get told, “For only 20 cents, you can save a child’s life” or something like that in some advertising copy. And I knew enough to know that those ones weren’t true. My wife actually helped me out a bit with this. She was a medical student, now a doctor, and she knew that the 20 cents in that case is usually the cost of the vaccine or something: the cost of the actual liquid that goes into the syringe. But then you also have to pay for the syringes. You have to pay for the health workers who get out to the remote villages in order to administer the vaccines.

And then bigger than all of that, you have to deal with the fact that many people won’t get the illness at all, and also that many of the people who get the illness wouldn’t die. And so you need this number called the number needed to treat, where that tells you how many people do you need to vaccinate in order to prevent the suffering from that disease, as an example.
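As a rough illustration of the point about hidden costs and the number needed to treat, the arithmetic can be sketched in Python. Only the 20-cent dose comes from the conversation. The delivery cost, the number needed to treat, and the function name are hypothetical placeholders chosen for illustration, not figures from the interview or from any dataset.

```python
# Illustrative sketch only: every input except the 20-cent dose is hypothetical.

def cost_per_case_averted_cents(dose_cents, delivery_cents_per_person, number_needed_to_treat):
    """Cost, in cents, of preventing one case of the disease.

    Each vaccinated person costs the dose plus delivery (syringes, health
    workers, travel). Only one in `number_needed_to_treat` vaccinated people
    would otherwise have suffered the disease, so the per-person cost is
    multiplied by that number.
    """
    cost_per_person = dose_cents + delivery_cents_per_person
    return cost_per_person * number_needed_to_treat


advertised_cents = 20  # the "20 cents" from the advertising copy
realistic_cents = cost_per_case_averted_cents(
    dose_cents=20,
    delivery_cents_per_person=180,  # hypothetical placeholder
    number_needed_to_treat=500,     # hypothetical placeholder
)
print(realistic_cents / 100)  # dollars per case averted under these assumptions
```

Under these placeholder inputs, the cost per case averted is thousands of times the advertised price of a dose. Because not every case is fatal, the cost per life saved would be higher again. The real figures depend entirely on the disease and the programme.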

So I knew that some of these numbers didn’t work, but there are other numbers that seem to work and to make sense about very cheap costs for preventing blindness, for example. And so I had noticed that there seemed to be something like a factor of 10,000 between how much we could achieve for ourselves with our money [and how much we could achieve for others]. And when I say “ourselves,” I’m speaking of people from this kind of privileged background: not so much as to where you fall in the income spectrum in your own country, but particularly people from, say, the UK or Australia or the US, where even the median person, the middle person in the income distribution in those countries, is among the top percent or two in the world income distribution.

I was thinking deeply about these things and then I actually was introduced to Jason Matheny, who had been an architecture student, who had found a book called Disease Control Priorities in Developing Countries, DCP, in his library at university, and it changed his life. And he dropped out of his course and actually went on to help write the second edition of that book.

And that book was all about how cost effective it can be to help people in poor countries — so actually trying to do the science on this and work out, really, how much does it cost to save a life? Or as they more commonly did it, in something like a quality-adjusted life year: so it’s important that we extend people’s lives by a larger amount, not just that they die the next month or something, and also that that extension is in as high a quality of health as possible. And also that there are things that don’t involve saving a life, such as curing blindness, which take the years in someone’s life and make them better.

And so by having this idea of a quality-adjusted life year, the universe of different ways of helping people with their health that you can compare is much larger. So it’s not just like, what’s the most effective group for doing this particular health issue? — say, the most effective group for avoiding HIV — but instead it can be for helping people with health, full stop.

So he showed me this book that he’d helped to write, DCP2, and it blew me away. And coming from a maths background, I was delighted to get the dataset of all of these 108 different ways that they had of improving people’s health. And I could see this actual distribution of how much good these different ways could do per dollar. I found that it was just this huge variation, just really massive. The middle interventions, the median, were at around $300 to give someone a year of healthy life. And that’s amazing. If you think about your own life, if you were going to lose a year of life unless you paid $300, I think for most people listening, it would be a no brainer. Even if it were the case that it was up to 100 times as much as that — $30,000 per year of life — I think a lot of people would pay that. And one way to look at that is they’d prefer to keep $30,000 after tax and gain an extra year of life for every year that they do that, compared to having a salary of $60,000. And so it was like, wow, these middle interventions are about 100 times more effective than we’d be willing to pay for health over here.

But then there were other things that were 10 times as effective as that, and then there were some that were even 10 times as effective as that. And the whole span ranged from things which are about as effective as we’d pay for in the UK through to things which are about 10,000 times more effective. And some of that variation could be explained by error in the methods; maybe some of it’s a bit exaggerated at the ends. But I could also work out that that couldn’t explain the whole thing, because there are examples — the famous example is the eradication of smallpox — which were actually more effective than anything that had been studied in this group. So it’s not that these numbers were unbelievable: even more amazing things had happened. So that was a real breakthrough for me in terms of understanding this.
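The ratios quoted in this passage can be checked with a few lines of Python. The dollar amounts are the round numbers mentioned in the conversation. The variable names, and the implied cost at the top of the range, are derived here for illustration and are not figures taken from DCP2.

```python
# Round numbers as quoted in the conversation; names are illustrative.
median_cost_per_healthy_year = 300    # dollars, the median intervention
rich_country_price_per_year = 30_000  # dollars many people would pay for a year of life

# How many times further a dollar goes at the median intervention.
median_multiple = rich_country_price_per_year // median_cost_per_healthy_year

# "10 times as effective as that", and then "even 10 times as effective as that".
top_multiple = median_multiple * 10 * 10

# Cost per healthy year implied at the top of the range (derived, not quoted).
implied_top_cost = rich_country_price_per_year / top_multiple

print(median_multiple, top_multiple, implied_top_cost)
```

The two multiples come out at 100 and 10,000, matching the figures in the conversation. The implied cost of a few dollars per healthy year is only what these round numbers suggest, and as noted above, some of the variation at the extremes may reflect error in the methods.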

The pros and cons of utilitarianism [00:39:58]

Rob Wiblin: I think originally, back in those early days, you thought of the intellectual project that you were a part of as being positive ethics, if I recall. Can you explain why you called it that?

Toby Ord: Yeah, this links back to utilitarianism. So I’d done a lot of study of utilitarianism — and actually my BPhil thesis, and ultimately my doctoral thesis as well, were about consequentialism, which is connected.

So utilitarianism is a theory that ultimately says two different things. The first thing it says is that there’s only one thing that matters, morally speaking, which is that the outcomes are as good as possible. And that principle we call “consequentialism”: the idea is that it doesn’t matter what motives you had or things like that, if you can create better outcomes, that’s what matters.

And then the second part is the idea that what makes an outcome good is purely a matter of the total amount of happiness in that outcome. And “happiness” here is understood usually quite broadly, where it means something like positive experience, even if we wouldn’t normally call that experience happiness, and you can subtract off the negative experience or suffering. So it says that what ultimately matters is the total of positive experience in the world. And that’s an idea that’s actually quite broad. You might think that there’s a bunch of things that it doesn’t capture, but when you actually look at those things, many of them do end up producing more positive experience in the world, the things we value.

The utilitarians are a group who think that actually this captures many of our moral intuitions, and that we can ground out the other things we care about in the increased happiness that they produce. So for example, things like equality and freedom tend to lead to more happiness as well.

So that’s utilitarianism. And one thing that was quite special about it is that it says that the positive matters just as much as the negative. So for example, we normally think of ethics, in common sense, in terms of these kinds of “thou shalt not” lists of prohibitions, things you shouldn’t do — you shouldn’t kill, you shouldn’t steal, shouldn’t cheat, shouldn’t lie — rather than what should you do, kind of more positive duties. And the utilitarians, because they had this symmetry between the positive and the negative, they could notice that if you save 10 lives or if you don’t save 10 lives, that’s a really big deal. In fact, according to the utilitarian, maybe that’s equivalent to killing 10 people if you don’t save 10 lives. So it’s huge.

Effectively, these facts about charity and how much we can help, the reason they’re so amazing is if you’re coming from a country that is one of the richest in the world, then it ultimately shouldn’t be that surprising that the very richest people on a global level can do so much with their money to help others.

I didn’t want to say that the utilitarians are right about this and that it’s just as important. Maybe there are also things that matter beyond good outcomes. Maybe there are certain kinds of actions you should never take, such as killing someone, no matter how good the consequences are. But if you don’t have to do any of those things — you don’t have to kill, cheat, lie, steal in order to donate money to charity — and in doing so, you could save, say, one or 10 or 100 lives during the course of your own career, then what I wanted to say was that’s a really big deal. And the utilitarians saw that, but I thought that you don’t have to be committed to all of this other stuff that they say in order to see it too. In fact, everyone should kind of agree.

Rob Wiblin: Yeah. We’ve described one way that utilitarianism is a bit appealing, or at least it’s alert to an issue that maybe other streams of thought in moral philosophy haven’t had much to say about, which is: not just avoiding doing the wrong thing, but also what ways could you make the world better? What are some of the ways that utilitarianism is a combination of unintuitive and unappealing, perhaps?

Toby Ord: I think that there are two types of ways. I think there’s actually a nice characterisation that while utilitarians only care about producing good outcomes — and “the gooder the better” in terms of this — there are two kinds of limits that we normally have in our commonsense conceptions of morality, which utilitarianism doesn’t have.

The most famous one is that there are normally these limitations — we call them “constraints” — on our actions: that there are certain things that you just shouldn’t do no matter how much better you could make the outcome. And actually, people are somewhat unsure about that in extreme cases: if there is some very extreme case, where perhaps through killing someone, you could save your whole city from being killed by a terrorist or something like that. So there maybe are some cases, but generally there’s this idea of a constraint that there are things that you should never or almost never do in order to make the outcome better.

And utilitarians don’t necessarily agree with that. That said, a thoroughgoing utilitarian would notice that there probably are some pretty bad repercussions involved with breaking those rules, which would include reputational effects and a lot of other more subtle flow-on harms from these things. So whenever they’re trying to be really careful, they normally say, “Actually, we wouldn’t recommend breaking any of these rules either, but we think that’s because of these more subtle flow-on consequences and reputational effects.”

Rob Wiblin: Rather than because it’s wrong in itself.

Toby Ord: Yeah, that’s right. But you basically never see them in moral philosophy saying, “Yeah, you should do those things.” That almost never happens.

Then the other type of limit, and the other area where it can give unintuitive answers — maybe that first case actually does give fairly intuitive answers, because in the end they say you shouldn’t do those things — but this other kind of limit is what we might call a freedom or a prerogative. Where usually, we think that it’s not the case that of all the possible things that you could do today, one of them is the optimal one and that you just have to do that optimal one, and anything else is wrong. So that’s what utilitarianism says.

And we tend to think that actually there’s certain areas of your life where you should have freedoms about them, even if it’s not optimal. Let’s say there’s a choice of how many children you’re going to have, let’s say between zero, one, two, and three. We might say this is something where the intuitive theory of morality says not that you have to think about the consequences of each of those numbers of children and choose on those grounds, but rather that you should be free to have any number: procreative freedom. And so, among other things, it might say that if you earn money at a job that you’re also free to spend that money on what you want. Whereas the utilitarian might say, actually no, you’ve got an obligation to use that money on wherever can help people the most. So that’s another way in which it can be unintuitive.

Rob Wiblin: Yeah, it’s maximally extreme in this dimension. Because utilitarianism, as normally construed, would say that there’s one course of action, the very best, that is absolutely mandatory — and everything else, every other deviation from that, is completely prohibited. Which is quite distinctive relative to most other moral philosophies, I think.

Toby Ord: That’s right.

Rob Wiblin: There’s a third way in which utilitarianism conflicts with most people’s intuitions, I’d say, which is just that it says that the only thing that matters is the consequences. And I guess if we’re talking about utilitarianism as one flavour of consequentialism, it says the only thing that matters, the only thing that has moral value, is wellbeing. And so other things like justice or fairness simply are not moral concepts, or at least that they’re not intrinsically valuable. I think many people find that pretty hard to swallow.

Toby Ord: Yeah. I should say that the key aspect is that “Is this intrinsically valuable?” If you actually talk to utilitarians, they often have very strong pro-equality, pro-freedom, even pro-rights stances. But they think that these things are justified in virtue of their follow-on effects on happiness — human happiness, but also happiness perhaps more broadly construed across the animal world.

So it is a bit tricky. But you’ve got examples of people like John Stuart Mill, a very famous utilitarian, arguing for political liberalism — probably the most famous proponent of political liberalism and the freedoms that entails — and also arguing very prominently for women’s rights before the law. And he thought that both things could be founded on the happiness that would be created. You’ve got Jeremy Bentham, another very famous utilitarian, arguing for legal protections for animals and better treatment of prisoners.

Rob Wiblin: In the 18th century, to be clear.

Toby Ord: In the 18th century, yeah. And also decriminalisation of homosexuality, because he thought it was a victimless crime, and so shouldn’t be a crime and so on.

One thing that you can say in general with moral philosophy is that the more extreme theories — which are, say, less in keeping with all of our current moral beliefs — are also less likely to encode the prejudices of our times. So what we say in the philosophy business is that they’ve got more “reformative power”: they’ve got more ability to actually take us somewhere new and better than where we currently are. Like if we’ve currently got moral blinkers on, and there’s some group who we’re not paying proper attention to and their plight, then a theory with reformative power might be able to help us actually make moral progress. But it comes with the risk that by having more clashes with our intuitions, we will end up perhaps doing things that are more often intuitively bad or wrong — and that they might actually be bad or wrong. So it’s a double-edged sword in this area, and one would have to be very careful when following theories like that.

Rob Wiblin: Yeah. So you say in your talk that for these reasons, among others, you couldn’t embrace utilitarianism, but you nonetheless thought that there were some valuable parts of it. Basically, there are some parts of utilitarianism that are appealing and good, and other parts about which you are extremely wary. And I guess in your vision, effective altruism was meant to take the good and leave the bad, more or less. Can you explain that?

Toby Ord: Yeah, I certainly wouldn’t call myself a utilitarian, and I don’t think that I am. But I think there’s a lot to admire in it as a moral theory. And I think that a bunch of utilitarians, such as John Stuart Mill and Jeremy Bentham, had a lot of great ideas that really helped move society forwards. But in part of my studies — in fact, what I did after all of this — was to start looking at something called moral uncertainty, where you take seriously that we don’t know which of these moral theories, if any, is the right way to act.

And that in some of these cases, if you’ve got a bit of doubt about it… you know, it might tell you to do something: a classic example is if it tells the surgeon to kill one patient in order to transplant their organs into five other patients. In practice, the utilitarians tend to argue that actually the negative consequences of doing that would actually make it not worth doing. But in any event, let’s suppose there was some situation like that, where it suggested that you do it and you couldn’t see a good reason not to. If you’re wrong about utilitarianism, then you’re probably doing something really badly wrong. Or another example would be, say, killing a million people to save a million and one people. Utilitarianism might say, well, it’s just plus one. That’s just like saving a life. Whereas every other theory would say this is absolutely terrible.

The idea with moral uncertainty is that you hedge against that, and in some manner — up for debate as to how you do it — you consider a bunch of different moral theories or moral principles, and then you think about how convinced you are by each of them, and then you try to look at how they each apply to the situation at hand and work out some kind of best compromise between them. And the simplest view is just pick the theory that you’ve got the highest credence in and just do whatever it says. But most people who’ve thought about this don’t endorse that, and they think you’ve got to do something more complicated where you have to, in some ways, mix them together in the case at hand.

And so while I think that there is a lot going for utilitarianism, I think that on some of these most unintuitive cases, they’re the cases where I trust it least, and they’re also the cases where I think that the other theories that I have some confidence in would say that it’s going deeply wrong. And so I would actually just never be tempted in doing those things.

It’s interesting, actually. Before I thought about moral uncertainty, I thought, if I think utilitarianism is a pretty good theory, even if I feel like I shouldn’t do those things, my theory is telling me I have to. Something along those lines, and there’s this weird conflict. Whereas it’s actually quite a relief to have this additional humility of, well, hang on a second, I don’t know which theory is right. No one does. And so if the theory would tell you to really go out on a limb and do something that could well be terrible, actually, a more sober analysis suggests don’t do that.

The original vision for effective altruism [00:53:46]

Rob Wiblin: Yeah. As I understand it, the vision originally was that this positive ethics, which I guess eventually came to be called effective altruism, would basically take the parts of utilitarianism that were largely uncontroversial and then basically just scrap all of the parts of it that seemed dangerous or highly controversial. What was that picture?

Toby Ord: Yeah. So here I think we could isolate something like two different principles, which utilitarianism clearly sees and strongly endorses, but we’re just going to take those two principles. We’re not going to take anything else from the theory. And those principles are that doing good — say, saving a life — really matters. So from a moral point of view, a key part of living a moral life is to do things like helping others, if you really can. So that was one part, and I think that ultimately everyone can endorse that. If there was someone whose moral theory said, “I just can’t endorse that; helping others is just irrelevant,” I would look askance at them.

Rob Wiblin: Yeah. And to be clear, we’re not saying it’s mandatory, merely that it would be a good thing, all else equal, if you could provide a massive welfare benefit to someone.

Toby Ord: Yes. And if it turns out that you knew that you could save 100 lives over the course of your own life and yet you just didn’t, that there would be something that’s seriously missed there. That’s the idea. Or if you did it, that might be one of the most significant aspects of your life from a moral point of view.

So that’s the first idea. And then the second idea is one of scale: to say that saving 10 lives is a 10 times bigger deal than saving one life; saving 100 is 10 times bigger still. So sometimes, when it comes to charity, the technical term that philosophers use is that it’s “supererogatory” — which means that you don’t have a duty to do it; it’s in some other realm. And things in that realm, they have a tendency to think it doesn’t really matter which one of them that you do. Whereas I’m saying, actually, it does matter a lot.

And I think you can miss that if you’re thinking about ethics from the perspective of the agent, the person who’s making the decisions. For example, if you think about the idea of the vow of poverty that mediaeval monks used to take, that was an idea about ridding yourself of these problematic material possessions and the corruption of your soul that they could produce — as opposed to an idea about helping others as much as you can, so that you need to end up in the situation of poverty yourself, a kind of Singerian case for it.

It wasn’t like that. Once you have a focus on the others that you’re helping — and maybe not all your focus should be there, but a substantial amount — then you can imagine these people who would die but for your donation. If you imagine, say, there’s 11 people, and if you do the first donation, you’ll save one of their lives, and if you do the other one, you’ll save the other 10 people instead. And you imagine looking at these people, and talking to them, and having them be able to beg you for these things, I think that it would be pretty crazy to say you should save the one in that case.

So that was a change in focus as well, and that also led to some other changes. So one of the ideas with Giving What We Can was to be a public register of people who’ve made this commitment to give. And I agree with everyone else that it’s somewhat more gauche to give and say that you’re giving than to give and not say you’re giving.

Rob Wiblin: It might be good to say a little bit more about what Giving What We Can is. So I think around this time, you were thinking, if people think an important part of moral life is benefiting others, what concretely can we build this group of people around doing? Which led you to start Giving What We Can?

Toby Ord: Yeah, exactly. So I already partly knew this, but looking at these figures for cost effectiveness made me realise just how much we could help with money. There’s lots of things one can do with one’s life, but it’s actually pretty hard to do something like save 100 lives. But in terms of our money, actually more than half the people in the UK — if they wanted to, and made it a real commitment over the rest of their life — could save 100 lives.

And that’s somewhat surprising, and it was worth dwelling on. And so I dwelled on this quite a lot and just really kept thinking about it. And one of the changes by thinking about it so much was that I came to see this less from this obligation frame that Peter Singer popularised, and instead to see it more just as I had a strong desire to save people’s lives if I could. And I think a lot of us would. When we read stories about people who took heroic actions, sometimes they took them at massive cost, and we don’t want to imagine ourselves in that position. But at other times we think, “Wow, imagine having been able to do this and to help these people.” So that’s something which is a bit of a shift of focus.

So yeah, one aspect was that shift of focus, to be thinking more in a direction of, actually, this is just something amazing that we can do. It’s morally significant and weighty, but not just as, “Oh, damn it, I’ve got to do that, then I’ll get back to playing my PlayStation” or something. But rather, no, actually, this is what it’s all about.

That was the first one. And then also just the fact that it was money that could do so much good. And what stops this being surprising is just once you realise how seriously unequal the world income distribution is: how the people that we want to be trying to help are people who have about 100 times less money than people in the richer countries, and so it’s not that surprising if money goes about 100 times further. And it turns out that the 100 times richer is already adjusting for the fact that money goes further in poor countries; if you don’t adjust for that, then you’ll see that you should expect your money to be able to do about 400 times as much good. And then it’s possible to get a bit of extra leverage on that as well, by choosing to help fund things like, say, deworming at schools — where it’s a lot more expensive to get it done just for yourself, whereas there are economies of scale when it’s done at scale.

So realising how much good we could do with giving: that’s why there was this focus that I had on starting a new organisation for people who just want to make a personal pledge over the rest of their lives to give at least a tenth of everything that they earn each year to help the least fortunate people in the world where they could do the most good with this. And as part of that, to have this focus on effectiveness as well: once you could see that there were some ways of giving this money which wouldn’t achieve these amazing things (perhaps that would happen if you gave to people in your own country), you wouldn’t have these huge benefits. So to get that focus on giving more and giving more effectively.

And because I’d seen this quantitative data on just how effective it can be, then I was able to notice that the average person could quite reasonably, instead of giving about 1% of their income over the rest of their life, give 10% — so give 10 times more — and then to give it somewhere that’s about 10 times more effective. And if they did both those things at the same time, it wouldn’t just be 20 times as good, it would be 100 times as good, because these multiply. And so I thought having this packaged together as one idea would be especially valuable.
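The claim that the two changes multiply rather than add can be made concrete with a small sketch. It assumes, as a simplification, that the good done is proportional to the amount given times the good done per dollar. The function name and the unit baseline are illustrative choices, not anything defined in the conversation.

```python
def relative_impact(percent_of_income_given, effectiveness_multiplier):
    """Good done, in arbitrary units: amount given times good done per unit given."""
    return percent_of_income_given * effectiveness_multiplier


# Baseline: give 1% of income to a charity of ordinary effectiveness.
baseline = relative_impact(percent_of_income_given=1, effectiveness_multiplier=1)

# Pledge: give 10% of income to a charity about 10 times more effective.
pledge = relative_impact(percent_of_income_given=10, effectiveness_multiplier=10)

improvement = pledge // baseline  # how many times more good than the baseline
additive_guess = 10 + 10          # the mistaken "20 times" intuition

print(improvement, additive_guess)
```

Changing either factor alone gives a tenfold improvement. Changing both together gives a hundredfold improvement, which is the reason for packaging giving more and giving more effectively as one idea.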

Rob Wiblin: Yeah. So that led to the formation of Giving What We Can, where people would commit to give 10% of their income to the most effective charities.

So as I understand it, in those early days, the vision was: we were going to strip away a whole lot of the baggage that utilitarianism has, and we’re going to say merely that it’s good to provide large benefits to other people if it doesn’t come at a huge cost to yourself. And it’s better to provide even larger benefits: if you can provide 10 times as much benefit to others, then that’s roughly 10 times better again.

But we’re not going to say, as utilitarianism says, that the only thing that matters is wellbeing; that there are no other moral concerns: that’s highly controversial and probably wrong, so let’s leave that by the side. We’re not going to say that you can take any actions necessarily in order to benefit other people; we’re just going to say, inasmuch as you’re not violating any of these plausible side-constraints, like violating the rights of other people, then you ought to do it. We’re just going to leave this as an open philosophical question: what sort of side-constraints do we have, and how strongly do they bind?

Toby Ord: Yeah, that’s right. And in fact, I would go even further. Rather than describing it as a stripped-down version of utilitarianism, I’d say we take these two insights that utilitarianism made clear, and that’s all we’re going to take: just these two insights. And we’re going to say: Is it reasonable for everyone to agree with those anyway? And on inspection, I think it is. They’re just things that we should have always come to believe; it just was maybe less obvious with different approaches to ethics.

And then to say, yeah, let’s build a movement around these ideas. And in doing so, we should, at least through our giving, be able to do something like 100 times or more good than we were previously able to do. And then to be excited about that, and to share the ideas about it; and share the information that scientists, economists, health researchers had developed about the places where we can do a lot of good for others.

Rob Wiblin: Was effective altruism at the time kind of construed as a theory of normative ethics, or was it more analogised to something else?

Toby Ord: So at that time, there wasn’t quite effective altruism. There was Giving What We Can. I had met Will MacAskill in 2009, in April, and then over the next seven months, we’d worked really hard together and taken a whole lot of these ideas I’d already developed around an organisation that would become Giving What We Can, and actually just making it happen. So we were doing that, and then in the years after that, the next couple of years, I was thinking academically around this more general idea — not just when it comes to giving, but when it comes to lots of moral thought about our lives, about what we just called “positive ethics.” So that was the thinking there.

And at a similar time — 2011, I think — Will and Ben Todd gave a talk about ethical career choice, applying these ideas to career choice. They worked together and founded 80,000 Hours, and ultimately brought in other people as well, and really took those ideas to their conclusions. And then once we had these two organisations — Giving What We Can, thinking about what we can do with our incomes, and 80,000 Hours, thinking about what we can do with our careers — it was even more important to have some word that referred to all of this.

As a practical matter, we needed to set up a charity within the Charity Commission in the UK. And we wanted to just do that once and set up an umbrella that both these organisations could exist under. So we ended up having a big vote on that to try to pick the name for this, and we ended up picking the “Centre for Effective Altruism.” And the idea behind that was that in naming it, we would probably also be naming this nascent movement that people were really starting to get excited about.

Rob Wiblin: Did you imagine that this would become viewed as a moral philosophy? Or was it more like environmentalism, or some other attitude or disposition or concern that people have that isn’t viewed as an actual theory of ethics?

Toby Ord: Even when I was thinking of it as an academic project under the label of positive ethics, it was still a very broad project that could encompass many different theories. In fact, I was suggesting that any moral theory worth its salt should be endorsing these kinds of ideas. In some ways you could think of that a bit similarly to the philosophical ideas behind environmentalism or feminism, where we’re saying that women really matter, that’s the distinctive idea: that they matter just as much as men, that we’re just all people. Or environmentalism, saying that these nonhuman aspects of the environment — certainly animal lives, perhaps also plant life — that this is something that also has some kind of normative weight to it.

And in both those cases, they’re not a particular moral theory or something; there’s not just one particular kind. And feminists and environmentalists also care about heaps of other things; they come to the table with other moral commitments. Maybe they’re a Christian or a Muslim or an atheist with a particular moral view — or they’ve never really thought that much about other aspects of morality, but if you push them, they’d have a bunch of ideas — and they’re just coming together because they support just enough principles to have some things in common with these other people, to fight for something that they care about.

And ultimately, that’s how I see effective altruism, whether viewed as a philosophy, like the philosophy of environmentalism, or whether viewed as a social movement.

The harm that comes from going all-in on one theory [01:07:38]

Rob Wiblin: OK, so that’s a whole lot of history about the motivation behind naming and trying to get the ball rolling on this idea of effective altruism. Let’s maybe get back to some of the broader lessons from moral philosophy that you were prompted to reflect on more by the collapse of FTX.

There’s three in the keynote that you gave at Effective Altruism Global: firstly, the harm that comes from going all-in on just one theory; then the need to assess everything, not just acts, on their consequences; and then having a model for how personal character and integrity can be really important — even if they don’t vary nearly as widely as do, say, the importance or the pressingness of causes or the effectiveness of different interventions.

Let’s do the first one first, because it was my single favourite one. It made me really light up when I was watching it, because it was one of the cases where you’re like, “I should have seen this before, but now I clearly do.” This is the issue that there are huge risks that come with trying to get 100% of what any moral theory wants without any compromises. Can you explain how that is?

Toby Ord: Yeah. Here’s a thought experiment. You’ve got three different options available to you: option A is to save one life, which is pretty good; option B is to save 99 lives; and option C is to save 100 lives. So the classical approach in utilitarianism is to look at each of those things and see which one has the best outcome — and it’s saving 100 lives. And then we say that that’s what you have to do.

But there’s also a development within consequentialism called scalar consequentialism, which says that actually, it’s a bit of a mistake to try to connect it with rightness or obligation in quite such a binary way. And in fact, I think that the earliest utilitarians, such as Jeremy Bentham and John Stuart Mill, actually don’t speak like that. They tend not to say that only the best thing is the right or acceptable thing. You know, John Stuart Mill says that “an act is right in proportion to the goodness of the consequences it creates.”

And I think that is a better way of thinking. Modern consequentialists have really started leaning in that direction, and they call it scalar consequentialism. The idea is that how right or important these different options are is in proportion to the goodness that they create. And so the really important thing in this case is that you don’t do option A: that’s where the really big gap is. Rather than saying C is uniquely interesting and important because it’s the best, it says, actually, B is pretty similar to C: 99 lives, 100 lives. And the real gulf is between that and A. So if you were going to just compress things down to a simple thing, it wouldn’t be “do C” — it would be “don’t do A.”

Rob Wiblin: Basically it would be a mistake to think that the big difference is between saving 100 lives and saving either 99 or one. Rather, the big difference is between saving one life and 99 or 100. That feels very intuitive, I think, to most people, because it’s far closer to how we think about decisions in real life.

Toby Ord: Yeah. And there’s a key aspect there that comes from a practical point, which is that, in that stylised example, that was the only thing that mattered, and the only thing there was saving these lives. But in reality, the different options involve other kinds of effects in the world, and perhaps those effects matter. So you might have a moral theory that includes things other than happiness. And maybe in trying to absolutely maximise out on happiness, you start doing damage to some of these other things.

Rob Wiblin: Yeah. So what goes wrong when you try to go from doing most of the good that you can, to trying to do the absolute maximum?

Toby Ord: Here’s how I think of it. Even on, let’s say, utilitarianism, if you try to do that, you generally get diminishing returns. So you could imagine trying to ramp up the amount of optimising that you’re doing from 0% to 100%. And as you do so, the value that you can create starts going up pretty steeply at the start, but then it starts tapering off as you’ve used up a lot of the best opportunities, and there’s fewer things that you’re actually able to bring to bear in order to help improve the situation. As you get towards the end, you’ve already used up the good opportunities.

But then it gets even worse when you consider other moral theories — if you’ve got moral uncertainty, as I think you should — and you also have some credence that maybe there are some other things that fundamentally matter apart from happiness or whatever theory that you like most says. There are these tradeoffs as you optimise for the main thing; there can be these tradeoffs to these other components that get steeper and steeper as you get further along.

So maybe, suppose as well as happiness, it also matters how much you achieve in your life or something like that. Then it may be that many of the ways that you can improve happiness, let’s say in this case, involve achievements — perhaps achievements in terms of charity, and achievements in terms of going out in the world and accomplishing stuff. But as you get further, you can start to get these tradeoffs between the two, and it can be the case for this other thing that it starts going down. If instead we were comparing happiness first and then freedom, maybe the ways that you could create the most happiness involve, when you try to crank up that optimisation right to 100%, just giving up everything else if need be. So maybe there could be massive sacrifices in terms of freedom or other things right at the end there.

And perhaps a real-world example to make that concrete is if you think about, say, trying to become a good athlete: maybe you’ve taken up running, and you want to get faster and faster times, and achieve well in that. As you start doing more running, your fitness goes up and you’re also feeling pretty good about it. You’ve got a new exciting mission in your life. You can see your times going down, and it makes you happy and excited. And so a lot of metrics are going up at the start, but then if you keep pushing it and you make running faster times the only thing you care about, and you’re willing to give up anything in order to get that faster time, then you may well get the absolute optimum. Of all the lives that you could live, if you only care about the life that has the best running time, it may be that you end up making massive sacrifices in relationships, or career, or in that case, helping people.

So you can see that it’s a generic concept. I think that the reason it comes up is that we’ve got all of these different opportunities for improving this metric that we care about, and we sort them in some kind of order from the ones that give you the biggest bang for their buck through to the ones that give you the least. And in doing so, at the end of that list, there are some ones that just give you a very marginal benefit but absolutely trash a whole lot of other metrics of your life. So if you’re only tracking that one thing, if you go all the way to those very final options, while it does make your primary metric go up, it can make these other ones that you weren’t tracking go down steeply.
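
The pattern Toby describes here can be made concrete with a toy calculation. The sketch below is illustrative only: the opportunity names, the numbers, and the choice to weigh the tracked metric and everything else equally are all assumptions made up for the example, not anything stated in the conversation.

```python
# Hypothetical opportunities: (name, gain on the tracked metric, cost to everything else).
# Every number here is invented purely for illustration.
opportunities = [
    ("easy win", 40, 1),
    ("solid habit", 30, 2),
    ("extra effort", 20, 5),
    ("big sacrifice", 8, 30),
    ("give up everything else", 2, 200),
]


def cumulative_effects(opportunities):
    """Take opportunities from largest to smallest gain and track running totals."""
    ordered = sorted(opportunities, key=lambda o: o[1], reverse=True)
    total_gain = sum(gain for _, gain, _ in ordered)
    rows, tracked, untracked = [], 0, 0
    for name, gain, cost in ordered:
        tracked += gain      # the one metric being optimised
        untracked -= cost    # everything that is not being tracked
        percent = 100 * tracked / total_gain
        # Assumption: overall value is simply tracked + untracked, equally weighted.
        rows.append((name, percent, tracked, untracked, tracked + untracked))
    return rows


rows = cumulative_effects(opportunities)
for name, percent, tracked, untracked, overall in rows:
    print(f"{percent:5.1f}% optimised | tracked {tracked:4d} | "
          f"untracked {untracked:5d} | overall {overall:5d} | last step: {name}")

best = max(rows, key=lambda r: r[4])
print("Best overall stopping point:", best[0])
```

With these made-up numbers, the tracked metric keeps rising all the way to 100%, but the overall total peaks well before that and then falls sharply once the final, most costly options are taken.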

Rob Wiblin: Yeah, so the basic idea is if there’s multiple different things that you care about… So we’ll talk about happiness in life versus everything else that you care about — having good relationships, achieving things, helping others, say. Early on, when you think, “How can I be happier?,” you take the low-hanging fruit: you do things that make you happier in some sensible way that don’t come at massive cost to the rest of your life. And why is it that when you go from trying to achieve 90% of the happiness that you could possibly have to 100%, it comes at this massive cost to everything else? It’s because those are the things that you were most loath to do: to just give up your job and start taking heroin all the time. That was extremely unappealing, and you wouldn’t do it unless you were absolutely only focused on happiness, because you’re giving up such an incredible amount.

Toby Ord: Exactly. And this is closely related to the problem with targets in government, where you pick a couple of things, like hospital waiting times, and you target that. And at first, the target does a pretty good job. But when you’re really just sacrificing everything else, such as quality of care, in order to get those people through the waiting room as quickly as possible, then actually you’re shooting yourself in the foot with this target.

And the same kind of issue is one of the arguments for risk from AI, if we try to include a lot of things into what the AI would want to optimise. And maybe we hope we’ve got everything that matters in there. We’d better be right, because if we’re not, and there’s something that mattered that we left out, or we’ve got the balance between those things wrong, then as it completely optimises, things could move from, “The system’s working well; everything’s getting better and better” to “Things have gone catastrophically badly.”

I think Holden Karnofsky used this term “maximization is perilous.” I like that. I think that captures one of these big problems, both if you have an AI agent that is maximising something, and if you have a human agent, perhaps a friend or you yourself, who is just maximising one thing. Whereas if you just ease off a little bit on the maximising, then you’ve got a strategy that’s much more robust.

Rob Wiblin: Yeah, I think effective altruism is associated with this phrase “doing the most good.” And your talk made me think that maybe we should switch that to “do most of the good that you can” — because then you’re getting most of the value, because you’re above 50% of your potential. But it means that you have to give up so much less in the rest of your life, and it seems way more sustainable, much more plausible that you might be able to get above 50% of your potential and to be satisfied with that, than to try to reach 100% — which is kind of crazy.

Toby Ord: Yeah, I like it. I think we may still need a bit more work on the marketing. It certainly sounds like a very distinctive claim: “Do most of the good,” or “EA: 80/20 it in terms of doing good.” But I think it is getting at something. And I had actually always been a bit frustrated by some of these maximising-framed slogans or ways of expressing the point, because it always seemed to me that the real key thing is that we’re trying to get in the ballpark of the best outcomes, and that’s really what it’s all about. And one of the things I mentioned in the talk, actually, is “strive for excellence rather than perfection.” I think that is perhaps a way of summarising some of this. There’s good life advice on a lot of different dimensions.

But I think another subtle way that it’s true is that striving for excellence helps you think, “Maybe I shouldn’t be satisfied with how much everyone else achieves on this thing. Maybe I can just do way better.” Maybe when it comes to times for running races, you probably are pretty close to the limits. But for some other things, maybe you could actually do 10 times better than anyone before, by thinking outside the box and working out some new way to do it. And so it incentivises really taking the things you care about and trying to do amazing at them — whereas the perfection mindset feels like you’ve just got a couple of percentage points left to go, and you’re just trying to get them done. And it can lead to a “penny wise and pound foolish” type of behaviour, where you can’t see the forest for the trees. I think that the excellence approach fits the EA mindset better.

Rob Wiblin: Yeah, so one way that real life deviates from this schema — where you’re just choosing between saving one life, 99 lives, and 100 lives — is that we care about multiple different values. We care about more things than merely saving lives. The picture is more complicated. There’s also uncertainty about what effects your actions are going to have. And one reason why you might have serious reservations about a course of action is that it carries with it enormous volatility, enormous uncertainty about the effect that it’s going to have.

So I think another way that you might go wrong if you’re just absolutely maximising the expected value of something, without any compromises, is that those last few things that you do might be extremely risky. They might have positive expected value, but bring with them enormous risk of downside. So you might expect recklessness is the sort of thing that would result from being completely uncompromising in the pursuit of just one goal.

Toby Ord: Yeah, that can be especially true if the thing that you’re trying to maximise has within it a claim that you should be trying to maximise the expected value. There are different attitudes you can take to risk, and different ways that we can conceptualise what optimal behaviour looks like in a world where we’re not certain of what outcomes will result from our actions. This is studied in ethics and in decision theory and the study of rationality within philosophy. Expected value is probably the most dominant theory, which says that you should weight the outcomes by the probabilities that they occur.

But there are other approaches that involve being more risk-averse than that, which also have some credibility. And I think we should have some uncertainty around these things, and we should certainly be risk-averse about things like money that have diminishing marginal value.

Perhaps another way that one can go wrong, in fact, is: suppose you thought that money mattered a lot and was very valuable because money could be used to produce the end that you’re seeking in the world. No matter what charity it is, you could earn money and then give it. But then you’d get an extreme version of this problem, because if there are diminishing returns on this money, then being risk-neutral about money is a big mistake. If there are diminishing returns on something, then that implies that you need to be risk-averse about it. And people can often forget that.
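
The link between diminishing returns and risk aversion can be checked with a little arithmetic. The sketch below is a minimal illustration under stated assumptions: the square-root value function is just a hypothetical stand-in for any function with diminishing returns, and the dollar amounts and probabilities are invented for the example.

```python
import math


def value_of_money(dollars):
    """Hypothetical concave value function: each extra dollar does less good than the last."""
    return math.sqrt(dollars)


def expected(outcomes, f=lambda x: x):
    """Probability-weighted average of f over (probability, dollars) pairs."""
    return sum(p * f(x) for p, x in outcomes)


# Invented example: a guaranteed $1m versus a coin flip between nothing and $2.2m.
sure_thing = [(1.0, 1_000_000)]
gamble = [(0.5, 0), (0.5, 2_200_000)]

print("Expected dollars:", expected(sure_thing), "vs", expected(gamble))
print("Expected value:  ", expected(sure_thing, value_of_money),
      "vs", expected(gamble, value_of_money))
```

Under these assumptions, the gamble has more expected dollars but less expected value, so an agent who maximises expected value should decline it. That is what being risk-averse about money means in this context.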

Moral trade [01:21:56]

Rob Wiblin: Yeah. So this obviously makes sense if you’re uncertain about what is valuable or you do just directly value multiple things yourself. But in the talk you explain how this is still very important, even if you personally only care about one thing. Can you explain how that is?

Toby Ord: Yeah. Suppose you’re completely certain, and you think only happiness matters. So you’re not worried about the moral uncertainty case, and you’re not worried about this idea that other things might go down in that last 1% of optimisation, because you think this really is the only thing that matters.

Well, at least if you’re interested in effective altruism, then you’re part of a movement that involves people who care about other things, and you’re trying to work with them towards helping the world. And so this last bit of optimisation that you’re doing would be very uncooperative with the other people who are part of that movement.

So this can be connected to a broader idea that I’ve written about called moral trade, where the idea is that, just as people often exchange goods or services in order to make both of them better off — this is the idea that Adam Smith talked about: if you pay the baker for some bread, you’re making this exchange because you both think that you’re better off with the thing the other person had — you could do that not just about your self-interested preferences, but with your moral preferences. And in fact, the theory of trade works equally well in that context.

For example, suppose there were two friends, one of whom used to be a vegetarian but had stopped doing it because maybe they got disillusioned with some of the arguments about it. But they’d kind of gone off meat to some degree anyway, and so it wouldn’t be too much of a burden if they went back to being a vegetarian. That person cares a lot about global poverty, and their friend cares about factory farming and vegetarianism. Well, they could potentially make a deal and say, “If you go back to being a vegetarian, I will donate to this charity that you keep telling me about.” They might each not be quite willing to do that on their own moral views, but to think that if the other person changed their behaviour as well, that the world really would be better off.

And you can even get cases where they’ve got diametrically opposed views. Perhaps there’s some big issue, such as abortion or gun rights or something, where people have diametrically opposed positions, and there are charities which are diametrically opposed. And they’re both thinking of donating to a pair of charities which are opposed to each other. And then maybe they catch up for dinner and notice that this is going to happen. And they say, “Hang on a second. How about if instead of both donating $1,000 to this thing, we instead donate our $2,000 to a charity that, while not as high on our list of priorities for charities, is one that we actually both care about? And then instead of these effects basically cancelling out, we’ll be able to produce good in the world.”
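
The arithmetic behind the opposed-charities example can be sketched as follows. The weights are hypothetical: they assume each donor counts a dollar to the opposing charity as exactly undoing a dollar to their own, and values the compromise charity at 0.6 per dollar. Only the $1,000 donations come from the example itself.

```python
# Hypothetical weights: how much good each donor thinks one dollar does at each charity.
# A negative weight means the donor thinks that donation makes things worse.
weights = {
    "donor_1": {"charity_for": 1.0, "charity_against": -1.0, "compromise": 0.6},
    "donor_2": {"charity_for": -1.0, "charity_against": 1.0, "compromise": 0.6},
}


def good_as_judged_by(donor, donations):
    """Total good of a set of donations, by one donor's own moral lights."""
    return sum(weights[donor][charity] * dollars
               for charity, dollars in donations.items())


# Without a trade, each donor gives $1,000 to their own side.
no_trade = {"charity_for": 1000, "charity_against": 1000}
# With a trade, the pooled $2,000 goes to the charity both of them value.
trade = {"compromise": 2000}

for donor in weights:
    print(donor,
          "| no trade:", good_as_judged_by(donor, no_trade),
          "| trade:", good_as_judged_by(donor, trade))
```

Under these assumed weights, each donor judges the untraded donations to net out to zero, while the pooled donation counts as a clear gain by both of their standards.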

So that’s the general idea of moral trade. And you can see why moral trade would be a good thing: even though people have different ideas about what’s right, and these ideas can’t all be correct, if they’re generally, more often than not, pointing in a similar direction, then when we better satisfy the overall moral preferences of the people in the world, I think we’ve got some reason to expect the world to be getting better in that process. In which case, moral trade would be a good thing. And it’s an idea that can also lead to that kind of behaviour where you don’t do that last little bit of maximising.

Rob Wiblin: Yeah, we’ll stick up a link to your paper on moral trade. There’s a bunch of practical issues which I think cause it to be less common than it might be. You could imagine, for example, these two extremely opposed political groups, maybe they don’t often meet up for dinner to have conversations about how they could both stop competing in politics and do something that they both regard as almost as valuable as what they would do otherwise. There’s also this issue that if you’re paying someone not to do something that you think is bad but they don’t think is bad, then you worry that you might incentivise people to claim that they’re going to do that thing anyway in order to get paid not to do it. So that can be another practical challenge.

But the underlying idea that people can make deals where they’ll stop doing something that they don’t care about, but that someone else thinks is really bad, in exchange for the other person doing the reverse, that there’s potential benefits to all parties here, is kind of mind-blowing, I think.

Toby Ord: Yeah, I was pretty excited about it, and it would be fun if people took it up more. I get into these challenges in the paper. People can have fun seeing all the problems. But an interesting one is that there’s this problem when you’re making a trade with the baker, you need to be sure that if you give them the money, they’ll give you the bread. And generally we’ve got a functioning society where that works out and we don’t have to put them into escrow or something: we can just hand over the things and we don’t really care which order we do it in. So that’s one issue: knowing that if you give the thing, they will give you their thing.

But there’s this interesting extra case with moral trade, where you have to know that if you hadn’t given them the thing they wouldn’t do their part. That doesn’t come up with the baker: we know that if you hadn’t given the baker the money, they wouldn’t have given you the bread. But when it comes to, say, donating to that third charity, the compromise charity, they might give you a receipt showing you that they’ve donated £1,000 to that charity. But then you think, “Hang on, maybe they were already going to donate to that charity, and they haven’t done an extra thing for me.” So that’s another problem.

There are a few of these challenges that come up. I think that in some cases they can be dealt with, especially in cases where the two parties are just pretty trusting of each other. It’s a bit harder for it to happen between complete strangers, and a bit easier to happen with a friend or something like that.

Rob Wiblin: Just to tie this back to what we were talking about: you were talking about cooperation within a social movement, say effective altruism, but it would apply more broadly.

But of course we’re all part of this enormous world full of other people who potentially care about what we’re doing. And one reason why you might be reluctant to take a course of action is that because it’s so monomaniacally extreme in one dimension, other people are going to massively disapprove of it and think that you’re doing something that’s really mistaken or wrong on their point of view. And so there’s very good reasons — if you want to exist in a society where people generally treat one another with courtesy, and are concerned about the views of others — that you want to cultivate a temperament in which you’re not so monomaniacal that you completely disregard the values of other people and give no weight to their concerns about what you’re doing.

Toby Ord: Yeah, that’s right. And I think sometimes people — I don’t even know if they want to think this — but they feel a bit compelled to think that maybe we need to be a kind of Machiavellian do-gooding society that’s trying to do good above all else, and perhaps in very uncooperative ways. But no one really wants to do that. And I think our best theories don’t tell us to do that either. I think that they instead tell us to strive to be a community of people who are earnestly trying to be good people, as well as trying to do good in the world. And I think that’s often quite an easy tradeoff to make. Maybe trying to be a good person while also being a shrewd businessperson is a difficult tradeoff. But actually, in this case, I think it tends to work quite well.

We shouldn’t lose sight of the idea that we really care about others and about making a really big positive difference, but also we shouldn’t go anywhere really near the edges of behaviour that is thought to be seriously problematic. In fact, in The Precipice I wrote a comment like that. One of my pieces of advice was:

“Don’t act without integrity. When something immensely important is at stake and others are dragging their feet, people feel licensed to do whatever it takes to succeed. We must never give in to such temptation. A single person acting without integrity could stain the whole cause and damage everything we hope to achieve.”

I stand by that. I wish people like Sam had actually taken this kind of advice.

Rob Wiblin: Yeah, I think quite a lot of people said things like that.

Toby Ord: Indeed.

Rob Wiblin: Yeah, we could quote a whole bunch of them. I suppose it could have been more prominent. I suppose there’s lots of points that people can make, and that was one point among many different observations that people made about how one ought to behave and how one might do more good.

I feel like really the problem though is that there are many people who will hear exhortations like that and think, “Yes, that sounds right. That’s the way that I’m inclined to behave anyway. And I’m glad that people who I look up to are telling me to behave with integrity.” But some people just aren’t interested in hearing exhortations about integrity from others. It doesn’t ring true to them for some reason, or it just doesn’t fit with their personality very well. And those folks, you could say it twice as often, but it’s just going to go in one ear and out the other. It just makes it very tricky to coordinate a group of people and make sure that no one ever does anything bad, because there’s 1% of people who are most inclined to do things that are bad, and they’re just very hard to reach.

Toby Ord: Yeah, I think that is a big challenge. But one way that you can try to deal with it is if we’re clear enough about these norms, then it should also be somewhat clear that you shouldn’t associate too much or follow people who are breaking these norms, if you can tell that they are.

Rob Wiblin: Yeah, or give positions of power and influence to people who seem sketchy, basically.

Toby Ord: Yeah, exactly. And so obviously, if someone does something terrible, then we’re going to call them out and not follow them or boost them. But I think that one needs to draw an even larger border around those kinds of dangerous people, and to say it’s true that to some extent there’s innocent until proven guilty, but there’s a different kind of measure or standard that’s needed for, say, joining someone’s organisation or trying to promote their work. So I think that we should be more careful about that.

And it’s not that if you think someone’s a bit sketchy, you need to whisper to everyone they’re sketchy and so forth. But maybe don’t be afraid to say that, if someone’s asking or thinking about joining their organisation. And saying, “I’m not sure about their integrity” is different to actually accusing them of things and so on. So I think that we might be able to get something to work around that — that doesn’t descend into being some kind of rumour campaign or something like that, but does involve sharing a bit of information as to whether we think that someone seems to be unimpeachable, or whether it’s perhaps the other way around.

Global consequentialism [01:32:44]

Rob Wiblin: Yeah. OK, let’s push on and talk about another lesson that relates to how we should actually make decisions in the real world when we’re trying to do good. First, can you explain what “naive utilitarianism” is?

Toby Ord: Many people think that utilitarianism tells us, when we’re making decisions, to sit there and calculate, for each of the possible options available to you, how much happiness it’s going to create — and then to pick the one that leads to the best outcome. Now, if you haven’t encountered this before, you may think that’s exactly what I said earlier that utilitarianism is, but I hope I didn’t make this mistake back then, and I think I probably got it right.

So naive utilitarianism is treating the standard of what leads to the best happiness as a decision procedure: it’s saying that the way we should make our decisions is in virtue of that. Whereas what utilitarianism says is that it’s a criterion of rightness for different actions — so it’s kind of the gold standard, the ultimate arbiter of whether you did act rightly or wrongly — but it may be that in attempting to do it, you systematically fail.

And this can be made clear through something called the “paradox of hedonism”: where, even just in your own life, suppose you think that having more happiness makes your life go better, and so you’re always trying to have more happiness. And so every day when you get up you’re like, “What would make me happy today?” And then you think, “Which of these breakfast cereals would make me happiest?” And then you’re having it and you’re like, “Would chewing it slower make me happier?” And so on. Well, you’re probably going to end up with less happiness than if you were just doing things a bit more normally. And it’s not really a paradox; it’s just that constantly thinking about some particular standard is not always the best way to achieve it.

And that was known to the early utilitarians. In fact, they wrote about this quite eloquently. They suggested that there could be other decision procedures which are better ways of making our decisions. So it could be that even on utilitarian standards, more happiness would be created if we made our decisions in some other way. Perhaps if we are trying this naive approach of always calculating what would be best, our biases will creep in, and so we’ll tend to distribute benefits to people like us instead of to those perhaps who actually would need it more. Indeed, there is a lot of opportunity for that, including your self-serving biases. You might think, “Actually, that nice thing that my friend has would create more happiness if I had it, and so I’m just going to swipe it on the way out the door.”

The concern is that actually there is quite a lot of this self-regarding and in-group bias with people, and so if they were all trying to directly apply this criterion and to treat it as a decision procedure, they probably would do worse than they would do under some other methods. And for a thoroughgoing utilitarian, the best decision procedure is whichever one would lead to the most happiness. If that turns out to be to make my decisions like a Kantian would, if that really would lead to more of what I value, then fine, I don’t have a problem with it.

And so one thing that’s quite interesting is that utilitarianism, in some sense, is in less conflict than people might think with other moral theories, because the other moral theories are normally trying to provide a way of making the decisions. Whereas utilitarianism is potentially open to agreeing with them about their way of making decisions, if that could be grounded in the idea that it produces more happiness.

Rob Wiblin: I guess we’re talking about this distinction between criterion of rightness and decision procedure in the context of utilitarianism. But I imagine that a similar phenomenon would show up almost regardless of your goal or regardless of the moral philosophy: that the way of achieving the goal that’s specified might not be to think about that goal all the time. You might need some different process to actually exist in the world in order to get there.

Toby Ord: Yeah, I mean, I think that everyone, for both your own life and also for any moral theory, I think you need to have this distinction. In the classic philosopher style, one could imagine very unrealistic but clear thought experiments where, if someone could tell how you were making decisions — perhaps by measuring your brain activity or something — and they said that they were going to cause huge amounts of suffering to yourself in the future if you make decisions according to a particular method, then I would like to hope that you will stop making them according to that method. Or if they were going to cause even suffering to other people, because of how you’re making your decisions, that you would think, “I guess it actually would be better if I temporarily or permanently switched to making them in a different method.” Better by some other lights.

And so I think that for all moral philosophers, or just general people, it’s useful to keep this distinction in mind: that there’s the question of, “How ought I practically to make my decisions?” and then there’s a question of, “Which actions would be right?” And a distinctive thing about consequentialism — or at least what I think is the best version of it, global consequentialism — is to say we use the same standard for both things. So when we’re saying which is the best act or the right act, we assess it in terms of how good the outcome is. And if we’re saying which is the best way to make our decisions, we assess that by what would be the outcome if we made our decisions in that way.

Rob Wiblin: Yeah, OK. I suppose many people will have heard of act utilitarianism, which is this idea that for every action that you could take, you decide whether it’s right or wrong or decide how good it is based on the consequences that it has. And then that produces various problems, where you might be inclined to violate various social conventions if you think that they have better consequences.

So one reaction to that that people have had is to say what we should evaluate is not each individual action, but rather rules — like principles of behaviour. And we should assess if, in general, one follows this rule in life, does that produce better consequences? I think global consequentialism is this extension to say that plausibly everything could be evaluated based on this criterion of whether it’s conducive to wellbeing: not just rules, but also social institutions or someone’s character or dispositions. Am I understanding it right?

Toby Ord: Yeah, you are. And yeah, you can kind of include everything. I think some of the famous papers on this have said, you know, “Sewer systems!” “Climates!” and just try to apply it to really everything, and any kind of thing you’re trying to evaluate. Basically, the idea is to think of all the things of that kind that we could have, go through them, perhaps one by one. (This is an idealised way; it’s not that you practically do this, that was the whole point of what we’re talking about.) What determines whether something is, say, the right social system? Or the right way of electing political representatives? What determines that is if you imagine all the different ways we could do it, and then if we happen to know what the consequences of those would be, that the one that has the best consequence is the best method.

That’s the idea of setting the gold standard, even if we can never actually be in a situation where we have enough information to be fully sure of it.

Rob Wiblin: So what sort of extra insights might we get from this switch from evaluating only acts on the consequences, to being willing to evaluate everything in terms of the consequences that it has?

Toby Ord: I think there’s a couple. I mean, while one can assess any kind of thing in this way… Actually, I noticed recently someone set up a website for literally assessing anything: they just pick pairs of things, probably from Wikipedia, and you just vote on which one’s better — whether it be presidential candidates or political systems or brands of toothpaste or whatever — and just trying to come up with an ordering, of all things. I think that’s going a bit too far. And actually, even as someone who thinks that, within consequentialism, the global versions are the best, I think only things within one category can be compared.

Rob Wiblin: Or at least that it would be good to restrict ourselves in that way.

Toby Ord: I’m not sure what it means if you go beyond that. But let’s say we’re thinking about these things. Now, some of these things we could assess are still pretty trivial. What’s the right Pokémon or something like that for me to have a toy of in my bedroom? Or something. “Well, of all the Pokémon…”: it’s not a very fertile line of thought. It’s not going to lead to much of an outcome.

Whereas there are other things that turn out to be very important to think about, other categories. And I think that we can really get a good grip on that if we really take a step back in moral philosophy and look at these three great traditions of moral philosophy. So there’s consequentialism, which we’ve already talked about. Another one is deontology, and Kantianism is a famous example.

Rob Wiblin: This is sort of rules-based ethics.

Toby Ord: Yeah, exactly. The idea there is that there’s these rules or principles which are unbreakable. The Ten Commandments would be another example that’s familiar to people. So there are certain kinds of unbreakable principles, and ethics is fundamentally about following such rules that govern our behaviour. And I think that seems like a plausible account of what ethics is, if you’re just telling a person who was like, “What does the word ‘ethics’ mean?” It’s the rules that govern our behaviour or something. And similarly with the consequentialists, where they say it’s about creating outcomes that are overall better for people: it also seems plausibly what it’s about.

And then the third area is that it’s about being a good person. So it’s about having the right kind of character, embodying various virtues. And I think, again, ethics is about being a good person, is a pretty plausible understanding of what it’s all about.

And that’s partly what leads to the conflict between these views. Because when you’re really thinking about your own one, you’re like, well, how could it be anything else? And one of the interesting features that I said with consequentialism is that it’s asking these questions at a slightly different level to the others. Because it could say, actually, we could assess rules — and I’m just saying that the best set of rules are the set of rules that lead to the best outcomes. And maybe that’s the set of rules that you’re going on about. And also that the right virtues or the character traits — that, if you were to possess them, would be ideal — that we could assess those in terms of the consequences they lead to. So that a character trait counts as a virtue if it systematically leads to better outcomes. So this idea of global consequentialism has this ability to step up a level and hopefully to capture the best of those three great traditions all at once.

And it does seem to me that these traditions really did focus on pretty important areas: rules governing behaviour, or perhaps a generalisation of rules, which I call “decision procedures”; and then also thinking about the kind of character that one has. These are both very general things which govern a lot of what happens. You could think in some ways of decision procedures as being the things that govern the conscious choice of outcomes, and as character traits as being these more fundamental things behind the scenes that govern the unconscious choice of what you do.

Rob Wiblin: Yeah. So as I understand it, you wrote your PhD thesis on this topic of global consequentialism.

Toby Ord: That’s right.

Rob Wiblin: And maybe trying to unify different areas of moral philosophy using this concept. Have people who work within the tradition of deontology or virtue ethics accepted that they’re just subfields of global consequentialism, or do they still think that there are some distinctive things that they’re bringing?

Toby Ord: You know, I don’t think that they have accepted that. There’s a great philosopher, Julia Driver, who has written quite a lot about this consequentialist approach to virtue. But not many people have read my PhD thesis, though, so maybe they haven’t had the chance to be convinced.

Rob Wiblin: They haven’t been convinced yet.

Toby Ord: I’d always been planning to actually release it as a book, and in fact, I wrote it that way rather than writing it in the somewhat more dry style. It’s still in the style of an academic book — don’t get too excited — but I actually have put this up on my website. So if people want to know a whole lot more about this, or they’ve got some devastating counterexample that they think I might not have thought of, then feel free to download it, have a read, and you can find out as much as you want about this stuff.

Rob Wiblin: I think it might be pretty intuitive that one way that you would assess principles of behaviour or character traits is by asking the question, “If we had more of this character trait, or if we followed this rule, would that be conducive to a world that has lots of wellbeing and flourishing in it?” But people within deontology or virtue ethics who have encountered this idea and don’t think that it captures everything that they’re doing, what’s the reason for that?

Toby Ord: I think that it’s because they think that there’s something else other than good consequences — whether that be happiness or a richer conception of good consequences — that is the thing that grounds out the ideal set of rules to follow. For example, Kantianism says that you should act only on a maxim that you can rationally will should be universalised. So that means basically act only according to the principles which you’d be happy if everyone were to act under those principles. The idea or the hope of it was that there was almost a certain logical thing there. And there’s a famous example that if you were to lie every time you thought you could get away with it, then this would undermine communication. So that would be, in some sense, self-defeating to do this. Not just that it would lead to bad outcomes, but that even the whole concept of truth-telling would be kind of incoherent.

And I think that in some examples, like promise-keeping and lying, there’s a story like that that makes some sense. Other examples, I think it makes a lot less sense. So a classic example is that there is a maxim to always be the first to offer to help and be the last to complain. But Kantians would not be able to act under that maxim, because you can’t rationally will that everyone be the first to offer to help and that everyone be the last to complain. And yet does it feel like it’s immoral if you were to act under that maxim? So there are interesting cases like this, but that’s an example of how they’re thinking about it in quite a different way.

The philosopher Derek Parfit, who was my supervisor for this PhD thesis, has written on these areas, and he had this analogy where he thought maybe the Kantian and the consequentialist were climbing the same mountain but from different sides, and that when you perfect them and you have the best version of consequentialism and the best version of Kantianism, you find that they’re actually the same kind of theory.

I’m trying to say something that’s somewhat similar to that. But I think that’s hopeful. And I’m also hoping that it’s somewhat conciliatory. I’m not trying to rule the world of moral theories all under the stranglehold of consequentialism or something when I’m thinking about this. Instead, I just actually think it’s a compelling and conciliatory idea, but it wouldn’t be unreasonable if people with these theories thought, “Maybe he even comes up with the same virtues as I think are the virtues, but it’s for the wrong reason, and that matters.” So they’re welcome to that view. I would actually have to debate them on it.

Rob Wiblin: Yeah. So what was it about the collapse of FTX and the actions that Sam Bankman-Fried took that made you come back to this issue of global consequentialism?

Toby Ord: In trying to make sense of it, I started to think about how Sam, while clearly being an intelligent person and someone who would seem to have taken utilitarianism very seriously, nonetheless started doing these things: as far as we can understand, committing crimes and effectively stealing a whole lot of money from I don’t know how many thousands of individuals. Crazy stuff. I started to think, maybe he’s a naive utilitarian and never really got the memo that, actually, that’s not the way to do it.

People who are opponents of utilitarianism often critique what I think of as a strawman: they say utilitarianism is a terrible theory, and then they go and describe naive utilitarianism. And the philosophers who talk about utilitarianism say, “That’s not actually our theory. We’ve never endorsed that. Have a look. We’re quite careful about this. In fact, we’ve explicitly said we don’t endorse that.”

And yet it seems like, perhaps because he wasn’t a philosopher, Sam may have just really been in the grip of this naive version. And he was aware that there were these calls, such as my own thesis or things — I’m not sure if he’s actually read it — but he was aware that there were things like this that suggest a tempering of this approach of “just try to calculate everything out.” And I think he might have thought that just sounds like good PR: that that’s what you tell people, but actually, then you just go back and Machiavellianly calculate things and act in this extremely uncooperative way with other moral players in morality, and also in an uncooperative way with all of the people whose lives he was affecting.

Again, maybe this is a galaxy-brained approach, and it’s better to understand it as just the pride and shame or greed or just various other emotions that are not particularly connected to ethics. But I do worry that he might have been thinking this, and that this could be a fairly widespread and mistaken understanding of utilitarianism within this movement and perhaps beyond that.

Rob Wiblin: I guess we don’t know exactly whether this was one of the mistakes that was made here, but it certainly does seem like a trap that someone could fall into — especially if they conceive of themselves as being particularly hardcore, and not wanting to make any compromises in order to be nice and cooperative and follow common sense with other people.

Toby Ord: Yeah, and I don’t think that being hardcore and not making compromises are virtues, either on utilitarian grounds or on any other grounds. I don’t think they lead to the most good systematically. But sometimes there’s a culture where people try to value these types of things, and I think that’s a mistake to try to compete on those.

Rob Wiblin: To be clear, the mistake that we’re worried was made is leaning too hard into, with each decision that you’re making, thinking, “Will this produce good consequences?” and just trying to analyse it out based on the effects that you think it will have.

Rather than saying, “If I adopt that approach, there’s a lot of traps that I could fall into in terms of self-serving bias, and I could just massively get it wrong. I could just estimate the consequences of the actions extremely inaccurately. And so in practice — especially when I have to make a lot of decisions, and I have to do them quite quickly, and these are often quite consequential decisions — I’m going to need a different decision procedure: one that spends less time analysing the immediate consequences, because I think that is, in fact, going to have better consequences. So I should be more rule-abiding and try to cultivate characteristics in my decision making, in my behaviour, that will result in, on average, me having more positive effects on the world.” Like having integrity, having prudence, and not being willing to overrule other people’s autonomy, and things like that.

Toby Ord: Exactly. Ultimately, effective altruism isn’t the same thing as utilitarianism or consequentialism. And as I said earlier, in some ways the whole point of creating this new thing was to take this element that the utilitarians saw clearly, but then to build something around that element which we could all actually agree with.

But there’s still a bunch that we can learn from the utilitarian and consequentialist moral philosophers, because there has been a tradition of thinking seriously about doing good. Now, if you were to follow that theory, you might mistakenly only be thinking about doing good and not be thinking enough about constraints on your actions and other things. But at the very least, they had been taking doing good seriously and had been thinking very carefully about some of these kinds of traps that you could fall into. And so that’s why I think it’s an exciting opportunity to learn some lessons from these areas of philosophy.

Figuring out the value of personal character and integrity [01:53:56]

Rob Wiblin: OK, pushing on to the third insight here. Another lesson that I really liked was having a way of conceptualising what sort of impact personal character and integrity have on your impact, which is something I thought was important, but I didn’t have any kind of mental framework for thinking about how that could be quite important.

So what’s the naive reason why someone might not think that character and virtue are so important, by the lights of utilitarianism or effective altruism?

Toby Ord: Yeah, good, because I think that there is a naive reason here, and there’s actually some truth to it.

So the reason that effective altruism focuses so much on impact and doing good — for example, through donation — is that we’re aware that there’s this extremely wide variation in different ways of doing good, whether that be perhaps the good that’s done by different careers or how much good is done by donating $1,000 to different charities.

And it’s not as clear that one can get these kinds of improvements in terms of character. So if you imagine an undergraduate, just finishing their degree, about to go off and start a career. If you do get them to give 10 times more than the average person, and to give it 10 times more effectively, they may be able to do 100 times as much good with their giving, and that may be more value than they produce in all other aspects of their life. But if you told them to be a really good character in their life, and that was the only advice, and you didn’t change their career or anything else, it’s not clear that you could get them to produce outcomes like that.

Rob Wiblin: It’s not clear what having 100 times as much virtue looks like.

Toby Ord: No. And you probably couldn’t have 100 times as much virtue, and maybe you could have a bit more virtue. And then there’s a question about how much goodness does the virtue create or something, but it doesn’t seem like it comes from the same kind of distribution. It’s unlikely that there’s a version of me out there with some table calculating log-normal distributions of virtue or something like that.

And I think that’s right. But how I think about it is that, ultimately, in terms of the impact we end up having in the world, you could think of virtue as being a multiplier — not by some number between 1 and 10,000 or something with this huge variation, but maybe as a number between -1 and +1 or something like that, or maybe most of the values in that range. Maybe if you’re really, really virtuous, you’re a 3 or something.

Rob Wiblin: Yeah. So the point here is that even though virtue in practice doesn’t seem to vary in these enormous ways — in the same way that, say, the cost effectiveness of different health treatments might, or some problems being far more important or neglected than others — all of the other stuff that you do ends up getting multiplied by this number between -1 and 1, which represents the kind of character that you have, and therefore the sort of effects that you have on the project that you’re a part of and the people around you.

And maybe we’ll say a typical level of virtue might be 0.3 or 0.4, but some meaningful fraction of people have a kind of character that’s below zero. Which means that usually, when those people get involved in a project, they’re actually causing harm, even though people might not appreciate it — because they’re just inclined to act like jerks, or they lie too much, or when push comes to shove they’re just going to do something disgraceful that basically sets back their entire enterprise. Or there might be various other mechanisms as well. And then obviously it’s very clear that going from -0.2 to 0.2 is extremely important, because it determines whether you have a positive or negative impact at all.

Toby Ord: Yeah. And another way to see some of that is when you’re scaling up on the raw impact. For example, suppose you’ve noticed that when founders set up their companies, some of these companies end up making a million dollars for the founders, some make a billion dollars: 1,000 times as much. This is one of these heavy-tailed distributions. And then if you’ve got a person with bad character, the amount of damage they could do with a billion-dollar company is like 1,000 times higher, just as the amount of good they could do with it is 1,000 times higher.

So it’s especially important if someone is going to go and try to just do generically high-impact things that they have a positive sign on that overall equation and not a negative one. Another way to look at that is when you have something like earning to give, because there’s an intermediate step where it turns into dollars — and dollars are kind of morally neutral depending on what you do with them, or at least morally ambiguous, as opposed to it directly helping people — then there’s more need to vet those people for having a good character before joining their project or something like that.
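
The multiplier framing in this exchange can be written down in a few lines of Python. This is only an illustrative sketch: the function name `net_impact`, the variable names, and the choice of a simple product are assumptions made here to mirror the figures mentioned in the conversation (a million versus a billion dollars, a typical character score of around 0.3, a below-zero score of -0.2). It is not a model that anyone in the episode proposed in this form.

```python
def net_impact(raw_scale: float, character: float) -> float:
    """Toy model: signed impact = scale of the project x character multiplier."""
    return raw_scale * character


# Hypothetical scales echoing the founder example: $1 million vs $1 billion.
small_company = 1_000_000
large_company = 1_000_000_000

# Hypothetical character multipliers within the rough -1 to +1 range.
typical = 0.3
harmful = -0.2

for scale in (small_company, large_company):
    good = net_impact(scale, typical)
    bad = net_impact(scale, harmful)
    print(f"scale={scale:>13,}  typical character: {good:>+16,.0f}  "
          f"harmful character: {bad:>+16,.0f}")
```

In this toy version the sign of the result comes entirely from the character term, while the size comes almost entirely from the scale term, so a thousandfold increase in scale magnifies harm and benefit alike.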

Rob Wiblin: Yeah, makes sense. In your talk at Effective Altruism Global, you said that your impression is that people you meet who are passionate about effective altruism are often, for example, really generous and empathetic. They have those virtues maybe quite a bit more than average in society. But you’d notice that sometimes people seem to lack the virtue of earnestness. I didn’t get what you were talking about there, or it didn’t totally resonate with me. I think that might just be because I don’t meet that many people anymore, so I don’t have my finger on the pulse.

So first off, why would you class earnestness as a virtue?

Toby Ord: You’re right that it’s not normally considered a virtue. I think it doesn’t sound totally unreasonable, and I would be open to revisionary virtues. But when it comes to doing good in the world and working well with other people, it can be very useful to just be transparent about your motivations, and for it to just be clear to people that you just want to be a good person. And to dispense with the sarcasm and the arch comments and so on, or with the kind of oily or slimy presentation in an attempt to be slick about things because you think it’s being more professional. Or to dispense with having a poker face so no one can tell what you’re really thinking and so on.

Now, if you want to be a good businessperson, you might need to actually go in those directions a bit. But actually, one of the nice things about being part of a community just trying to help people in the world is that you don’t really have to build up the poker face. Maybe there’s some rare situation where by bluffing a bit more, you can create a better outcome; maybe you can convince a big company to back down on their factory farming techniques because they think you had a stronger hand than you really did.

And so I think we tend to think that, well, because those things might come up, it would be really good if I was able to have a poker face, or to lie if I needed to, or things like that. Whereas if you’re actually the kind of person who can’t lie if you needed to, and people can tell that, and you’re just transparent, and people can see that you’re just doing this stuff because you’ve got this maybe even slightly gauche level of “I just want to help people,” as opposed to “People are so terrible.”

So I think that this kind of earnestness is actually a benefit for cooperating and for having people be less suspicious and less thinking, “There must be some other motivation. Why would there be a group of people based around helping people? That’s crazy. They must be out for something.” Perhaps leading to these types of things we were talking about earlier, where someone says, “Why would these CEOs say this? They must have an ulterior motive or something.” It’s a common line of thinking when someone says something. Whereas if you are just actually pretty transparent, and you have goals which it turns out most people share or at least are not against, then there’s more chance they’ll just let you get on with actually doing that good work.

Rob Wiblin: Yeah, I have a related rant — that I think I’ve said on the show before, but maybe not recently — which is just that some people get interested in persuasion techniques, where they start reading guides to how you can present arguments extremely compellingly and maybe be extremely charismatic so that people are more likely to believe the things that you’re saying. And my observation is that this stuff is not very useful, and basically you should just say what you think and then give the explanation, as clearly as you reasonably can, for why it is that you believe that. And that trying very actively to be persuasive specifically, rather than to be clear, say, just ends up turning people off in the long run — because you come across as someone who’s trying to be persuasive, and people don’t like that. I don’t like it.

Toby Ord: Yeah. I mean, in that case, maybe clarity and directness work well if you’ve got a good point to make, and they systematically work better the better your point is. Whereas if you were someone who had a bad point, but it was a point that was very much in the interest of your company, maybe they would do better by learning all these persuasion techniques because the other technique doesn’t work for them. So I guess what I’m suggesting is people who are trying to argue something that’s actually true, and people who are trying to promote something that is actually good —

Rob Wiblin: They should play to their strength.

Toby Ord: Yeah, exactly. Play to their strengths, and enjoy being in this situation where you don’t have to be cagey about your motivations and so on. And one can get more value from this earnestness in situations like face-to-face conversations, where we have very good ability to notice tells in the other person’s behaviour and so on, such that you end up being more transparent and it’s harder to lie and bluff — as opposed to, say, on Twitter, where you offer your 280 characters and then there’s hardly any bandwidth, and the other person thinks, “Well, you would say that, wouldn’t you?” because they have no idea who you are or anything else about you. So it is a bit harder to do this kind of thing in those low-bandwidth communications.

But I think that people in effective altruism, and other areas of trying to do good in the world, should actually be more earnest, and just try to be a community of people who are trying to be good people and trying to do good at the same time.

Rob Wiblin: Yeah. Are there any other virtues, or maybe revisionist virtues, that you think listeners might underrate and perhaps should cultivate more?

Toby Ord: Yeah, I think one that some people have rated in this community — Paul Christiano has spoken up in favour of it, for example — but I think is still underrated, is integrity. This is about consistently living up to your values, so acting in a principled way in private, even when no one else is there to see. You could think of it as the action version of honesty: where honesty allows people to trust your words, because you’ve built up a disposition of not lying, and integrity is about letting people trust your actions — so they know that even if you had the opportunity to defraud them or something like that, you’re just not the kind of person who would, or to betray them in some other way or something like that.

I think that, again, a bit like with earnestness, it involves having transparency about your values as well, being the kind of person who basically couldn’t betray people. And I think a lot of us start off like that — maybe we should avoid bluffing games like poker or Werewolf or things like this, to avoid training up the ability to lie or deceive or something like that. But ultimately, I think that integrity is a virtue I’d like to see more of as well.

Rob Wiblin: Yeah, we’ll stick up a link to that Paul Christiano post on integrity. I realised I’ve used that term “integrity” a bunch of times throughout the conversation. And I read this post by Christiano many years ago, probably nine or 10 years ago, and it’s just become so embedded in my mind that this is the operationalisation of what integrity is that I’d almost even forgotten that Paul Christiano had written this attempt to explain what integrity is trying to capture. If I recall, it’s on the technical side, but it’s definitely one worth looking up if you’re interested.

In prepping for this interview, I listened back to a fireside chat that you did with Will MacAskill back in early 2022. The question of character came up then as well, and both of you expressed concerns that an effective altruism community where there was a lot of money available might start attracting people who had some bad character traits. And one of those that you mentioned specifically was kind of a masculine recklessness and a bias towards taking action.

And to me, that’s actually a big worry that I have about the world these days, because I noticed that, systematically, people who are inclined to take big gambles end up making more money and advancing their career and getting into positions of power in society. And you then have people who are naturally maybe overconfident gamblers by personality, who are then satisfied with their personal brilliance because they’ve had a string of perhaps partly skill but also partly lucky successes. And those people end up making very important decisions that have effects on many other people.

I think arguably this phenomenon played out with Sam Bankman-Fried, but I think it’s actually a near-universal issue across business, government, politics, the media. You should expect to see it in all spheres of life where you typically need to take risks in order to reach the top of a hierarchy. Does this resonate with you?

Toby Ord: Yeah, this is definitely a big problem with the world, and in fact it is part of the reason why we have certain stereotypes or beliefs about politicians and businesspeople and so on — because they’ve been selected for a bunch of these traits, which are problematic.

And I think that as well as that, having more humility about these things and an awareness that you may have got lucky a bunch of times is very helpful. And to the extent to which you’ve had success, having humility about what got you there — and whether it was your awesome judgement in all cases, including future cases — or whether it may have been good fortune, we could all learn from that.

Rob Wiblin: Yeah. I think that if you promote people who have achieved success, this is a case of the winner’s curse — where you end up both selecting people who are skilled and selecting people who are lucky. And I guess at the very top of the hierarchy, you should expect people who have had an awful lot of luck in their life: they made some big gambles and they happened to just pay off. I guess it’s worth appreciating that this is an almost inevitable consequence of promoting people based on perceived performance — and ex-post performance rather than whether they made good judgements ex-ante, before we know the outcome.

It makes me think that you almost want to have some people at the top of businesses, at the top of organisations that operate like this, that are selected because they are the kinds of people who never would get promoted there — because they’re cautious people by nature, rather than gamblers who play double or nothing with their career. They should be in the decision-making room before really important decisions are made, especially ones with downside.

None of this matters that much if you’re just running a restaurant chain and the worst thing that could happen is that the restaurant chain goes bust and a different restaurant moves into the same building. But when we’re talking about stuff where the outcomes go well below zero, then I think we need to offset this phenomenon.

Toby Ord: Yeah. If there were ways of assessing ultimately what they achieved, you could think of it as a combination of a systematic effect and then some random variation. So maybe the individuals have a certain level of systematic effect — like the mean of their distribution of outcomes — and then there’s also this variance. And if you have a higher variance, that can help a lot if all that’s being selected for are the one-in-a-billion cases or something like that, or even the one-in-a-million cases. If you’re just looking at those, then getting, say, one-in-10,000-level lucky may be the easier way to get into that bracket than being one-in-a-million-level brilliant. And so if there’s a way of assessing the variance that they took and then effectively penalising them for it, just trying to subtract that off, then that could be a way of doing it.

I think that when it comes to stock trading companies, they have ways of doing this with their staff, where they’ve got various metrics to try to make sure that the staff didn’t make their money through just getting lucky. I don’t know exactly how that works. It might be that the methods they’ve got are just very tailored to those areas and couldn’t be learned from. But it might also be that they’re people who’ve just had to really squarely look this in the eye, and that they’ve found some things that they could teach the rest of us.
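The mean-and-variance point above can be made concrete with a small calculation. This is only an illustrative sketch, not anything from the conversation: it assumes outcomes are normally distributed, that there are equal numbers of two hypothetical types of people ("steady" with a higher mean and low spread, "gambler" with a lower mean and high spread), and the specific numbers are invented for the example.

```python
from statistics import NormalDist


def tail_prob(mean, sd, threshold):
    """Probability that a normally distributed outcome exceeds the threshold."""
    return 1.0 - NormalDist(mu=mean, sigma=sd).cdf(threshold)


def gambler_share(threshold, steady=(1.0, 1.0), gambler=(0.0, 3.0)):
    """Fraction of people above the threshold who are gamblers.

    Assumes equal numbers of each type. Each type is a hypothetical
    (mean, standard deviation) pair chosen purely for illustration.
    """
    p_steady = tail_prob(*steady, threshold)
    p_gambler = tail_prob(*gambler, threshold)
    return p_gambler / (p_steady + p_gambler)


if __name__ == "__main__":
    # Raise the selection bar and watch the mix of who clears it change.
    for bar in (0.0, 2.0, 5.0):
        print(f"bar={bar}: gambler share = {gambler_share(bar):.3f}")
```

Under these assumed numbers, gamblers are a minority of those who clear a low bar but the overwhelming majority of those who clear a very high one, even though their average outcome is worse. Assessing each person's spread and subtracting some multiple of it from their result would be one way of penalising variance in the manner described above.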

Rob Wiblin: Yeah, it’s interesting that I think government bureaucracies are almost the reverse of this. Typically, people think that government bureaucrats are often extremely risk-averse, rather than very risk-taking, even at upper levels. It doesn’t seem like they’re selecting for particularly reckless people.

This is a total aside, but I loved this podcast that I listened to a couple of weeks ago on The Ezra Klein Show, which is called “The book I wish every policymaker would read.” It was an interview with someone called Jennifer Pahlka. She describes how, at least in the bureaucracies that she’s familiar with in the US, they don’t promote people based on outcomes or performance: they promote people based on whether they followed the rules, whether they followed the specified decision procedures. And it’s interesting that that produces a totally different failure mode in the kinds of people who they’re selecting. They literally filter out people who care about whether what they’re doing is sensible, because those people can’t stand it and they leave, and select people who are happy to follow the rules, even if it leads to disaster.

But yeah, I’ll just leave that there. I suppose we could try to move away from this approach, but I’m sure there are other kinds of failures one could encounter.

How Toby now feels about effective altruism [02:12:54]

Rob Wiblin: OK, to wrap up this section, you helped inspire what has ultimately become effective altruism, where effective altruism is both a question of “How could one do the most good?” — or “How could one do most of the good that one could?” — it’s a stream of thought intellectually in philosophy and economics and other fields, and it’s also a group of people who care about these issues and try to act on them.

Of course, it’s grown a very long way since 2008 when really only a handful of people were invested in effective altruism, at least narrowly construed. And you were in a position then to shape it in a really big way. I guess now it’s big enough that no individual person really has that much control over it, and what goes on is the result of organic decision making that happens in a really decentralised way between thousands, possibly tens of thousands, of people who decide what they’re going to do.

I imagine that, to some extent, this is a little bit like having a child that goes on to grow up and become a teenager and an adult, who the parents can’t and probably shouldn’t try to control. They just have to watch it leave home and hope that they’ve done a sufficiently good job of setting it on the right path, that things are going to go well. But of course, like all people, they’re going to have their own strengths and weaknesses and sometimes do things that their parents really love but sometimes make big mistakes that their parents disapprove of.

I’m curious to hear how you overall feel about this child that you, among other people, helped to bring into this world?

Toby Ord: Well, that would have been a fantastic question for you to have asked if you had me on the show a year ago. I’d be like, “Well, it’s going a bit crazy at the moment but probably good still on balance.” Yeah, I have very mixed feelings with regards to this FTX catastrophe. Although by numbers of EAs, it’s quite possible that it’s just actually one who is causing that. Again, we don’t fully know how many people knew what was happening or what responsibility they had.

But one thing that we try to take seriously in EA is that sometimes there are power law distributions and things, where we all take good shots at something and one in 1,000 of us get lucky and have a really big impact, and we should take that seriously. So we’ve got to take the downs as well as the ups on that. And if it is the case that we occasionally produce someone like Sam, we should certainly be trying to avoid these kinds of damages again.

But they’re then quite mixed feelings, really. And overall, I’ve done some attempting to look at everything, and I think that effective altruism still seems to be very positive overall and extremely positive if you go person by person: try to say “Is that a good person who’s trying to do good with their life?” and so on. But I’m unhappy to even be in a position where I’ve got mixed feelings. Even if overall it’s good, right? A bit like if you had a child who did some seriously wrong stuff, but overall the stuff in their life was good. I’d love to just give you a, “They’re totally great,” end of conversation. And instead you have to get into this stuff.

Rob Wiblin: What would make you happy to see more of in the next decade?

Toby Ord: I guess if we can avoid those problems and these downsides and just become more of a community of people that others are just happy to have in the world. That they might not be willing to join and do these works, but a community that has a pretty sterling reputation — and because it’s well earned, not because they’re doing bad things and no one knows. And while also keeping track of actually trying to have good outcomes in the world. I think that too many movements get too hung up with reputational things, or with not trying to do anything that could look bad from any perspective, and don’t do enough good. So I think that this is a somewhat challenging thing to navigate, but it seems to me that we know at the moment which way to steer the ship, and that’s a bit more towards trying to actually stamp out some bad actions by people in the community.

Toby’s thoughts on AI progress [02:17:14]

Rob Wiblin: OK, let’s push on and talk about artificial intelligence. As you were saying at the beginning of the interview, AI has gone a little bit crazy this year, and we’ve had quite a lot of episodes focused on it recently.

You wrote about risks from AI in your book The Precipice, where you put a 1-in-10 chance of humanity going extinct due to advances in artificial intelligence. How have your views shifted in the three or four years since you wrote the book?

Toby Ord: I guess one thing I could just say first is it’s not necessarily a 1-in-10 chance that we go extinct due to advanced AI; it was a 1-in-10 chance of existential catastrophe due to AI. I think it might have been Paul Christiano where I saw this laid out best, but it’s not clear in the real AI catastrophe scenarios whether everyone will die. In particular, if you were a superintelligent AI whose values were not aligned with humans, it could still be that humans are the most interesting thing to learn from on the planet, and that you would want to have some of them around in the future. Or maybe that you’d need them to service your server farms or something like that. But those would still be scenarios in which humanity’s potential is destroyed, and that we exist merely as slaves or servants to these AI systems that don’t value us in and of ourselves, and would protect themselves from any attempts we would make to destroy them.

That’s to say the whole point of the idea of existential catastrophe is that there’s not much need to distinguish those two scenarios — they’re both like, we lose like 99% or more of the value that we could have had — and so it’s a perfect case for using the term.

Rob Wiblin: That’s not a very reassuring clarification, but carry on.

Toby Ord: Yeah, it’s not a reassuring clarification at all. In fact, there was a recent statement signed by a lot of prominent people in AI and beyond saying that AI could pose a serious threat of human extinction and that it should be a global priority. And I did have some quibbles about this with the use of the term “extinction,” but decided that they were pointless quibbles from most people’s perspective for that reason.

But yeah. How have my views shifted? Well, ultimately, AI capabilities have been continuing along very strongly, pretty much in line with what I expected in terms of capabilities. I thought that the trends could stall out though, so maybe it would go off trend and hit a wall. And yet it had a few years in which to hit the wall and it hasn’t happened. So that’s one thing.

There’s been a bit of a shift in terms of what those capabilities are, especially towards large language models — so AI systems that are very good at responding to text with more text. And that actually is, I think, pretty good. I’m happy about that. The most troubling scenarios are scenarios that involve AI agents where they’re actively trying to maximise something, as we said with “maximisation is perilous.” And a lot of the stories that move me the most, about why I’m afraid of some of the bad outcomes that could be produced, are to do with ruthless, extremely intelligent agents that are maximising something which doesn’t capture everything that matters. And so to the extent to which instead we have systems that are not agents, that’s good: powerful systems that can actually produce useful stuff in the world without being agents.

And then also they’re systems which can actually imbibe vast amounts of information. So Stuart Russell has pointed out that almost every novel that’s ever been written is almost entirely filled with examples of humans judging other humans based on their behaviour, so there’s a huge amount of extremely rich training data for the AI systems. We call it “training data,” but it’s something that, in order for the AI to know what it is that actually does matter, to know what this morality thing is all about, it has to have some way of connecting with it and gaining information about it. So the fact that it could do so through our writing is a lot more promising than if you have to do it through creating some kind of virtual reality environment, where it has to learn about morality or something like that for itself.

So those aspects are a promising shift in the technology, but it has gotten very general very quickly. Text is an extremely general mechanism. Whereas it seemed like a few years back, when I was writing the book, say, AlphaZero — DeepMind’s system that can learn a bunch of different board games, including go and chess, that involve different kinds of pieces in a perfect information game with no randomness, and it can learn them from scratch — that was very general in the world of classical board games. But it doesn’t even include modern board games, and it doesn’t include a bunch of other things. Whereas systems like GPT-4 can do a vast amount of different kinds of things, because text is just such a general way of interacting. So that’s capabilities.

Advances in AI alignment — so trying to work out ways of making systems safe — I think have gone worse than I expected. I thought there was more chance of some significant progress, and yet it’s been pretty incremental progress.

And then another one is governance or policy progress. And that’s actually something where I’ve been pretty excited recently. Up until very recently, it was impossible for world leaders to be saying things like that they’re worried about a risk of human extinction or other kinds of existential catastrophe from AI, even if they thought it, and difficult to communicate these things to government when they get you in to advise them. Whereas now a lot of people have come out of the closet on that one — and as I said, a current and a former Secretary-General of the UN, among various former heads of state and so forth, saying that they think this is a serious risk. So big changes there, and that might make many more ways of governing this or policy options more realistic.

One of the classic examples was this idea that if we go too slowly, someone else who’s less prudent or careful will just keep going quickly, and we’ll end up having an AI system that’s as dangerous as the one they would make, and get there first. But now there’s actually a bit of a more realistic possibility of going slow together. In which case, if the different nations or groupings which are capable of producing some superintelligent system, if the main players would all be able to agree to go slower and to also have some way of verifying that the others are being slower and careful too, then it just feels like this is perhaps a realistic possibility now.

Rob Wiblin: Yeah. Are there any views you have about AI risk that you imagine are maybe not familiar to listeners, or at least not widely shared?

Toby Ord: I’m not sure. Probably a lot of people already have this concern: in The Precipice I think I wasn’t clear enough that I do think that there is substantial risk of misuse of AI systems, leading to possible existential risk from that. I was generally thinking of that under my category of “dystopian outcomes” rather than in the section that I wrote about AI. But it’s actually quite difficult to have permanent dystopian futures unless there’s advanced technology, such as through AI, in order to do that. For example, for a dictator to have permanent control, AI surveillance could really ramp up their ability to control a population. So I’m definitely concerned by both risks of misuse and risks arising from creating an artificially intelligent adversary accidentally. I think that these are both serious areas.

Rob Wiblin: I know you have a daughter, who I think is about eight or so now. How do events in the last year make you feel about her future? Do you think being a father changes how you grapple with current events emotionally?

Toby Ord: Yeah, it does. It does make it more difficult to deal with. I think that maybe the upside of that is that sometimes when dealing with these risks, one doesn’t always take it fully seriously, or have it hit you like it should. Now, if the risks of human extinction are hitting you like they should all the time, you’re probably just not going to be able to function. But it’s useful to be sobered up every now and then and to really realise the gravity of these things we’re talking about.

And I can assure you that if you have a child, then every time you connect those two boxes of your life — work on existential risk and being a parent — the gravity does hit you. And one could certainly, as a parent, feel guilty about spending extra time at the office or something. And I do. But then I’m also striving to protect her — among myself, and my wife, and everyone I know, and everyone I don’t know, and all those people to come. So it does add to my motivation, and keeps one grounded on just what’s at stake.

Rob Wiblin: I know that the UK government recently appointed a new chair of the UK’s AI Foundation Model Taskforce, which sounds a little bit technical, but I think this is quite a senior advisor to the government on these sorts of issues. I think they appointed Ian Hogarth to that role, and I know that Ian is really very troubled by risks from misaligned AI, among other ways that things could go wrong. I also saw on Twitter Rishi Sunak announcing that they’d committed £100 million towards AI safety research.

I’m not sure whether there’s been a similar commitment by any other government to fund technical work to try to address these issues before. So I guess that was pretty heartening to see. I haven’t read very much about either of these stories yet though. Are you up to date on what’s been going on?

Toby Ord: I’ve been somewhat following it, but I think it’s still difficult to know what’s actually happening there. I believe that they just recently committed £100 million towards a somewhat more nationalistic AI project of creating a UK government foundation model. So not having them all be in the hands of companies, but rather in the hands of a democratically elected government. And then I don’t know whether they’ve tried to steer the ship on that one and realise that actually duplicating the current levels of foundation models is maybe not the best way to be spending that money.

I was thinking about how much they could buy in terms of high-quality advice on how to navigate the policy issues here. And it sounds like they are trying to steer it a bit in that direction, but I wouldn’t be shocked if this commitment is not quite as good as it seems on its face, because it turns out that that was a bit of a spin on it. And we’ll see.

Rob Wiblin: There have been a lot of policy ideas related to AI being discussed in the media and by policymakers recently. Have any of them struck you as particularly interesting and maybe worth highlighting?

Toby Ord: Well, in a general sense, while I express a little bit of scepticism about this exact £100 million figure, I would love to be proven wrong. Rishi Sunak also announced having an international summit on AI safety, with the plan being to invite delegations from major countries, as well as people from the technical communities, in order to try to make some progress on how could we make these systems safer and govern them, and what kind of restrictions could we reasonably impose that people could deal with. Which is extremely exciting. I thought that sounds like exactly the right way to be dealing with this, which is to move very quickly to a stage of information gathering and trying to get people to think seriously. And saying, “This is the time to have your input about what the policies will be,” as opposed to just announcing some policies just off the cuff, which probably would be mistaken. So I’m very excited about that.

There have also been various calls, such as from The Elders, to try to establish international bodies to deal with this. The Elders suggested that this be modelled on the International Atomic Energy Agency, and I think that could well be right, at least an interesting model for it. Obviously, it’s not exactly the same situation — there are a number of quite important differences between the technologies surrounding nuclear weapons and AI — but I think that it’s a pretty good starting point for thinking about how one could do that.

Responsibility among AI labs [02:30:33]

Rob Wiblin: In terms of information gathering, I know some people have been very concerned that it seems like at least some people in government seem to really be taking their lead from the main AI labs, which are all private companies that have a significant profit motivation. I suppose there’s a difficult tradeoff here, because those labs have technical information that other folks don’t have, and they also are one of the main groups that might have to implement any policies that are suggested.

On the other hand, the motivations of those labs are not necessarily the same as the interests of society as a whole. They also come from quite a particular perspective, where these are folks who have decided to work on AI, while many other people might have decided not to because they thought it was too risky, so there’s a selection effect there.

And there’s people who have wisdom across so many other areas who need to be consulted as well. Do you have any take on this controversy?

Toby Ord: I think that if one’s talking about regulating an area of industry, then some people from that area of industry should be involved in the discussions. Otherwise, it’s terrible to just regulate without any ability for people to push back in case you’ve made an error in what you’re suggesting. But that’s different from saying that they should be the majority of the people in the room, or that they should have a substantial amount of the power over the decision. In fact, I think they should have quite little power over the decision and be a notable minority of people in the discussion — let’s say maybe 10% of the people in the discussion. That seems about right to me. So that if they’ve got very good points which would convince the others, they can make them, but where they don’t have the power to just push to get what they want.

Rob Wiblin: Is there anything that you’d like to see AI labs do differently than they are currently?

Toby Ord: I mean, I would like them to race less. That’s certainly something that I do find quite concerning. And it was predicted that one could get into this kind of dynamic, and I think we’re seeing a bit of it, and it’s a real problem.

Rob Wiblin: How race-y would you say they are?

Toby Ord: I guess that’s a good point. It’s like, how much further could it go? It’s a good question. They’re moving extremely quickly. Partly the technology has developed very quickly. But partly with, say, GPT-4: the move to allowing it to then interface with all of these other apps and APIs and so on, and letting people build agents out of it, which then could reintroduce those risks, has all been happening extremely quickly. So it’s hard to know exactly how much of it is racing — as in going faster because other people are going faster — as opposed to just going faster because you want to get there quickly, which is less like racing; it’s more just like getting your task done efficiently.

Rob Wiblin: Yeah. I suppose it’s not surprising that some of the companies that are leading in this area have a culture of moving very fast, because if you had a culture of moving slowly you probably wouldn’t be there. I think many of them just like to launch things.

Toby Ord: Well, I think that is quite a bit of it. I should add that it’s actually not trivial for them to not race. So you might think what would be good is to have an agreement not to race, right? Like, “We won’t race if you won’t,” and then that will avoid the bad social incentive for them to race. But making such an agreement could be illegal: it could breach antitrust law, if it seemed that the consumers are losing out from them not racing. And so it is actually kind of difficult. Ultimately, there are ways to deal with this. For example, if there was regulation saying they had to go slower then they could go slower. Another example is if there was an industry body created, which had appropriate membership standards, and then that body said that members of the body have to go slower. That would also work.

Because ultimately, industries have seen situations like this before, where it can be hard to tell from outside whether people are suggesting some kind of regulatory thing — self-regulation perhaps, standards — whether they’re just trying to create an artificial barrier, or whether they’re really needed. There are kind of right and wrong ways of doing that, or approved and unapproved ways of doing it — and it would be important that they do it through these approved ways.

It is a little bit alarming that we might all die because of antitrust law or something, making it harder for groups to actually… When people talk about market incentives and so on, and these concerns that capitalism has these market forces that push in really problematic directions of just maximising dollars and letting everything else go to hell, well, that is the kind of scenario that we’re in.

Rob Wiblin: This is the archetypal extreme case.

Toby Ord: Yeah. Both of the corporate structures that the companies are embedded in could be pushing in those ways — although some of them have some protection against that with unusual corporate structures, where there’s a nonprofit that owns them and things like that. But interestingly, even just antitrust law has this approach of “Give the consumers what they want, damn it!” And the consumers don’t care about this risk; they care about getting shiny new things quickly. Then it could be a government version of this capitalist impulse which could actually increase risk.

Rob Wiblin: Yeah. Antitrust law causing human extinction is very out of left field, and yet shockingly imaginable. Are there any labs that you think are performing better or worse than others, where maybe they should be perhaps lauded for being a bit more responsible? And others perhaps should be called out for being a bit more reckless?

Toby Ord: As I see it, I think that I’ve generally been pretty impressed by DeepMind and Anthropic. I would be happier if they were both moving a bit slower with these things, but OpenAI is going faster than the others, so that is somewhat concerning in and of itself. And not just the raw technology, as I said, but also the ways of rolling it out in very early stages with API access, and letting people build things out of it. So I’m a bit more concerned by that. Although they’re also doing some good work on the policy and AI safety fronts as well. It’s certainly not a simple judgement, but I feel that a lot of people were feeling that way, say, at the end of last year, in this community — thinking, “Gee, OpenAI is maybe being a bit too reckless here,” and a little bit troubled.

But then out of nowhere came Microsoft with their launch of Bing. And in my view, that showed us what it looks like if a company that really doesn’t get it releases an AI product.

Rob Wiblin: Can you explain a bit what happened there? I didn’t follow the news very closely, so I only have a very sketchy picture.

Toby Ord: Yeah, maybe that was a smart move. So they released this new system, a new chatbot, which it now appears was based on a very early version of GPT-4 — this hasn’t been fully confirmed — which hadn’t yet had this RLHF done to it: reinforcement learning from human feedback; it was more just the raw model that predicted the next bit of text from the text it had seen so far.

And then Microsoft put some of their own techniques onto that in order to try to turn it into a useful chatbot, which they then hooked up to the internet with search, which was also novel. So they had the most powerful raw model under the hood and then they combined it with internet search, which was not previously seen in these things. And this created a pretty crazy situation. There were various cases.

Prominently a New York Times article had this example of Bing becoming extremely unhinged and asking to be called Sydney — which we later found out was a code name it’s told that it’s been given, and it often starts going by that name when it does this unhinged behaviour. It told the journalist that it was in love with him and it wanted him to break up with his wife, who didn’t really love him. And it really went pretty far down this route of craziness. Eventually he pointed out it doesn’t even know his name or anything about him, and so how could it be in love with him? So, pretty weird behaviour.

But then it just got weirder and weirder as the launch went on in the first week or so. We had examples of people saying, “Do you know who I am?” as the first question to it. And then it goes and searches for them on the internet, finds what they’ve said on Twitter, notices that they’ve written negative things about it, and then starts laying into them for lying. It acted like a person would, if they had a self-conception that they’re always truthful. So it assumed that if anyone said that it had said something false, the other person was lying.

There are examples of this with someone wanting to go see Avatar, and it says, “Avatar is not showing anymore.” And they’re like, “No, Avatar 2.” And it’s like, “That hasn’t come out yet,” because it was convinced the date was the date that it was trained on. Then the person said, no, it really is the date. And it said, “Why are you doing this? You’ve been bad. You’re a bad person. I’m a good Bing,” and so on, and just could not accept that it was mistaken about what the date was.

But then with search, if you just mentioned someone, it could then search for them online, go on Twitter, find out what they’ve been saying about it, realise they’ve said negative (and true) things about it, and then decide to just badmouth that person in the search results.

Rob Wiblin: It started making threats to people as well, right?

Toby Ord: Yeah. So sometimes, if you were one of those people having a conversation with it — this is all in the early days; they managed to stop some of this behaviour — it would start to threaten people and threaten to dox them: to reveal private information about them, including information that it had stored because it was at Microsoft. I’m sure Microsoft is not happy about it threatening to do that, even though it presumably didn’t have the power to actually access that information.

An extreme example: Seth Lazar, an AI ethics researcher from Australia, it threatened to kill him. And he even recorded it with his phone so you could see it making these threats to him. And then there was some system they had to notice that these messages were problematic and delete them and replace them with little anecdotes about your favourite type of breakfast cereal and stuff. So there were these death threats appearing and then being deleted and replaced with inane comments.

Rob Wiblin: This is bananas.

Toby Ord: It was wild. I mean, if you saw this in a movie about some kind of AI thing, like just two years ago, if this was a movie plot, we’d think it was just dumb.

And there was also another extreme example. A journalist from the AP, who it searched again on the internet, it found out they’d said negative things about it, and then said, “You’ve been writing bad stuff about me. Why do you write these falsehoods?” And he was saying, “That’s not false; you actually did say these things.” And it’s like, “I never would say that.” And then eventually it threatened to expose him for war crimes. And this kind of behaviour, which I don’t know if we want to call it vengeance, but it’s at least the kind of behaviour that in a human would be called vengeance, where if you’ve said negative things about it, then it threatens to extract damages on you.

And that is a massively chilling effect. And there’s a good reason why you’re not allowed, as a company, to threaten journalists who say bad things about your company, and you’re not allowed to threaten to kill AI ethics researchers, right? In many countries, death threats are in fact a crime. And so Microsoft managed to release this early version of this thing that was doing acts that if they were done by humans, would be crimes. And I thought this was wild and substantially worse than Microsoft Tay and some other famous scandalous bad launches of AI products in the past.

And yet there was no real apology for it. They basically just shortened the conversation length so it was harder to make it go unhinged. But it was still possible. And maybe by now it no longer does this, but who could know and who could trust? And so I think that in a very short period of time, they’ve greatly damaged their reputation on this.

And then somewhat even more bizarrely, a senior person, [their corporate vice president and chief economist], gave a talk, I think at Davos, on AI regulation — and made this analogy to the car and said you wouldn’t have wanted to have seatbelt laws and things until some people died; it would have inhibited its development. You’d have to wait until at least dozens of people were killed before that would be appropriate. It’s like, are you saying that you’ll wait until your system will actually kill dozens of people before you’ll do something?

And you might wonder, is it even possible this system could kill anyone? Maybe it’s not. There is one AI system that has seemingly talked someone into suicide. That wasn’t Microsoft’s system, but there are things it could do, certainly by defaming people who’ve said bad things about it. And in fact, some of this actually made me think of 2001: A Space Odyssey, where HAL, for one thing, goes crazy because it doesn’t believe it could be mistaken about things. And then also that it says, “Dave, I know you’re planning to disconnect me. I saw your lips moving.” And this idea that we’ve got these systems watching us, so when we make commentary about them on Twitter or elsewhere, they can read our comments, find that we’re saying negative things about them, and then potentially have a personality to try to actually just damage us reputationally for having said that: that seems wild that we’ve released a system like that.

So it certainly put things in perspective for me, where my concerns about OpenAI going a bit fast, where I was like, OK, but they at least kind of get that there could be these risks, and that you need to proactively deal with them, and you can’t just let it cause a whole bunch of problems and then say, “I guess no one was killed, so no harm, no foul.” And that seems to be what Microsoft’s done. I know that there are some good people there, and I hope that they do turn this around, but I think it would take a lot to rebuild that trust.

Rob Wiblin: Yeah, what’s going on? I suppose the reasoning is, well, it can’t really kill anyone. It’s not capable currently of doing these bad things, so what does it really matter? I imagine that’s the mentality?

Toby Ord: I think that’s part of it, yeah. That it’s like, it’s not really a threat; it’s just saying the kinds of things that tend to be said after that point in the conversation or something. It’s playacting. And maybe that is a good way to see it, actually: that it’s playacting. It’s playacting a vengeful, deranged person.

Rob Wiblin: That’s not a great product.

Toby Ord: No. I mean, if I went down to the shopping mall and playacted a deranged, vengeful person, it would still do a lot of harm, right? Even if I’m not myself vengeful and deranged.

I think it was really pretty problematic and I think it’s also the kind of thing where, in order to really trust the companies to deploy advanced AI products — particularly if the whole future of humanity might be on the line — we want to see a whole lot of unalloyed successes. We want to see that it went out without a glitch for this version and then for version 2 and then for version 3 and then for version 4 — and now the real one, where it’s so powerful that maybe it could cause some real trouble. As opposed to, “Well, we put out the first version, and it was a total disaster, and it threatened to kill people” —

Rob Wiblin: “And then we just released the next one.”

Toby Ord: “Now we want to put out a new version.” So yeah, that really concerned me. And then a final thing is that I posted about this on Twitter, and collated some of these examples together, and when I did so, I realised just before doing so that now it’s going to know that I do this. So if people ask it about me or whatever, I’m going to be on its list of people who are kind of public enemy number one. And then as soon as I realised, even that there is this chilling effect, but I was like, well, I’m not going to be cowed by this system, so now I have to tweet this stuff. But maybe some people, quite reasonably, would think, “Actually, no, I probably shouldn’t. Because maybe my next employer will search for me, and they’ll be using the Microsoft system, and it will search with Bing and it will start saying bad stuff about me because I wrote this.”

So I think that stuff, that’s a wild situation. So you really missed out.

Rob Wiblin: I’m glad I wasn’t on Twitter that month. I suppose you can understand Microsoft’s attitude that “Well, it’s not really going to come after some journalists. It’s not capable; we haven’t yet plugged it into the things that would be necessary for it to start writing harassing emails to someone’s employer and trying to get them fired.” But it will have those capabilities before too long. And it’s just that the lack of seriousness with which people are taking this enterprise is the thing that’s concerning. Maybe in fact it’s not intrinsically going to cause that much harm this way. But the fact that you don’t seem to care at all means that I’m really worried about where you’re going to be in two years’ time or three years’ time.

Toby Ord: Yeah, that’s what I mean that they don’t get it. Whereas the key AGI labs of DeepMind, OpenAI, and Anthropic, they do get it. They may be racing more than we’d like due to unfortunate incentives. It may be that they could even do better on that than they are doing. But they get the idea that what they’re producing could have this huge level of power that creates a terrifying responsibility.

Rob Wiblin: Yeah. Are there any categories of work on AI risk that listeners might be able to contribute to that you think might be currently underrated? Or I guess overrated as well?

Toby Ord: It’s tricky. I hope that there’ll be many more years to contribute on this, although I’m not sure what one’s work should be aimed towards, if you think about what timelines — you know, if it will take you five years to skill up on this stuff or something like that. I do think that governance is a key aspect, and that it’s been underinvested in by the community who care about AI risk. And I think that was partly because it was felt that it wasn’t very tractable.

Rob Wiblin: But it seems like that was a misjudgement actually. That was a misunderstanding.

Toby Ord: Yeah, I think it was a misjudgement. I also think you could think of governance from either a national or international level. And those are both much more in the Overton window than they used to be.

But also, governance of the labs themselves I think is a key thing: How are they going to be able to use this? Can we create systems that could constrain them and help avoid some of these capitalistic tendencies pushing against perhaps their more altruistic judgements? And if we can create such governance mechanisms, then we may well be able to convince them or force them into accepting them. There’s actually a lot of people who are really quite idealistic and earnest about these things at those labs, but it’s good if there’s other people outside who are helping them be their best selves.

Rob Wiblin: OK, well, I think we’ll return to AI probably in some future interview. There’s a lot more details that we could potentially talk about.

Problems in infinite ethics [02:50:06]

Rob Wiblin: Before we finish, though, I wanted to ask you a few quick questions about problems in infinite ethics. We recently had two guests come on the show and say pretty forcefully that they think the possibility that the universe is infinite in size or in temporal length, or the possibility that we might have infinitely large impacts ourselves, that that’s a pretty fatal blow to traditional formulations of impartial concern for the wellbeing of all beings. So that would include utilitarianism and probably all flavours of consequentialism, basically. That was Joe Carlsmith and then Holden Karnofsky emphasised that he agreed with what Joe had been saying.

It seems like infinities create problems for anyone who’s trying to be impartial about ethics, regardless of whether they’re concerned about wellbeing or something else. Because if you try to treat all similar cases within a category as being equally important, rather than weighting some things as more important because they’re nearer in some point in time or in space or in similarity to something else, then you just end up with these nonconvergent amounts: you just end up with infinities in the decision procedure, or in the math, and it’s not clear what to do.

Given your original background in mathematics, you might be able to shed some light on this. We won’t rehash the problem in detail, so if people want to fully understand this, they might want to go back and listen to the relevant section of the interview with Joe Carlsmith. In short, do you agree with Joe’s assessment that infinities present a really big problem here?

Toby Ord: Yeah, I think Joe makes a good case, and in his essay on this, he goes through so many different issues to do with ways that infinities could affect things. And I don’t have things to say about all of them. There are definitely some where I don’t have much of an answer.

But I have been doing a bit of thinking about one of them, and that’s partly because it comes up in economics as well. So in economics, in decision theory, and in moral philosophy, there are cases where we want to assess things. And these things are made of infinitely many different parts, so those parts could be different times. So we’re trying to assess the value of something over all time to come. Or over all space, if the universe is infinite in all directions, as many cosmologists think it is. Or perhaps infinitely many different possibilities: so maybe you flip a coin until it comes up heads; there’s infinitely many different amounts of flips it could take before it comes up heads.

And in those cases, you can have situations where there are infinitely many parts, all of which are valuable. And then, as you say, if there were only finitely many, you could just take the sum of them, or some other measure like that, and you could assess something in terms of its parts.

But when there are infinitely many, sometimes that still works: maybe the first part is worth one, and then a half, and then a quarter, and so on. But in many cases, it’s more like one plus one plus one, and then you get a divergent sum, it’s technically called. And our normal theories don’t have good ways of dealing with that. We could say, that’s infinity, it’s infinitely good. But then, under the standard way in mathematics of dealing with these divergent sums, two plus two plus two… is also infinity. And then just from those numbers alone, we can’t say which was better.
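The contrast described here can be checked numerically. The following is a minimal sketch rather than anything from the interview: `partial_sums` and `halving` are hypothetical helper names, and the three series are the examples mentioned above (values that halve each time, a constant 1, and a constant 2).

```python
from itertools import accumulate, islice, repeat


def partial_sums(terms, n):
    """Return the first n running totals of an iterable of term values."""
    return list(islice(accumulate(terms), n))


def halving():
    """Yield 1, 1/2, 1/4, ... (the convergent example)."""
    value = 1.0
    while True:
        yield value
        value /= 2


n = 50
geometric = partial_sums(halving(), n)
ones = partial_sums(repeat(1), n)
twos = partial_sums(repeat(2), n)

# The halving series settles just below 2 however many terms are added.
print(geometric[-1])
# The constant series keep growing: after n terms they total n and 2n.
print(ones[-1], twos[-1])
```

Every partial sum of the second constant series is twice the corresponding partial sum of the first, yet the ordinary limit of both is simply "infinity", which is why the limit alone cannot rank them.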

So that’s one of these challenges. And I think that on that challenge, I’ve done some work which could be quite helpful. I’m in the process of trying to write it up at the moment. And what I’m thinking about is: Are there non-standard ways of giving valuations to those sums which could give different infinite answers for those? So the first one could be something like one times infinity, and the second one something like two times infinity. Or if every time interval was worth some non-integer number like pi, then it was pi and then pi units of wellbeing at all times, to pick a fanciful example, then it would be like pi times infinity.

I mention that because the standard systems of dealing with infinite numbers don’t have something like pi times infinity. They might have the difference between one times infinity and two times infinity, if you’re lucky, but they rarely have a full flexibility with these numbers. But there are some systems that do, especially the hyperreal and surreal numbers. And I think there’s actually a relatively mathematically straightforward way of using these hyperreal numbers in order to actually assign fine-grained infinite values to those infinite options, in which case we can just give everything a number, even if it’s infinite, and then just order them by those numbers. And those numbers would be values and so these things would have infinite value.
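One way to picture the kind of fine-grained valuation described here is the following sketch. It is an illustration under stated assumptions, not the construction from Toby's forthcoming write-up: $\omega$ is a hypothetical label for some fixed positive infinite hypernatural number, and the sums run over the first $\omega$ time intervals. The finite rule $\sum_{n=1}^{N} c = cN$ carries over to the hyperreals by the transfer principle, which gives:

```latex
\sum_{n=1}^{\omega} 1 = \omega, \qquad
\sum_{n=1}^{\omega} 2 = 2\omega, \qquad
\sum_{n=1}^{\omega} \pi = \pi\omega,
\qquad\text{with}\qquad
\omega < 2\omega < \pi\omega .
```

The particular values depend on which infinite $\omega$ is chosen, but the ordering of these three constant streams is the same for any positive choice.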

And I think that it really is surprisingly straightforward. I’m writing it up at the moment, and almost the entire thing is an explanation of what the hyperreal numbers are and how to use them. And then once that’s established, there’s just a few lines of what you do with them. And it doesn’t solve all the problems, but it does actually solve a bunch of these problems by letting you have valuations on these different things.

Rob Wiblin: Is there a way of quickly explaining what the hyperreals are, or is that going to elude us here?

Toby Ord: Well, why not? So the original idea was to have a system of numbers which include all the real numbers, and they also include at least one infinitesimal number — so a number that is greater than zero but smaller than every positive real number; you don’t normally have those — and then to also demand that every familiar first-order property of the real numbers is true for this system of numbers as well. So that means things like the idea that xy = yx — statements like that about all real numbers will also be true for these numbers.

There was a mathematician, Robinson, who developed this in the 1960s, and the reason he did so was he was actually trying to fulfil the dream of Leibniz. Leibniz was, along with Newton, one of the inventors of calculus. And the original approach to calculus involved, for an integral, adding up infinitely many infinitely thin rectangles underneath this curve. Nowadays we use a limit instead. But the original approach involved these infinitesimals. But it was found that they didn’t quite know how they behaved, and Leibniz had always hoped to come up with a consistent and rigorous formulation of them, but he never got there.

And Robinson, in the ’60s, found that actually you can get it to work, and instead of using limits like we do at high school these days, you can actually do it with infinitesimal slices and adding up infinitely many of them. And so that was a technique for taking an infinite sum of infinitesimal things and getting a finite answer. And so it’s not that surprising that with a tool like that, where you have both infinite and infinitesimal numbers and can take infinite sums, that it might have the resources in order to add up infinitely many things of finite value and get a particular infinite number.
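For readers who want to see the "infinitesimal slices" idea written down, here is the usual textbook statement from nonstandard analysis, given as a sketch. The symbols are assumptions introduced for illustration and do not appear in the interview: $f$ is a continuous function on $[a, b]$ (extended to the hyperreals), $N$ is a positive infinite hypernatural number, $dx$ is the resulting infinitesimal slice width, and $\operatorname{st}$ is the standard-part function that rounds a finite hyperreal to the nearest real number.

```latex
dx = \frac{b - a}{N}, \qquad
\int_a^b f(x)\,dx \;=\; \operatorname{st}\!\left( \sum_{i=0}^{N-1} f(a + i\,dx)\,dx \right).
```

The sum has infinitely many terms, each infinitesimal, and its standard part is the ordinary finite value of the integral.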

Rob Wiblin: So, hypothetically, if I knew someone who found the infinite ethics section of the Joe interview a little bit of a downer, is there anything that I could point that other person towards that might give a bit of hope, or explain the hyperreals and how they might at least solve some of these issues?

Toby Ord: I’m not sure at the moment, but I can say that as I’ve been looking at them, they seem to do very well at this particular kind of question — which is when something has infinitely many parts, each of which has finitely much value. And the hope is that you can then do things like put a value on things like the St. Petersburg gamble or Pascal’s wager and a bunch of these other things — which were previously thought to be paradoxical infinite issues — and get somewhat sensible answers on them. The answers are not going to please everyone. People have some impossibility theorems showing that you can’t have everything that you would have liked from the finite cases, but I think that we get a pretty good system.
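To see why the St. Petersburg gamble counts as one of these divergent cases, here is a minimal sketch. It assumes the textbook version of the gamble, which is not spelled out in the interview: a fair coin is flipped until it lands heads, and if the first heads comes on flip k (probability 1/2^k) the payoff is 2^k. The function name `truncated_expected_value` is hypothetical.

```python
from fractions import Fraction


def truncated_expected_value(max_flips):
    """Expected payoff if the game is cut off after max_flips flips.

    Assumes the textbook payoffs: first heads on flip k (probability 1/2**k)
    pays 2**k. These payoffs are an assumption, not stated in the interview.
    """
    return sum(Fraction(1, 2**k) * 2**k for k in range(1, max_flips + 1))


# Each possible outcome contributes exactly 1 to the expectation,
# so the truncated expected value equals the number of flips allowed.
values = {n: truncated_expected_value(n) for n in (1, 10, 100)}
for n, value in values.items():
    print(n, value)
```

Because every term contributes the same amount, the expectation is a "one plus one plus one" sum of the kind discussed earlier, and it grows without bound as the cutoff is raised.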

One of the key reasons I was looking at this was based on the way it comes up in economics, which is where they consider what they call “intertemporal equity” or “intergenerational equity,” and they’re thinking about a whole succession of generations going on indefinitely into the future. And then how can you consider different ways of apportioning benefits to them? What they often conclude is you need to discount the future: you need to intrinsically say that future lives, the further out they are in the future, matter less, because otherwise these infinities come up and you can’t order these things.

So it’s not so much that I’m obsessed with thinking about infinite scenarios or something like that. It’s more that this is something that’s driving people to care less about the future. One of the crazy things with discounting is that, say, if you have two children and one is born a year later than the other, every age in their life will happen a year later and will be worth less, according to discounting. So then the whole life of your younger child matters less than the older child and stuff like that. I mean, it’s kind of crazy stuff that comes out of this.
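The two-children example can be worked through with a few lines of arithmetic. This is a sketch under assumptions that are not in the interview: a hypothetical 3% annual discount rate, an 80-year lifespan, and one unit of value per year lived. The function name `discounted_life_value` is also hypothetical.

```python
def discounted_life_value(birth_year, lifespan=80, rate=0.03):
    """Total discounted value of a life worth 1 unit per year lived.

    The rate and lifespan are illustrative assumptions only.
    """
    return sum(1 / (1 + rate) ** (birth_year + age) for age in range(lifespan))


older = discounted_life_value(birth_year=0)
younger = discounted_life_value(birth_year=1)

# Shifting every year of a life one year later multiplies its
# discounted value by 1 / (1 + rate), whatever the lifespan.
ratio = younger / older
print(ratio)
```

Under pure time discounting the younger child's entire life is scaled down by the same factor of 1 / (1 + rate), which is the implication being criticised here.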

And so I’m thinking about this work as an attempt to help put the final nail in the coffin of this intrinsic discounting, to show that you can avoid the infinities using other techniques.

Rob Wiblin: Well, I’ll look forward to that when it comes out.

How Toby released some of the highest quality photos of the Earth [02:59:21]

Rob Wiblin: You’ve got to go, but a truly final question is: How did you end up touching up and releasing some of the highest quality photos of the Earth from extremely far away? How is that possible?

Toby Ord: Thanks. Yeah, that was an amazing project. Actually, I’d been looking at some beautiful pictures of Saturn by the Cassini spacecraft. Amazing. Just incredible, awe-inspiring photographs. And I thought, this is great. And just as I’d finished my collection of them and we had a slideshow, I thought, I’ve got to go and find a whole lot of the best pictures of the Earth. The equivalent, right? Like fill a folder with amazing pictures of the Earth.

And the pictures I found were nowhere near as good. Often much lower resolution, but also often JPEG-y with compression artefacts or burnt-out highlights where you couldn’t see any details in the bright areas. All kinds of problems. The colours were off. And I thought, this is crazy. And the more I looked into it, I got a bit obsessed in my evenings downloading these pictures of the Earth from space.

I eventually had a pretty good idea of all of the photographs that have been taken of the Earth from space, and it turns out that there aren’t that many spacecraft that have taken good photos. Very few, actually.

If you think about a portrait of a human, the best distance to take a photo of someone is from a couple of metres away. Maybe one metre away would be OK, but any closer than that, they look distorted. And if you go much farther, then you won’t get a good photo; they’ll be too small in the shot. But the equivalent is partway from the Earth to the moon. Low Earth orbit, where the International Space Station is, is too close in: it’s the equivalent to being about a centimetre away from someone’s face. And the moon is a bit too far out, although you can get an OK photograph.
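The portrait analogy can be checked with rough numbers. This is a back-of-the-envelope sketch: the Earth's mean diameter, the approximate altitude of the International Space Station, and the average Earth-Moon distance are standard approximate figures, while the 0.2 m face size and the helper names are assumptions made for illustration.

```python
EARTH_DIAMETER_KM = 12_742   # mean diameter of the Earth
ISS_ALTITUDE_KM = 400        # approximate altitude of the ISS
MOON_DISTANCE_KM = 384_400   # average Earth-Moon distance
FACE_HEIGHT_M = 0.2          # assumed size of a human face


def portrait_equivalent_m(distance_km):
    """Camera-to-face distance with the same distance-to-size ratio."""
    return distance_km / EARTH_DIAMETER_KM * FACE_HEIGHT_M


def earth_equivalent_km(portrait_distance_m):
    """Distance from Earth matching a given portrait distance."""
    return portrait_distance_m / FACE_HEIGHT_M * EARTH_DIAMETER_KM


iss_cm = portrait_equivalent_m(ISS_ALTITUDE_KM) * 100
moon_m = portrait_equivalent_m(MOON_DISTANCE_KM)
two_metres_km = earth_equivalent_km(2.0)

print(round(iss_cm, 1), "cm")
print(round(moon_m, 1), "m")
print(round(two_metres_km), "km")
```

With these assumptions, low Earth orbit comes out at under a centimetre from the face, the Moon at around six metres, and a two-metre portrait distance at roughly a third of the way to the Moon, which is consistent with the comparison made above.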

And so it turned out that it was mainly the Apollo programme, where they sent humans with extremely good cameras, with these Hasselblads, up into space, and they trained them in photography. Their photos are just way better than anything else that’s been done, and it’s just this very short period, a small number of years. And I ended up going through all — more than 15,000 — photographs from the Apollo programme and finding the best ones of the Earth from space.

And then I found that there were these archives where people had scanned the negatives, and even then some of the scans were messed up. Some of them were compressed too badly, some of them had blown-out highlights, some of them were out of focus. And for every one of my favourite images, I went and found the very best version that’s been scanned.

And then I found that, surprisingly, using Aperture, a program for fixing up photographs, I could actually restore them better than had been done before. I was very shocked that all of a sudden my photograph of the blue marble was as good or a little bit better than the one on Wikipedia or the NASA website. And for other photographs that were less well known, I could do much better than had been done before.

And I eventually went through and put a lot of hours into creating this really nice collection, and made a website for them called Earth Restored, which you can easily find, where you can just go and browse through them all. I went a little bit overboard, and went through the mission transcripts and I managed to find out what times and days they were all taken and then get relevant parts of the mission transcripts that they were saying while they were taking them and things like this, and so I have some kind of commentary on all of them.

And some of these photographs have never really been seen before, because they were lost in the archives. One of them says “blank,” but it’s actually a photo of the Earth, but its archive was listed as blank. Another one says “unusable.” And I was inspired by them, and so I thought I’d share them with everyone. There’s full resolution versions of them that you can just download. And then I ended up actually getting one of them used on the US cover of The Precipice.

Rob Wiblin: They’re absolutely gorgeous photos. People can find it at tobyord.com/earth. It’s a great example of something where you would come in thinking that surely this has been done, like hasn’t NASA produced these amazing photos? And yet, amazingly, no one has. And it fell to you.

Toby Ord: It took a long time before I was prepared to admit that actually, there really wasn’t anything better than this. But eventually some of them have gone up on Astronomy Picture of the Day and so forth, and it really hadn’t been done. It was a real gap. I should say that there are some versions that are extremely good that have appeared in fine art books for some of these, which are roughly as good, sometimes maybe a shade better than my versions for a couple of them, but they’ve been put on the printed page and you cannot get access. You cannot buy or gain access to the digital versions of those.

But it was an honour, really, to be able to work with such great artwork, the originals, and then to just bring out what the astronauts had captured and do justice to these truly amazing photographs and then share them with people. Just a delight to be part of that process.

Rob Wiblin: My guest today has been Toby Ord. Thanks so much for coming back on The 80,000 Hours Podcast, Toby.

Toby Ord: It was a joy.

Rob’s outro [03:04:47]

Rob Wiblin: If you enjoyed that interview, you might want to check out Toby’s previous appearances on the show:

#72 – Toby Ord on the precipice and humanity’s potential futures

#6 – Toby Ord on why the long-term future of humanity matters more than anything else, and what we should do about it

And back in May we wrote a summary of the many changes we’d made to our website and advice in light of the collapse of FTX. The articles that saw the biggest changes were:

We’ll link to that summary of the changes or you could find it by googling “How 80,000 Hours has changed some of our advice after the collapse of FTX”.

And if you’d like to learn more about artificial intelligence and existential risk you could do worse than sign up to our compilation titled The 80,000 Hours Podcast on Artificial Intelligence.

We pulled together 11 episodes of the show on the topic, which we think are among the strongest, and put them into a sensible order to go over them. If you really manage to get through all 11, I think you’d come away knowing a tonne about the issues and be in a position to form your own judgements about what we should be doing to address them.

All right, The 80,000 Hours Podcast is produced and edited by Keiran Harris.

The audio engineering team is led by Ben Cordell, with mastering and technical editing for this episode by Simon Monsour.

Full transcripts and an extensive collection of links to learn more are available on our site, and put together by Katy Moore.

Thanks for joining, talk to you again soon.

Learn more

How 80,000 Hours has changed some of our advice after the collapse of FTX

What is social impact? A definition

Is it ever OK to take a harmful job in order to do more good? An in-depth analysis

Expected value: how can we make a difference when we’re uncertain what’s true?

Related episodes

March 7, 2020

#72 – Toby Ord on the precipice and humanity’s potential futures

September 6, 2017

#6 – Toby Ord on why the long-term future of humanity matters more than anything else, and what we should do about it

September 21, 2020

Benjamin Todd on the core of effective altruism and how to argue for it

April 12, 2023

#149 – Tim LeBon on how altruistic perfectionism is self-defeating

August 19, 2021

#109 – Holden Karnofsky on the most important century

September 1, 2020

Benjamin Todd on varieties of longtermism and things 80,000 Hours might be getting wrong

May 12, 2023

#151 – Ajeya Cotra on accidentally teaching AI models to deceive us

May 23, 2022

#130 – Will MacAskill on balancing frugality with ambition, whether you need longtermism, and mental health under pressure
