Rob’s intro [00:00:00]
Robert Wiblin: Hi listeners, this is the 80,000 Hours Podcast, where we have unusually in-depth conversations about the world’s most pressing problems and what you can do to solve them. I’m Rob Wiblin, Head of Research at 80,000 Hours.
Today’s guest Owen Cotton-Barratt has been hugely influential in the effective altruism community, and my colleague Arden Koehler – who hosts this episode by herself – was excited to go deeper on some of his core research topics.
They talk about the value of preventing small disasters from becoming large disasters, what most people should do if longtermism is true, and research careers for people interested in coming up with projects to benefit humanity.
This episode is mostly directed towards potential and actual longtermist researchers, as well as people already involved with the effective altruism community.
If you enjoy getting into the weeds on research that’s relevant for the future of humanity, this one might be for you.
But – if you’re new to these topics – it might be better to start elsewhere.
For an introduction to longtermism, I’d recommend episode #6 – Dr Toby Ord on why the long-term future matters more than anything else & what to do about it.
And for an introduction to effective altruism, I’d recommend episode #21 – Holden Karnofsky on times philanthropy transformed the world & Open Phil’s plan to do the same.
We’ve had a lot of episodes about the effective altruism community lately, which I know many listeners love. But if you’re someone who would like to hear more on other topics, stay tuned because we’ll have a bunch of those in the new year.
If you’re wondering why this was apparently recorded in person in the middle of a pandemic – that’s because it was actually recorded all the way back in February – before most of us were thinking about COVID-19 at all, which means this is actually the first episode Arden recorded on her own.
We don’t normally hold on to recordings for so long, but we put it on ice first because of COVID, and then so that we could put it out closer to the next recruiting round for Future of Humanity Institute’s Research Scholars Programme, which Owen is heavily involved with.
Owen will describe the Research Scholars Programme more, but in brief it’s a great way for potential researchers interested in the big picture questions affecting the long-term future of humanity to experiment with what it’s like to do that kind of research, and meet people already in that career track.
If that sounds like something you’d value, definitely click through the link in the show notes for more information about the programme. Applications are now set to open in the Spring of 2021 – but Owen thinks it’s helpful for people to know about the RSP and think about it in advance.
Alright, without further ado, I bring you Arden and Owen.
The interview begins [00:02:22]
Arden Koehler: Today, I’m speaking with Owen Cotton-Barratt. Owen is a researcher at Oxford University’s Future of Humanity Institute, focusing on methods for making decisions under great uncertainty about the long-term effects of our actions. Owen is a mathematician, and also Director of the Research Scholars Program, a two year training program at the Future of Humanity Institute for early-career researchers focused on topics that are important for ensuring humanity’s long-term flourishing. He’s also a strategic advisor for the Centre for Effective Altruism, and has been doing work relevant to making decisions under uncertainty, especially related to existential risk, since 2013. He was first interviewed on the show back in episode 28 in 2018. Welcome back to the podcast, Owen.
Owen Cotton-Barratt: Thanks, Arden. Delighted to be here chatting with you and back on the podcast.
Arden Koehler: I hope to talk about strategies for managing risks of human extinction, the question of what everyone can do to help ensure that the long run future goes well, and research careers in mathematics, existential risk, and long-term research. But first, what are you working on right now and why do you think it’s important?
Owen Cotton-Barratt: The big project that I’ve been working on for the last couple of years is the Research Scholars Program. And when I got into the research field around existential risk and around understanding what sensible things we can say about where humanity is headed on time scales of decades, centuries and longer, I thought, “Yeah, this stuff’s pretty important”. I would love it if there were more people who were doing high quality work on this. It’s a bit funny because I didn’t see there being natural routes into that for a lot of people. When I looked around the Future of Humanity Institute, I’d see that most people there had done a PhD in something kind of unrelated. And then afterwards, they pivoted and they said, “Well, now I’m going to move into working on these topics that seem sort of important”.
Owen Cotton-Barratt: And that didn’t feel like the ideal pattern. It kind of seems like if we think this is important, it’d be nice to have ways for people to dive in and say, “Okay, great. What do I need to do to work on this?” I also think that we don’t always know actually, exactly what is important, and that often people end up with more good taste for choosing important things to work on if they’ve had a space where they can step way back and say, “Okay. So what’s really going on here? What is the game? What do I want to be focusing on?” And so the thing we’re trying to do with the Research Scholars Program is create that space for people. And so we are taking in people with a range of disciplinary backgrounds, and some of them have PhDs, and are looking to do something now which isn’t just continuing on the same path they were on.
Owen Cotton-Barratt: Some of them also don’t have PhDs, and maybe they’re thinking of going and getting a PhD in the future, and hoping that they can do something more important already as part of the work of their PhD. Or maybe they’re not even sure if a PhD is the right thing for them, but they think it will be useful to get a bit of space to explore, have other people around them who can provide a supportive environment for thinking about that, and who’ll just provide useful thoughts and pushback and say, “No, I think you’re wrong about this”. And so I’m hoping this will be useful for individuals going through it. I’m also hoping that in the process of running this, I will learn more about how one creates the environments that are useful for people and what’s needed, and think about which elements we could actually bring to many more people.
Extinction risk classification and reduction strategies [00:06:02]
Arden Koehler: Okay. Yeah. Let’s talk more about that when we get to thinking about research careers at the end of the episode. Okay. So you recently wrote a paper with Max Daniel and Anders Sandberg called “Defense in Depth Against Human Extinction: Prevention, Response, Resilience, and Why They All Matter”. And we’ll put a link to that in the show notes. So in the paper, you suggest that it can be really useful to classify risks of human extinction by how they originate, how they scale up, so how they go from a small problem to a really big problem, and how they finish us off, how they can go from a sort of catastrophe to something that actually causes human extinction. Do you want to just walk through the basic ideas of the paper, and what sort of insights you feel like it allows us to have?
Owen Cotton-Barratt: Yeah. Maybe I want to talk first about where the motivation to look into this and write this paper was coming from.
Arden Koehler: Yes, that’s a great idea to start with that.
Owen Cotton-Barratt: And so I think that human extinction would be a pretty big deal, and it seems to me like it would be pretty bad. And I also have various colleagues who think that and have been doing work on different parts of it and trying to understand, “Okay, which things that we can perceive pose the greatest risks of human extinction?”
Arden Koehler: So we actually… By the time this episode comes out, hopefully our episode with Toby Ord will have come out on his book, “The Precipice”. So we will have talked about some of these issues. And of course, we’ve talked about human extinction a couple of times on the podcast too. But yeah, Toby is also at the Future of Humanity Institute, so he’s probably one of the colleagues you’re referring to.
Owen Cotton-Barratt: He is, yeah. And I think it is great for people to be doing work looking into some of these specific risks. I have a feeling, and I don’t know how much of this has come out of my training as a mathematician, I’m sure that some of it has, that if there’s an important topic, it’s really good to understand it from as many different angles as you can because you never know when a new angle will shed some light on it. And so I think it is really good to look at the things that we can identify as risks and say, “What’s going on with each of these, and where is it going to be getting us to?”
Arden Koehler: Were these individual risks things like the risk of a global pandemic?
Owen Cotton-Barratt: Yeah. Exactly. Risk of a supervolcano erupting and throwing up enough ash that the sun is blocked out for several years, and we can’t grow enough food. And the approach that we took with this paper was to say, “Okay. Let’s just try and take a few steps back,” and say, “Assume we don’t really know anything about the empirical facts about which things are the biggest risks, and we just want to understand… Well, how could it be that anything could cause human extinction?” And it kind of started with a puzzle, of society tends to amplify things which are good for people and push back and try and stop things which are bad for people. If somebody invents a great new widget, everybody’s like, “Yeah. I want one”. And then maybe not immediately, everybody gets one. But somebody sets up a company, and then they scale up the production and then a few years later, everyone is walking around with these in their back pocket.
Owen Cotton-Barratt: In the case of something which causes real harm to somebody, we try and say, “Let’s not have that happen”. Maybe we legislate to say, “This is a safety issue, really we shouldn’t allow–
Arden Koehler: Lead paint or something.
Owen Cotton-Barratt: Lead paint, right, right. So there’s all of these social structures which push towards recognizing when things are good for people and doing more of them, and recognizing when things are bad for people, and doing less of them. And if we’re looking at hypotheticals where humanity ends up extinct, obviously, this has gone off the rails somewhere. Humanity going extinct is very bad for seven and a half billion individuals. And so there’s a puzzle. How could you get to that end point given what’s going on with the social pressures, which push us to not scale up the kinds of things which are bad for people?
Arden Koehler: Yeah. I think Rob talked about this on the podcast with Will MacAskill: it’s a really high bar to cause human extinction when everyone in the world is trying to not go extinct.
Owen Cotton-Barratt: Right.
Arden Koehler: And not die themselves, and also would probably prefer that other people do not either.
Owen Cotton-Barratt: Right. And so it’s vanishingly unlikely that, that bar is going to be crossed just by accident by, ‘We have seven and a half billion fluke accidents’, and everybody falls over and bangs their head on the same day and everybody dies.
Arden Koehler: Okay, yeah.
Owen Cotton-Barratt: And if there is going to be something which does this, there’ll presumably be something more structural and systematic, which meant that despite this pressure in society towards not scaling up things which are bad for people, it still happens. And the basis of the paper that I wrote with Max and Anders is to just ask that question a bit more and dig into… Okay, so why did the bad thing get going at all? What could the origin of something bad happening be? And then ask: how can it have got to being a large enough scale that it could be affecting billions of people? And then also ask: how could it actually get to everybody? Because maybe the dynamics have a qualitative change between scaling up from a minority to a majority of the population, and moving from killing a very large number of people to killing off the whole of the species.
Arden Koehler: So you think that there might be a qualitative change there. I mean, intuitively or naively for me, it’s like, well, in both cases, it’s just like more and more people being affected. So what’s the intuition there that there could be a qualitative shift in the dynamics?
Owen Cotton-Barratt: Yeah. So if you think of pandemics, for example, if you have a disease which has a 10% case mortality rate, so 10% of people who get the disease die. Then maybe it can spread to a lot of people, and 10% of all of the people it spreads to die. So in the scale up phase, it’s easy to see how by infecting more and more people, it’s going to lead to more and more deaths. And it could get up to a very large number of deaths. But even if it infected everybody in the world, it’s not going to kill more than 10% of people. There could be other features. Maybe you could have something bad which happens to everybody who lives in cities. I think more than half of the world’s population live in cities.
Owen Cotton-Barratt: And so if you had something which could kill everybody who lived in cities, that would kill a lot of people. But humans are just spread out, and they’re in all sorts of different circumstances. And some of those might be kind of hard to get to. And if you’re asking for what’s something that can kill everybody, then you need it to get into all of those little niches.
Arden Koehler: So one response to this would just be to say, “Okay. Well, you’ve just shown me that something only affects people in cities, and a pandemic that has a mortality rate under 100% can’t kill everybody, or can’t be an extinction risk really; the worst it can do is a lot of damage”. So why’s that not the sensible response?
Owen Cotton-Barratt: Well, I mean, I think that there is something to that response. I think that if you are trying to understand the dynamics of things that could kill everybody, it’s actually useful to notice something like, “Well, things which can only kill people in cities aren’t going to kill everybody”. It doesn’t follow that, therefore, we shouldn’t worry about things that can only kill people in cities. Clearly, there’s a lot of people in cities, and just on the ‘lots of everyday reasons’ we should care a lot about preventing anything which is going to do that.
Arden Koehler: Of course, because it would cause much misery and death and horribleness.
Owen Cotton-Barratt: Right, right. Yeah. And the idea of a pandemic, which kills 10% of the world’s population, I think most people would correctly find that horrifying.
Arden Koehler: Yeah. That’s totally right. I guess my question was, if we’re trying to classify extinction risks in particular, and we discover that a disease has a less than 100% fatality rate, a naïve thing to do would say, “Okay. Well, that’s not an extinction risk”. But it seems like maybe the framework suggests that, no, it can have some other mechanism for becoming something that can actually cause extinction.
Owen Cotton-Barratt: Yeah. I think that’s a question about terminology, actually, where it’s reasonable to call that an extinction risk via other mechanisms, where maybe you think this could lead to the breakdown of a bunch of the structures we have in society, and that might make it easier for something to come along and cause extinction. And if that something else comes along pretty quickly and it just turns out there’s lots of those ‘something elses’, and they’re likely to happen after we have a breakdown in society, then you might want to say that the first thing is meaningfully an extinction risk.
Owen Cotton-Barratt: And I don’t think that we really understand what kind of thing would be needed to knock our civilization off course in a major way, or what coming back from something like that would look like. I also think that you might say, “Okay. No, well, in that case, the pandemic isn’t itself an extinction risk”. It’s some kind of risk of a global catastrophe. And maybe that catastrophe would be a risk factor for subsequent events. And so even if you’re only in the business of caring about human extinction, which I think would be a slightly weird position, you’d still want to care about any kind of catastrophe which could have large enough global impact that it could affect the trajectory of further risks.
Defense layers [00:16:37]
Arden Koehler: Okay. So one thing that was interesting to me about the paper was that one way of conceptualizing these different stages at which a disaster becomes something that can cause human extinction is that you might think there are these different sort of, I think you called them defense layers, where there’s one sort of layer where we try to keep bad things from happening at all. So this might be safety procedures in a lab, where we don’t let the pathogens get out, so there’s no disaster at all. And then there might be other defense mechanisms for society, where we try to keep small disasters from becoming big disasters. And then at least theoretically, we might have defense mechanisms for keeping big disasters from becoming causes of human extinction. So one reason this was interesting to me was that it seemed like we could sort of think about interventions for reducing the likelihood of human extinction at the level of these defense layers.
Arden Koehler: So we might say, “Well, we’ve noticed that there aren’t that many medium and large scale disasters. So our sort of preliminary defense layers must be pretty strong”, maybe because there are really good incentives for people to not… These sort of ordinary incentives for us to not get hurt, so maybe we should focus on some of these later defense layers, or these things that would keep us from going extinct, even if there was a really large disaster. So one example might be you could think of David Denkenberger’s alternative foods idea as one of these. It’s like, well, even if there was a really huge disaster, here’s a thing we could do that would keep that from turning into an extinction event.
Arden Koehler: And we don’t have as much evidence that those defense layers are strong, so it might suggest that we should go work on those. That was one thing that I thought could be a thing that the paper could sort of bring to light. But I’m not sure if you agree with that.
Owen Cotton-Barratt: Right. I think I basically do agree with that. I also can’t remember whether… This is something that we didn’t say explicitly in the paper, but like you were thinking in response to the paper. Yeah. I guess I kind of feel like this is the type of fruits that you expect from doing research like this, where you say, “Okay. Here’s just trying to give a clear conceptualization of what’s going on”. And then maybe it’ll make it easier for people to think about. And I–
Arden Koehler: Or to think about from these sort of different angles, like you said.
Owen Cotton-Barratt: Right. Right. And it isn’t the case that nobody’s been thinking about that last layer. As you say, David Denkenberger has been doing work on alternate foods because there were some reasonably robust arguments that there’s a whole class of catastrophes, where one of the mechanisms of things being really bad is that a lot of people starve to death because we’re not able to grow crops.
Arden Koehler: So these are things like nuclear winter and other things that block out the sun.
Owen Cotton-Barratt: That’s right.
Arden Koehler: And sort of affect the climate in a lot of ways, maybe climate change could even become like this under certain circumstances.
Owen Cotton-Barratt: Yeah, that’s exactly right. And so he’s saying, “Well, can we do research into other ways that we could get human consumable calories in one of these extreme situations?” And I like the point that you were making that there’s already been lots of incentives pushing people to avoid these medium scale… I mean, I say medium scale, like in this conceptualization, medium scale includes things which are actually very big by our normal understanding of what disasters are. It’s partly also because the world has a reasonably good track record of avoiding disasters which are really big in the sense of maybe hundreds of millions of people dying. And this is a good feature about our world. I kind of want to say, “Okay. Pat humanity on the back on that one”. Or I still think we can do better at avoiding more of these pretty big disasters, and we’d like to keep on pushing them down.
Owen Cotton-Barratt: But we’re not seeing an incidence rate of things getting very big, and what would be needed to try and curtail their growth at that point? And so there’s less prior experience and there’s less established incentives pushing people towards working on that. And so that might mean that there will be lower hanging fruit focusing on that level.
Arden Koehler: Yeah. Although, I mean, in some sense, it’s… So even though this is going against the thing I just said, or I guess the thing I said was that there’s less evidence that we would be good at defending against a medium or large scale catastrophe, you’d think that maybe the incentives should be extremely strong because, well, it’s extremely bad, and it’s worse than the small scale things, so the incentives to prevent it should be bigger.
Owen Cotton-Barratt: So I think that there’s a lot of different things that actually feed into our notion of incentives. And if you’re thinking of incentives of just, “How bad would this be” versus, “What are the chances?”, then I think there are really pretty significant incentives there, which are maybe distributed, because any one person doing work on this isn’t going to capture most of the benefits. Work like this will have a lot of positive externalities, which mean that it may be undersupplied by rational actors in the marketplace.
Arden Koehler: And that’ll be exactly because we’re talking about events that affect lots and lots of people at once.
Owen Cotton-Barratt: Right. And so, I mean, particularly for interventions which would help lots and lots of people at once. If the intervention is “Stockpile food in your own house”, then that wouldn’t necessarily be undersupplied according to this just pure incentive mechanism. Another thing though, which I think feeds into incentives, is just attention. I think that people’s attention is a limited resource, and people will spend attention and take steps towards doing things on events which are salient to them. And if we have a lot of small scale disasters, then it’s going to be salient to people, “Oh, this is a thing that we should be striving to make not happen”. If we never have a big disaster, then there aren’t the occurrences of the big disaster, which make it salient to people.
Owen Cotton-Barratt: And there can still be, maybe we have disaster movies, or science fiction, or theoretical work analyzing it, and so I do think that it gets onto people’s radar, somehow. But it isn’t that most people, most of the time, are going to be attending to it. And so I don’t know how much to trust the argument about incentives.
Arden Koehler: Okay. So that’s one reason why it might be unreasonable to expect our ways of containing very large disasters to be relatively well developed: maybe those disasters are just not salient enough to us.
Preventing small disasters from becoming large disasters [00:23:31]
Arden Koehler: So one of the interesting conclusions that you drew, or I guess maybe it’s more accurate to say that the framework sort of makes more obvious, is that if you were sort of only interested in preventing some sort of chain of events from becoming an extinction event, then it’s going to be equally effective to cut in half the probability of that extinction risk getting sort of through each stage. So if you can cut in half the probability of it going from a small disaster to a large or medium sized disaster, that’s as good as cutting in half the probability of it going from medium, to large disaster, to causing human extinction. Do you want to just walk through that argument and say why that is?
Owen Cotton-Barratt: Yeah. I mean, the argument in some sense is extremely simple. The probability that we end up getting to extinction is just equal to the probability that we have a small disaster in the first place, multiplied by the conditional probability that it turns into a large disaster, conditional on having a small disaster. So multiply those two together and then multiply also by the conditional probability that it causes extinction, conditional on being a large disaster. And so if you halve any one of those probabilities, you’ll halve the thing that you get when you multiply them all together. And the reason that this may be useful at all is it just gives a rule of thumb for thinking about… Okay, how should we balance our resources between what’s going on at the different layers?
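The multiplication described here can be written out as a short worked equation. The symbols are labels introduced for this sketch, not notation taken from the paper: S is "a small disaster occurs", L is "it becomes a large disaster", and E is "it causes extinction". The sketch assumes extinction only happens by passing through each stage in turn.

```latex
\[
P(E) \;=\; P(S)\cdot P(L \mid S)\cdot P(E \mid L)
\]
% Halving any single factor, for example the scale-up step, halves the product:
\[
P(S)\cdot \tfrac{1}{2}\,P(L \mid S)\cdot P(E \mid L) \;=\; \tfrac{1}{2}\,P(E)
\]
```

The same holds whichever of the three factors is halved, which is why the stages are interchangeable for this purpose.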
Arden Koehler: So can you think of any examples where this suggests that we should allocate resources in sort of one way rather than another?
Owen Cotton-Barratt: I mean, I am not an expert on the empirics of these different risks, so I’m just going to talk about a stylized example. But one thing that I think can often come up is the idea of diminishing returns from looking into what you can do about something. And so say we were thinking about a risk of terrorists releasing an engineered pandemic. And we noticed that we had done lots of work on trying to reduce terrorist access to the kind of biological agents which could be used to do this, but we hadn’t done any technical work on whether there were technological solutions that could be used to slow down an engineered pandemic, which are maybe different from what happens with just natural pandemics. Then that might be a clue that there could be low hanging fruit in working on that stage of things, and working on maybe there’s something that we can do to reduce the risk conditional on the first thing.
Arden Koehler: So in particular, it might be easier to cut in half the probability of something going from a terrorist attack, or where terrorists release a pandemic agent, to it becoming a large scale catastrophe than it is to cut in half the probability that they even get their hands on the pathogen. And that’s suggested by the observation that there’s more work going into the second thing.
Owen Cotton-Barratt: Yeah. And I think that probably wouldn’t start off being true. And it’s just because hopefully the world is already putting a bunch of attention into making it be that nobody really wants to do this in the first place, and even anyone that does want to, doesn’t have good opportunity. Another example: people sometimes worry about extinction or other existential catastrophe caused by AI and the alignment of AI systems going wrong. A stylized cartoon example there, and I don’t think this is the most realistic version of the risk, is if there is an AI system which manages to get a large proportion of power in the world, and then whatever the objectives of that single system are will determine a lot of what happens in the future, and then maybe if it doesn’t want humans around, there won’t be humans around much longer.
Owen Cotton-Barratt: In that kind of case, I would think it’s probably very hard to do much about the final stage, of after this is already a large catastrophe, there’ll be an adversarial dynamic and there’ll be perhaps some system which is deliberately trying to cause human extinction. And that makes me think, “Okay, let’s not worry about that stage of things. If one is going to do anything about this, it should really happen on making sure that bad things don’t happen in the first place and making sure that if bad things do happen, they are contained and remain small”.
Arden Koehler: Yeah. So I guess one observation that comes from that is realizing that when you have this sort of three stage model and there’s a risk at each stage, conditional on the previous one that the bad event becomes… It gets to the next stage, that it’s not always the thing to do to focus on the biggest risk. So you might think the biggest risk in that chain is the one where conditional on it becoming a large scale catastrophe, humanity is wiped out. Maybe that’s not true, but let’s say it is, that it’s the largest percentage risk. You’re saying that doesn’t mean that we should focus on it because it might still be harder to reduce by the same proportional amount as the earlier risks.
Owen Cotton-Barratt: Right. Right. And in fact, my qualitative conclusions would often be that it’s often good to spread our focus around between the different stages because of this diminishing returns. But my exception to that would be that if the risk is really large at that stage, maybe it’s not worth investing at all. If something is a 10% risk of passing a stage, you might think, “Well, if we do a bunch of work, we might manage to move this down to 5%”. If something is 99.99%, it might be that you do a bunch of work, and you move it down to 99.98%, and that’s doubling the chance that things are okay if it gets to that stage. But it’s very far from halving the risk, and halving the risk is the thing which is comparable to halving one of those small risks. So that’s a reason to avoid the stages with the very large risks.
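The asymmetry in this answer can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function and variable names are introduced here, and the 10% to 5% and 99.99% to 99.98% figures are the stylized numbers from the conversation, not empirical estimates.

```python
from fractions import Fraction

def relative_risk_reduction(before, after):
    """Fraction of the original risk that the intervention removes."""
    return (before - after) / before

def survival_multiplier(before, after):
    """Factor by which the chance of things being okay at this stage grows."""
    return (1 - after) / (1 - before)

# Stylized figures from the conversation (exact fractions avoid rounding noise).
modest_before, modest_after = Fraction(10, 100), Fraction(5, 100)
extreme_before, extreme_after = Fraction(9999, 10000), Fraction(9998, 10000)

modest_cut = relative_risk_reduction(modest_before, modest_after)
extreme_cut = relative_risk_reduction(extreme_before, extreme_after)
extreme_survival = survival_multiplier(extreme_before, extreme_after)

print(f"10% -> 5% removes {float(modest_cut):.2%} of the risk")
print(f"99.99% -> 99.98% removes {float(extreme_cut):.4%} of the risk")
print(f"...while multiplying the chance of being okay by {extreme_survival}")
```

Moving from 10% to 5% removes half of that stage's risk, whereas moving from 99.99% to 99.98% doubles the chance of being okay but removes only about one part in ten thousand of the risk.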
Arden Koehler: It seems like you’re saying there might be some pattern where when a risk is really, really large, it might be really hard to cut it in half. But I don’t see immediately why that would be. You might think if it’s really big, then it means that we’re not doing anything about it, so if we just do something, we’ll cut it in half, maybe. You know what I mean?
Owen Cotton-Barratt: I do know what you mean. I definitely have a bunch of assumptions that were going into the thing I just said. And I don’t feel super confident about the particular example of how closely things would adhere to that. But I do think that there is some structure that I’m pointing to, where there’s a question of how to think about probabilities at an intuitive level. And if you’re thinking about what the effects of different probabilities are, then I think just regular arithmetic is often the right kind of tool to be using there. And people learn this in school, and it’s pretty good.
Owen Cotton-Barratt: If you’re thinking about generating mechanisms for systems which spit out probabilities like 10% of the time, “this does this”, versus 1% of the time, then I often think it’s more intuitive to think of probabilities as being stretched out near zero and one on the number line. There’s a technical operation you can use called ‘log odds’, where you take a probability, say it’s 60%, you convert that to an odds ratio, three to two, and then you take the logarithm of that. And the thing which this operation does, is it captures the intuitive sense of maybe moving from 1% to 2% is a bit like moving from 10% to 20%, which is also a bit like moving from 80% to 90%, and a bit like moving from 98% to 99%.
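The log odds operation mentioned here is easy to try out. The following is a minimal sketch, assuming the natural logarithm (the conversation does not specify a base); the helper name `log_odds` and the list of moves are introduced here for illustration.

```python
import math

def log_odds(p):
    """Convert a probability to the natural log of its odds, p / (1 - p)."""
    return math.log(p / (1 - p))

# The moves mentioned in the conversation, as (from, to) probabilities.
moves = [(0.01, 0.02), (0.10, 0.20), (0.80, 0.90), (0.98, 0.99)]
steps = [log_odds(b) - log_odds(a) for a, b in moves]

print(f"log odds of 60% (odds 3:2): {log_odds(0.6):.3f}")
for (a, b), step in zip(moves, steps):
    print(f"{a:.0%} -> {b:.0%}: shift of {step:.3f} in log odds")
```

On this scale the four moves come out at roughly 0.70 to 0.81 each, which is the sense in which they are "a bit like" one another even though the raw percentage-point changes differ by a factor of ten.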
Arden Koehler: Okay. Is this due to mathematical facts, or is this just a claim about how the world works? Once something is really like it’s, I don’t know, it’s going to have self momentum, then it’s hard to stop, or if something’s really unlikely, it’s hard to reduce their chance.
Owen Cotton-Barratt: I guess it’s somewhere in the middle. It’s definitely not about pure mathematical facts, but it is about broad patterns in things. So one reason that something might be 99.999% likely is because it’s overdetermined. It could be that there were five different 10 sided dice that you’re rolling, and the only way that it doesn’t happen is if you roll a one on each of them. And in that case, saying, “Well, we can get rid of one of the dice all together,” or maybe you say, “We’re going to do our safety measure increasing thing; we’ll change it from one of these dice, now instead of you’ve got to roll a one, you could have a one up to five”. You’d think that would be a pretty big deal. But the effect it would have on the probability was it would move it from 99.999% to 99.995%, and that’s still pretty big.
Arden Koehler: Sorry. If you had to get a one, or a two, or a three, or a four, or a five on each of–
Owen Cotton-Barratt: On one of the dice, and then the other dice were–
Arden Koehler: Oh, I see. Yeah. Okay. So if you’re only adjusting one dice.
Owen Cotton-Barratt: Yeah. But maybe if there were five different independent things, and if any one of them comes up bad, then you get the bad outcome. Then it’s overdetermined, and you would rather not get to that stage of things at all, whereas maybe one of the earlier stages, if the risk is only 2%, then it’s close to being overdetermined the other way. I mean, if the risk is really low, say the risk for one of the other things is 0.01%, then it might be like, “Well, you’re rolling five dice and it’s only if you roll a 10 on all of them that the bad thing happens”. And then if you… I’ve set my example up, so it’s hard to change… Okay.
Owen Cotton-Barratt: Let me say it’s 0.002%, and four of them are if you roll… You’ve got to roll a 10 on four of them, and a nine or a 10 on the fifth one to get the bad thing to happen. And there, if I go in and I change that die, where I had to roll a nine or a 10, and I say, “Now you just have to roll a 10”, then it’s halved the entire risk.
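The arithmetic in the two dice examples can be checked with a short Python sketch. The helper `all_dice_hit` is a hypothetical name, and the sketch assumes fair, independent 10-sided dice, as the examples do.

```python
from fractions import Fraction

def all_dice_hit(faces_needed, sides=10):
    """Probability that every independent die lands on one of its required faces."""
    p = Fraction(1)
    for n in faces_needed:
        p *= Fraction(n, sides)
    return p

# Overdetermined that the event happens: it is avoided only if all five dice show a one.
avoid_before = all_dice_hit([1, 1, 1, 1, 1])
# Safety measure: one die now avoids the event on anything from one up to five.
avoid_after = all_dice_hit([5, 1, 1, 1, 1])
print(float(1 - avoid_before), float(1 - avoid_after))  # 0.99999 0.99995

# Overdetermined that the event does not happen:
# four dice need a 10, and the fifth needs a nine or a 10.
risk_before = all_dice_hit([1, 1, 1, 1, 2])  # 0.002%
# Intervention: the fifth die now also needs a 10.
risk_after = all_dice_hit([1, 1, 1, 1, 1])
print(risk_after / risk_before)  # 1/2
```

In the first case a large change to one die moves the probability only from 99.999% to 99.995%; in the second, a small change to one die halves the risk.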
Arden Koehler: Okay. So if things are overdetermined, that suggests that it’s easier to have lower risks, at least on this really low side of the spectrum than on the really high side of the spectrum.
Owen Cotton-Barratt: Yeah. Right. So the idea is most of the risks we’re looking at, they’re definitely not happening all of the time. It’s not like humanity goes extinct twice a week. And so at least these risks must be pretty low somewhere, in the sense that on a week to week basis, it’s overdetermined that this is not going to wipe everyone out. And then something would have to be pretty unlucky to be getting us into the world of, “Oh, this week, we did all go extinct”. So if we can just make that have to be a bit more unlikely, it can significantly reduce the chances, whereas if there’s a stage where it’s actually already very likely that if we get there it advances, then it might be that even a heroic effort will only have a pretty small effect on the overall likelihood.
Arden Koehler: Yeah. Okay. So this is really interesting. So it sort of in a way pushes in the other direction from the sort of argument I made before about neglectedness. I don’t know if it overcomes it, but you might think, “Well, look. We have some good evidence that the sort of first defense layer is relatively good. So actually, maybe we can make the most progress by just strengthening that defense layer a ton until it’s just insanely hard to get even a medium sized catastrophe.”
Owen Cotton-Barratt: Right. Yeah. There is a dynamic like this. But it’s interesting to see which bits of the probability range it’s really biting on because where things are overdetermined towards being safe, it isn’t necessarily much easier to get progress at the stages which are really, really safe, versus the ones which are just kind of safe. If the intervention that you can do is you can go in and change one of your dice from “Oh, as well as all of the other things we need to roll, we need to roll a nine or a 10 on this die”, to “We need to roll a 10 on this die, as well as all of the other things”.
Arden Koehler: Just to make this concrete and make sure that I’m following, an example of what that might be a metaphor for, would be something like making the safety conditions on a lab much better, so that you have to get much more unlucky to get an extremely virulent pathogen escape.
Owen Cotton-Barratt: Right, it might be that one of the particular safety measures, because you might think that, well, to have the virulent pathogen escaping, first of all, it needs to get out of this sealed container which, in theory, it shouldn’t be able to get out of. And then the negative pressure that’s used in the lab system has to fail. And then there’s a chain, and each of these maybe you have to be a bit unlucky. If you’re halving one of those, one of those steps, it halves that overall risk. And that is true independent of whether there were three of those steps or whether there were nine of those steps.
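A small Python sketch of the point about chains of safeguards. It assumes the steps fail independently, so the overall risk is the product of the per-step failure chances; the 10% figure per step and the name `escape_risk` are hypothetical, chosen only for illustration.

```python
from math import prod

def escape_risk(step_failure_probs):
    """Overall risk when every independent safeguard in the chain must fail."""
    return prod(step_failure_probs)

for n_steps in (3, 9):
    steps = [0.1] * n_steps                 # hypothetical 10% failure chance per step
    improved = [steps[0] / 2] + steps[1:]   # halve the failure chance of one step
    ratio = escape_risk(improved) / escape_risk(steps)
    print(n_steps, round(ratio, 6))         # the ratio is one half in both cases
```

Under these assumptions, halving any single step halves the overall risk whether the chain has three steps or nine, even though the absolute level of risk differs greatly between the two chains.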
Owen Cotton-Barratt: So this really gives us a reason to prefer to avoid working on the steps where it’s overdetermined that something bad will happen. But it doesn’t particularly push us towards preferring the things where it’s strongly overdetermined that bad things won’t happen. I mean, even if there’s only one of those steps, if you’re halving the risk, it’s still just as good. And so for the stages where we’ve never seen them, we don’t have the repeated evidence that it must be overdetermined that it’s safe most of the time. So we’re going to be falling back onto more of our prior models of, “Oh, how hard would this be?” But a lot of the time, we’d be like, “Oh, yeah. It’s pretty plausible that we could avoid this”.
Owen Cotton-Barratt: And as long as we think that there’s a good chance that we’re in the domain where we could be having a significant effect on the risk, then that could still be a pretty good bet to invest in. And maybe it’s modestly worse on account of maybe it’s overdetermined bad. And this is different if we actually have a fairly strong looking argument that, oh, no, it’s going to be really hard to stop it at this stage.
Risk factors [00:38:57]
Arden Koehler: So one thing that conceptualizing extinction risk in terms of particular disasters that could get through these different layers, and the defense layers that keep them from doing so, suggests is seeing things that strengthen the defense layers as interventions for reducing extinction risk, and seeing preventing things that weaken the defense layers as a way of reducing extinction risk, even if they’re not addressing one of the particular extinction risks.
Arden Koehler: So I think in the paper you called the ways that the defense layers might be weakened ‘risk factors’. Do you want to talk about some risk factors that there might be, some examples to illustrate this idea of things that can make the defense layers weaker?
Owen Cotton-Barratt: Yeah. I mean, one thing might be, if some large fraction of normal societal functioning is falling apart, if we do have some big disaster which is not causing extinction, but now we don’t have functional governments. Maybe that would mean whatever the regular governance function that was being provided there, might not be in place. And whatever kind of practiced function was being provided by that, which was hopefully going to take part in the prevention or response functions might be harder to do.
Owen Cotton-Barratt: I mean, you can imagine how much worse the world would be at pandemic response if we didn’t have any international communication, and we didn’t have anybody who was saying, “Oh yes, it’s my job to think about pandemic response and what we should be doing, and … ”
Arden Koehler: Yeah. Or we were at war or something? Which screws up communication and incentives.
Owen Cotton-Barratt: Yeah. I think incentives can just be a fairly big part of the system. And it might also be that one can have laws at a national, international level, which help to create or undermine the incentives, which will reinforce these defense layers.
Arden Koehler: So we’ve talked now a little bit about the model and some of the sort of things that it makes salient. Are there any weaknesses of the model or the approach? Any things that you think it obscures or anything that you think could be improvements?
Owen Cotton-Barratt: Oh man, there are definitely a bunch of weaknesses. I’m definitely not aware of all of them. I’m probably not even aware of all of the important ones.
Owen Cotton-Barratt: There’s a sense in which the world and all of these different possibilities of things that might happen, it’s just big and messy. And when we build some model like this, we’re focusing attention on some aspects of it. And because attention is a bit of a limited resource, it’s pulling attention away from other things. And so if we say, “Well, we want to analyze everything in terms of these abstract defense layers,” it’s pulling attention away from, “Okay, let’s just understand what we currently guess are the biggest risks,” and going in and analyzing those on a case by case basis.
Owen Cotton-Barratt: And I tend to think that the right approach is not to say, “Well, we just want to look for the model which is making the best set of trade offs here”, and is more to say, “We want to step in and out and try different models which have different lenses that they’re bringing on the problem and we’ll try and understand it as much as possible from lots of different angles”. Maybe we take an insight that we got from one lens and we try and work out, “Okay, how do we import that and what does it mean in this other interpretation?”
Owen Cotton-Barratt: So I think that the kind of weakness that this model has is it’s trying to be quite abstract. And it’s trying to say, I mean, I often don’t know that much of the details of the empirics, but in this paper, we’re really trying to hold our hands up and say, “Okay, let’s just assume we don’t know anything and we want to do the kind of analysis that we can without making particular assumptions on that.”
Owen Cotton-Barratt: One advantage that has is its robustness. This analysis won’t then be defunct if we later realize that we were wrong about a key empirical assumption. It can also just be helpful to notice which things you can say at what level of generality. Like some of the points we’re making, I’m quite certain had been made about particular risks people were concerned about, just in relation to that particular risk.
Owen Cotton-Barratt: And that is a pretty reasonable thing to do, but I think it can be helpful for giving people a clearer understanding of what’s going on to notice when the feature is attached to the particular risk versus when it’s because it’s an instance of a larger class of things.
Arden Koehler: So an example of that would be this point about cutting in half the probability of it getting from bad to worse, but that’s like true of everything.
Owen Cotton-Barratt: Right.
Arden Koehler: Okay, cool. Yeah. It seems like the framing of let’s think about these particular extinction risks, like maybe risk from unaligned superintelligence, maybe risk from a pandemic. Those make salient the specific causal pathway or something of this particular risk, whereas the model that we’ve been talking about makes more salient the things that we do as a society to keep things from getting from bad to worse.
Arden Koehler: And maybe that’ll be more useful if certain interventions are useful for lots of different particular risks. Like, “Oh, if we just have a really robust system of international cooperation, then that’ll keep a lot of things from getting from bad to worse”. And the extent to which things require really different interventions is the extent to which the other model is probably more useful for thinking about them than this one.
Owen Cotton-Barratt: I also think this is useful for helping us spot things where we don’t really fully understand the risks yet. I mean, there are degrees of fully understanding a risk, but I think that we know more about how things might go really badly with pandemics than we do with unaligned superintelligence.
Owen Cotton-Barratt: I’m a little hesitant even to talk about unaligned superintelligence because I worry that it’s putting attention on the wrong thing, or it’s trying to bake in assumptions about what might happen with AI. I spend some of my time thinking about how the future of society might unfold and how powerful AI systems might be a part of that, and I do think there may be some pretty significant risks as part of that and I think it is important to understand. But maybe only a small fraction of those bad scenarios would be well characterized by the cartoon of we have a single superintelligent system and it has a goal which we don’t like, and therefore it does bad things to us.
Owen Cotton-Barratt: And if the best handles that we have are to say, “Well, this is the best story we’ve got so we’ll just do analysis on the basis of that”, it may be harder to notice which bits of the arguments are robust to having a shifting understanding of what the risk is than if we say, “Okay, we’re going to do a top-down analysis and try and say, well, what would need to happen to get these very bad outcomes?” And so try and triangulate what might the space of risks here look like?
Arden Koehler: I see. So this might actually help us come up with more possible causal pathways that things that could be classed as risks from AI might take.
Owen Cotton-Barratt: Yeah. I think that’s right. It gives a little bit of a framework for thinking, and so it can help us direct our attention to, “Oh, but could something fit through this and this and this?” And then maybe that lets us spot things we hadn’t been thinking of.
Owen Cotton-Barratt: I mean, I don’t know how optimistic to be about finding anything that’s particularly important. I also think that if there are particularly important ideas, often it is kind of overdetermined that we would come to them eventually and that there’s like just different ways of finding it. But things which help people organize and tidy their thoughts about important topics, I think, can just help them think more clearly and may make it more likely that they reach important insights faster.
Owen Cotton-Barratt: And so that’s a bunch of the motivation for work of this type. And it’s hard to predict exactly what the consequences of any particular bit of work like that will be, it’s more placing a bet that, on average, this is a good style of thing to be doing.
How likely are we to go from being in a collapsed state to going extinct? [00:48:02]
Arden Koehler: Okay. I’m going to ask a question that maybe you’re not going to have an answer to, or you won’t want to answer because it’s more about some empirical beliefs.
Arden Koehler: So, thinking about this stage that humanity might someday get to, hopefully not, where we’ve had a really huge disaster and we want to hope that that won’t cause extinction. Sometimes you will talk about societal collapse causing this and we end up in this really vulnerable state where maybe we have a lower population because there’ve been a bunch of disasters that have caused conflict or disease and institutions have broken down.
Arden Koehler: Do you have any guesses of basically how likely it is we do go from being in a sort of collapsed state to going extinct? Maybe putting aside risks that are more agential, where in the cartoon of the unaligned superintelligent AI, something is trying to go around and actually ensure extinction.
Arden Koehler: But we’re just going along, we’re in kind of a now practically pre-industrial state because there’s been some huge disaster. Do you have any guesses about how likely it is we go from there to going extinct?
Owen Cotton-Barratt: I think it’s an important question and I have some guesses, but I don’t know how much to trust them. It’s the kind of thing where actually I would feel pretty good about some people trying to think seriously about this for longer. I think it is plausible that the correct number is pretty big, like more than 10%. I also think that it’s pretty plausible that the correct number… Well, what I mean by correct number is maybe where I’d end up on this if I spent several years looking into it and had a team of really competent researchers at my command looking into some questions for me.
Arden Koehler: As we hope maybe somebody will be in that position.
Owen Cotton-Barratt: Yeah. I mean, it could be pretty good. But yeah, I think it’s plausible that maybe the risk I’d end up thinking there is below 0.1%.
Owen Cotton-Barratt: And there’s questions both about what will be technologically feasible for a society in this state, like economically feasible. Would we be able to recapitulate the industrial revolution? Some people have argued that maybe this will be hard because the fossil fuels will all be burnt up. On the other hand, it’s likely that humanity would be approaching things from just a pretty different state of the world.
Owen Cotton-Barratt: For most disasters we’d be imagining, there would still be a bunch of metal which has been lying around in these human artifacts, which has already been refined and made into things. And the type of technology that you’re maybe able to build, starting with scavenged things from that, will look rather different than starting from scratch.
Arden Koehler: Also, in at least most of the scenarios that I’m imagining, there’s some knowledge still sitting around. There’s at least books and people can read them about how to do various things.
Owen Cotton-Barratt: The other half of it, aside from the technological and economic, it’s just the societal and maybe almost the political side of things. There’s a question of how would such a society be organized?
Owen Cotton-Barratt: And it feels like at least over the last few hundred years, there have been quite a lot of elements of governance which pushed towards things like, as mentioned, the defense layers in these papers, which try to get good things for people and avoid bad things.
Arden Koehler: Things like outlawing lead paint?
Owen Cotton-Barratt: Things like outlawing lead paint. Things like also pushing against corruption and trying to say we should build legible institutions and we should hold people in power accountable, and we should not let people abuse other people.
Owen Cotton-Barratt: These are all pretty great, in my view, fruits of civilization. And in this world where we imagine civilization has collapsed, there’s a question of what has happened at the social level? And there are some versions of how it could turn out, where I feel much more optimistic than others. And there’s questions of what feeds into the level of wisdom that a society is embodying, which I am pretty interested in, but I don’t think I fully understand.
Arden Koehler: But I guess some disasters could be more wisdom preserving or producing than others?
Owen Cotton-Barratt: Right. Although it seems hard for me at the moment to even predict which ones would be. If there was a nuclear war, I can imagine that might make everybody extremely hostile to each other and might reduce trust. Or it could give humanity the sense of “Gee, war is really bad. We really need to avoid that” and actually create a large coming together and cooperation. And it feels like this might be a really important dimension for affecting how things go for the trajectory of a society which is trying to rebuild.
Arden Koehler: That’s interesting. I hadn’t thought about that. So is this just like another thing that can be different between different imagined collapsed societies which could be really important for whether they’re able to rebuild?
Owen Cotton-Barratt: My guess is that this is the main one or something.
Arden Koehler: Oh yeah, really?
Owen Cotton-Barratt: Sorry, another thing is just what has happened to the environment? And if the society has collapsed and the sun has been blocked out such that it’s hard to grow crops or something, maybe it’s just a fairly hostile environment for humans and it’s going to be much harder for us to get anywhere. And maybe we dwindle and die out over a period of hundreds or thousands of years.
Owen Cotton-Barratt: On the other hand, if the environment is not too damaged, but we’d just experienced societal collapse, then from my position of somewhat limited ignorance, somewhat limited knowledge, I should say, I feel moderately optimistic that purely technologically, it would be feasible to have another industrial revolution and get back up to something like we’ve managed now. And then whether socially we would get there, I have more question marks.
Estimating total levels of existential risk [00:54:35]
Arden Koehler: So before we leave the subject of existential risk, I want to ask you a question about estimating total levels of existential risk. So as I said, we’ve had Toby Ord on the podcast. We’ve also had Will MacAskill. They have really different estimates of what’s the total existential risk of this century. So I think Toby’s estimate is something like a one in six total chance that we go extinct, and Will MacAskill’s is something more like one in 100.
Arden Koehler: So it seems like there are a lot of sources of uncertainty that are basically empirical questions about specific risks and how serious they are. Maybe even broad questions about how good society is at keeping things under control in general. But there’s also this question of how we should even be thinking about this stuff.
Arden Koehler: And one of the questions is, what should our prior probability of us going extinct this century be before we even consider any of these specific arguments about for instance, AI risk or risk from pandemics?
Arden Koehler: And that’s another thing where people disagree. So Will thinks that our prior probability should be really low because of all of the centuries that humanity has lived and might live, it would be surprising if this was the one that this extremely important event happened, whereas Toby thinks that that’s not the right way to think about it. Do you have thoughts on that debate?
Owen Cotton-Barratt: It’s a little hard without talking to them, but I suspect that I’m closer to Toby’s position on this. In particular, with the question of how surprising would it be to find that we are the most ‘something’ among a reference class, I think we should ask what the generating process is for the one that we’re looking at.
Arden Koehler: Sorry, the what that we’re looking at?
Owen Cotton-Barratt: I mean, in this case, maybe the century that we’re looking at. We should ask, what led us to be asking about this century here? And if there isn’t some kind of common mechanism which could be leading to this being an outlier on the relevant variable, and this being the one that we happen to be asking about, then I think it’s quite appropriate to be saying, “Yes, we should use this baseline prior of it would be quite surprising if it happened to be the most something, something”.
Arden Koehler: But you’re thinking that the thing that explains why we’re asking about this century and the thing that would explain why it might actually be a really important century, or the most important century are like the same thing. Like, here we are at this age of technological risk, maybe lots of people think, and so that’s what’s leading us to think about this.
Owen Cotton-Barratt: I think that there are different versions of this that you can get into. I think that that is one. I think that you could also say, “Well, we think that this might be a particularly important century because the power of technology has been increasing over time and so the importance of centuries has been increasing over time”.
Owen Cotton-Barratt: And then you might think that this century, at least, would be more important than the previous ones. That there’ll be like some monotonically increasing function with that. And then maybe you could also think that that dynamic won’t apply forever. Either we in fact will go extinct, we’ll wipe ourselves out, or we will work out how properly to govern powerful technologies and how to handle them responsibly and wisely. And then after we’ve worked that out, maybe things will be good.
Arden Koehler: Yeah. I guess it seems like that is another big thing that drives people’s different estimates here: “Do we just need more time to figure out how to keep pace with technological development, keep safety mechanisms and keep becoming wiser, I think in Toby’s words, with technological development, such that this is an especially dangerous time?” Or is it like, “Well, things are just going to get riskier and riskier and riskier forever until we all die”?
Arden Koehler: And if you think the latter thing, then you’re going to think like, “Well, yeah, this is an especially risky century, but the next century is also going to be even riskier.”
Owen Cotton-Barratt: Right. If you do think the latter, then this could well be the riskiest century ever, but it will just continue until we die. And if we’re in that world, we probably don’t have a huge number of centuries. And if you want to do the reference class forecasting, you wouldn’t necessarily want to count all of the centuries in the past to say, “Well, there’s been so many centuries,” because if they’re actually getting more dangerous, then the ones towards the end are where it’s all concentrated.
Owen Cotton-Barratt: Of course, it’s still unclear. It could be getting more dangerous, but the absolute levels are still quite low and they could remain quite low for many centuries whilst still getting more dangerous. And then eventually they get higher and eventually we get unlucky.
Arden Koehler: Yeah. So I guess one point that people have made is that if we are in the case where things are just getting more and more dangerous forever, then it’s not the case that preventing extinction in the next few hundred years would be as valuable as it would be if we thought extinction risk was then going to get low again later. So Toby makes this point, because it’s like, “Well, there’s fewer generations, in expectation”. So I guess you might think in that case, that’s a world where we can have less impact by reducing extinction risk anyway?
Owen Cotton-Barratt: And if you aren’t sure which of those worlds we’re in, then there’s a bigger prize to be played for in the world where actually we could get into a state where things are stable and go on for a long time. And so I think that even if you’re quite uncertain between those, because the payoff is so much bigger, it might be worth focusing most of our attention on the more kind of optimistic world.
Owen Cotton-Barratt: I guess this is a principle I actually learnt several years ago when I used to play Bridge. And this would be like when you’re playing a contract, if you realize, “Oh, well we’d have to be pretty lucky for the cards to be such that this could even make”, but the thing that you’re trying to do is to make the contract, then you should just assume that the cards lie however they need to be such that you can make it, because that’s the thing which actually gives you a chance of getting out.
Arden Koehler: Yeah, interesting.
Owen Cotton-Barratt: If the probabilities are just going to keep on ramping up forever, then we’re definitely doomed. So we should assume that that isn’t the case for getting our chance of having a pretty large and exciting future. And I am pretty excited about things humanity could go on and do if we get enough time and space.
Arden Koehler: So is it accurate to say, “Well, the reason that we want to put that aside is because it just matters less what we do if that’s the case. So that’s why we want to not be thinking about that world”.
Owen Cotton-Barratt: That’s right. And it’s not that it matters zero in the world, and so I wouldn’t want to totally set it aside. But I do think that that is a reason to skew towards the worlds where we’re not inevitably doomed.
Everyday longtermism [01:01:35]
Arden Koehler: So let’s move on to another topic, which is a sort of new project that you’ve been thinking about.
Arden Koehler: So we’ve talked a lot on this podcast about longtermism, the view that what matters most about our actions is their effect on how the long term future goes. But this still leaves open a million questions about what we should do, even if we thought, “Okay, yeah, I accept longtermism. Now, what do I do?” And many of the answers that people have been discussing have been in the genre of trying to reduce existential risk, because there’s a good case to be made that that’s one of the best ways we can predictably make the long term future a better place.
Arden Koehler: But oftentimes that’s this really specific action. “Oh, we should build this committee. We should do this sort of intervention”. So you have this project, everyday longtermism, where you try to zoom out and ask, “Okay, let’s say longtermism is true; what should everybody do, or most people do if longtermism is true?” Let’s say they’re not placed in one of these particular situations where they can reduce extinction risk or do something else really useful. Do you want to just talk about that project a little bit?
Owen Cotton-Barratt: Yeah, sure. So this is an ongoing project and it started by thinking, “Okay, let’s not ask what the best things to do in the world right now are”, but what would it be good for people to do if we had tens of millions of people saying, “Yeah, this longtermism makes sense. I’m on board. What shall I do?”
Owen Cotton-Barratt: And I realized that I didn’t have a great answer to that. And other people I talked to also didn’t seem to me to have great answers to that. And this felt a bit surprising to me. I kind of feel like the moral arguments for the importance of the long term feel pretty robust. And it doesn’t feel like, “Oh, only a few people should end up buying this”. It feels like, yeah, lots of people should say “That makes sense, I would want to join this”.
Owen Cotton-Barratt: And there’s kind of a feeling of, in developing the ideas, there’s a responsibility to try and ask, “Where is this going? What would it look like properly, if we were getting to the point where there were tens of millions of people who were engaged with this?” And so that was one of the questions I was starting with.
Owen Cotton-Barratt: And then I’ve also been trying to ask slightly abstracted and generalized forms of the question of, what is good to do? Say, abstracted in ways such as, “Assume we knew fewer empirical things about the world–
Arden Koehler: Even than we already do?
Owen Cotton-Barratt: Even than we already do, and see what you can get out of the analysis like that”. I mean, it’s kind of a similar philosophy and approach as with this paper I recently wrote on understanding the risks of human extinction, where it was trying to say we won’t look at the specific examples, we’ll try and understand in generalities, and maybe that will give another lens which is useful and will better our understanding.
Arden Koehler: Okay. So before we move on to discussing this more, just on the question of, or the idea that it might be surprising to ask people, “Okay, well, what should tens of millions of people do if longtermism is true?” I guess I feel not that surprised that they don’t have an answer. So one reason that I’m not surprised is that you might expect, well, they should do tens of millions of different things. It depends so much on their situation that I can’t give an answer to what they should do.
Owen Cotton-Barratt: Right. So of course there’s no way that we’re going to be saying now actually all of the details of what tens of millions of people should be doing. And I’m sure that if we get to a world, and I hope we do, where tens of millions of people think this is important, there’s going to be a lot more brain power on it. And there’s going to be a bunch of really smart people and they’ll come up with things that I’m not even dreaming of. And I expect some of them will be great. But there’s still something about having an idea of where it might be going which seems kind of important to me. I think that some people at the moment might hear, “Oh, you think longtermism is important. And you think that that means we should do research on aligning AI systems”. And they might think, “Well, obviously tens of millions of people should, if your argument is right, tens of millions of people or more, maybe hundreds of millions, billions of people should be into this and buy onto that”.
Arden Koehler: And that’s basically because they should be convinced by the strength of the argument.
Owen Cotton-Barratt: Yeah. Because they should be convinced by the strength of the arguments. But if the answer is what people should do is work on AI alignment research, it would be stupid for billions of people to work on AI alignment research. That isn’t how research communities are going to work well. Also, we’re trying to prioritize that above growing food to feed our communities, above educating our children. It just seems like obviously this has been dropping some extremely, extremely important things. Something has gone wrong. Maybe this entire idea set was bunk if it was leading to that implicit conclusion. And nobody of course is saying that we should have billions of people working on alignment research.
Owen Cotton-Barratt: But if what they’re saying is longtermism is important and they’re saying, what can you do about longtermism? Well, the thing we want is people working on this, then that’s an inference that people might draw. And I don’t think they’ll be drawing it explicitly, but kind of implicitly feeling like, “Oh, something doesn’t feel right about this”. And so I think there’s something about just trying to map out, “Okay, what is the world that we’re aiming for in broad brush strokes?” And how do we imagine this is going to balance between the different kinds of things which are important for the world?
Arden Koehler: So it sounds like one way of framing the question is what’s some advice that would be good advice for everyone who wants to make the long-term future go as well as possible? Does that seem right?
Owen Cotton-Barratt: Yeah, that’s right. And I think that that should be, to some extent, regardless of their specific situation, also regardless of some assumptions about empirical details about the world. I think it’s kind of good for us to have a sense of what type of advice would be generically good among a bunch of possible worlds we could find ourselves in. And then we’ll have maybe a better sense of when we specialize down to the world that we’re actually in, which strategies are particularly robust to certain empirical uncertainties, which things are particularly hinging on variables that we are uncertain about and are trying to discover more about.
Arden Koehler: Yeah. I guess another thing that sort of came to mind for me was, well, what’s some advice that won’t change over time? Even if it’s like, let’s say we knew everything about the world, what world we’re actually in, one might hope that if longtermism is a really important idea, it’s something that will last for a long time. And so what are things that people in the future who will be facing really different circumstances? Is there anything we can say to them?
Owen Cotton-Barratt: Right. And I think that I can imagine somebody saying, “Well, but we can just wait until we get there and then we can work out what the advice should be then”.
Arden Koehler: That does feel like a natural thing to say here.
Owen Cotton-Barratt: But I think that there are a couple of reasons to like working out what the advice would be even now. One is after you start saying things to people, it gets repeated, it gets spread. And so in fact, maybe there is some kind of way where we’re speaking to future people. There’s lots of sayings and aphorisms which have kind of passed into folk wisdom today and inform how we go about things which were generated decades or centuries earlier. And so I do think there’s a real mechanism there.
Owen Cotton-Barratt: There’s also a sense of trying to understand important things from lots of different angles. And so we can look at what seem to be particularly high priority things in the world as we see it today with all of the empirical details and then say, “Well, this seems to be a thing to do.” But another way where we can get some sense of what might be good is to try starting with an abstracted analysis and then add in details and try and poke it at each different point. I guess an assumption behind why that might be a good idea is that I think it’s really hard to work out what is actually going to be good for the long-term future.
Owen Cotton-Barratt: I have spent several years thinking about what’s going to be good and I have a lot of uncertainty. I know a bunch of other really smart, well-meaning, clear thinking people who’ve spent a lot of time thinking about this. And I think that basically all of them think, “Yeah, it’s hard to know which directions are actually going to be good”. And they do have guesses and they have guesses that they think, “Well, this is significantly better than chance. It’s worth betting on, it’s worth doing”.
Arden Koehler: For specific actions, you’re talking about.
Owen Cotton-Barratt: For specific actions which will be good. But it isn’t like, “Okay, we’ve got it all worked out. We know which actions are going to be good now. Later we’ll calculate it again. Then we’ll know which things are going to be good then and that part of the process is easy”. I think that part of the process is hard and I think it’s kind of the best game we have and so we need to be playing it. It’s not like, I don’t think we should be throwing our hands up and say, “Well, we can’t say what’s going to be good, so we just won’t do anything. Or we’ll just follow our hunches and not try and do careful reasoning about this”.
Owen Cotton-Barratt: But I think that when we’ve got something that we have relatively poor insight on, then trying to view it from as many angles as possible might help us triangulate what are better ideas. And so I think that trying to say “Let’s generalize, or abstract away from particular people in particular circumstances in the world here, work out what the general answer is”, may help improve our intuitions about which local or specific things are good to do.
What should most people do if longtermism is true? [01:12:18]
Arden Koehler: Okay, cool. So let’s get some specific… Okay, not that specific. Let’s get some ideas on the table for answers to the question, “What should most people do?” And then we can see how they feel like they might be useful for informing us when we’re trying to figure out what particular actions to take. So you gave an answer in this draft that I read having to do with the web of virtue or this idea that we should create and promote virtues that are useful in this regard. Can you just talk a little bit about that answer?
Owen Cotton-Barratt: The thought is if the future is this foggy sea of uncertainty and we think, “Well, we’d like to have good things”, but we don’t really know, we can’t see the details of the path. We can’t map out exactly where things are going to have to be, to get better things in the future. And I asked myself what properties of the future would make me more robustly optimistic around how things are going to unfold from some fixed point in the future? What can I vary there that I’ll feel better about? And maybe there’ll be some important points in the future where things could unfold in a way that’s pretty good or in a way that’s pretty bad.
Owen Cotton-Barratt: I feel better if the actors involved, and I’m thinking kind of people and institutions, are well-meaning and paying some attention to how will this affect the long-term future. And we’d like good outcomes for that please. And kind of fairly clear thinking and trying to be sensible about, well, what should weigh into the decision here and how should this go? And have instincts towards cooperation because I think that sometimes bad things happen in the world where there’s antagonism and there’s a race of people perceiving hostility and so being hostile in turn, and this can–
Arden Koehler: Yeah. Failures of coordination that lead to everybody being worse off.
Owen Cotton-Barratt: Yeah. And so I think, “Okay, well…” And there’s a host of good properties like that. And I also have some guesses as to these, but I’m in the process with a colleague of trying to do a bit more detailed analysis mapping out which of these good properties might be particularly important. And then after we have a sense of what would be good to have there, near wherever these critical decisions are being made, you might think, “Well, it’s good if people can promote these good properties such that they are more likely to end up near the important decision points in the future”.
Arden Koehler: So connect that for me to the answer of the question of what everybody should do if longtermism is true.
Owen Cotton-Barratt: Yeah. So I think that if longtermism is true, my tentative answer is that people should, first of all, try to be good citizens themselves. And I mean, citizen in some broad way. I don’t mean citizen of their country. I don’t even mean citizen of the world as it is today. But I mean, citizen of humanity that’s stretching backwards and forwards through time and looking for being sufficiently helpful for others around them and trying to have good decision-making processes.
Owen Cotton-Barratt: And secondly, trying to spread those properties and trying to encourage others to do the same and to spread some of those same properties as well. And then I also think there’s something about sometimes it’s right to be more strategic about it and to say, “Okay, I actually have a good understanding of how things might turn out here. And so the fog of uncertainty about the future is clearing a bit. We should follow a particular path”. And be prepared to do that when the fog clears enough that that’s the right thing to do.
Arden Koehler: So does this depend… The way that I introduced this section was, “Well, let’s say that longtermism is true. Then what should everybody do?” But in some sense, it seems like, well, everybody should try to be clear thinking and exhibit these virtues. That might be an answer to the question of just even if longtermism isn’t right and it’s not the case that the main, most important thing about our actions is their influence on the long-term future. People should still exhibit these virtues and spread these virtues. Maybe that’s still what everybody should do. Does that seem right?
Owen Cotton-Barratt: I mean, I think there will be significant correlation between the answers to what is good to do if longtermism isn’t true and what is good to do if longtermism is true. I think that a lot of these properties that I want come down to, well, important people or institutions will make good decisions. And that will often be good for the society of the world of the time, as well as the world stretching into the future. I think that there’s an extra ingredient from longtermism where you want the people to pay attention when they’re making the decisions to the implications on the longer term future, and not just on the local consequences. So I do think that there is a bit of a deviation, and I think that this is an important ingredient that longtermism is adding. But my guess is that you would get to a similar answer and that there’s a similar generating mechanism for why you would get to a similar answer.
Arden Koehler: So it sort of adds a virtue. You might think that it wouldn’t be a virtue to think about the long-term consequences of your actions unless longtermism was true.
Owen Cotton-Barratt: I think that one generator that we can also ask to say what is good for people to do, is what do we wish that people hundreds of years ago had done? If we could go back and give advice and say, “Hey, you want to be good. What should you do?” And so if you didn’t care about, I mean, there’s a kind of strong… longtermism, where you’re really thinking well, in the very long term, if we’re looking at hundreds of millions, billions of years, how the future is going to unfold, that’s one thing. You might not buy that, but still care about the world several centuries from now. If you don’t care about the world even a few decades from now or care about it much less than the world of today, then this strategy of indirect spreading of good decision-making hoping that it will lead to better first order things in a bunch of places that aren’t visible to you down the line will feel less attractive to you than just trying to do good things now.
Arden Koehler: Right. So it’s like a form of investment versus acting now to make things better for people. And then this is like, well, maybe you make things a little bit better for people now, but most of the effect of trying to cultivate these societal virtues, institutional virtues, it’s going to be in affecting people who happen to make a big difference with their actions later in the future.
Owen Cotton-Barratt: Yeah. Or affect people who affect people who affect people who, and at some point you want the change to end there, but yeah.
Arden Koehler: So you mentioned a couple of these virtues that you’re imagining: clear thinking, thinking about the long-term consequences of your actions. Are there any others that are worth highlighting that aren’t just super common sense?
Owen Cotton-Barratt: So again, I think there’s going to be quite a lot of correlation with what we think is good from a common sense perspective. I think that there will be more focus on trying to have good, clear understanding of how things might go and truth seeking. I think that we do have a common sense idea that truth seeking is good, but I think it’s particularly important when we’re dealing with a domain where understanding what is true is so difficult and that having better social norms around that could also be particularly valuable.
Owen Cotton-Barratt: I also think that trying to help cultivate the virtues in others might be a little neglected. I think that common sense sometimes says, “Oh no, you shouldn’t be too much of a busybody and meddling with others”. And there’s definitely something to that, but I would love it if we could work out what are good strategies for people to take. How can we most help these things spread without promoting a backlash, without being too much of a meddlesome busybody or whatever the thing is.
Arden Koehler: Yeah, so I guess common sense morality sort of says what really matters is how you act, and maybe your emotions. And it sort of focuses on you and what’s going on there. Whereas if longtermism is true, then it’s going to matter much more the sort of indirect effects of your actions on other people and the propagation of those effects. So I can see how it might naturally suggest that we should be a bit more concerned with how other people act than common sense morality says that we should.
Owen Cotton-Barratt: Yeah. I think there are some other virtues which are maybe not so baked into common sense morality. I think scope sensitivity and thinking that things which are a bigger deal in absolute numbers should be a bigger deal for our decisions as well.
Arden Koehler: Can you just give an example of that?
Owen Cotton-Barratt: Yeah. Maybe you think you could do something which affects the policy at your kid’s primary school versus something which affects national education policy. And national education policy is going to affect a lot more schools than something that’s local, but there’ll be some disadvantages of that as well because it might be that centralized policies have issues of not being necessarily properly in touch with things which are happening on the ground. But you should at least be cognizant of the fact that it’s a much bigger deal and it’s maybe easier to do something which is bad. And I think it’s also important to be aware of that. But I think that the orientation towards “What is actually a big deal?” hasn’t made it into common sense morality, or at least not as a strong component, but I would like that to be a significant component of good decision-making as I see it.
Arden Koehler: So that’s like one of the things that I feel effective altruism is primarily about is to care much more about things that are affecting many more people or that’s at least one of the things that it’s about.
Owen Cotton-Barratt: Right. I think that is one of the things that it’s about. And I think that one of the reasons that the message of effective altruism has been resonant for a lot of people is something like, “Well, this is a component which is missing in common sense morality”, and so they think, “Oh yeah, that thing. We need more of that thing”. And they’re right, and it has been missing a lot. But that’s not the only component of what matters. And if that had become common sense, then I’m not sure how much we’d be promoting it over a lot of other different pieces.
Arden Koehler: I see. So it matters a lot on the margin because it’s not as present as it should be right now.
Owen Cotton-Barratt: Yeah, that’s right. But again, there’s a question of, I think sometimes it’s really good to be in the mode of what is most important at the margin. But it’s also, I think, important to sometimes step back and say, “Okay, forget about current margins. What is important in absolute terms?” And I think one of the reasons that that is a good exercise to do is, I mean, I think it can help you orient towards where you’re trying to get to. And so it can help you notice when you should be shifting direction a bit. I also think that just by being explicit and doing the marginal and the non-marginal analysis separately, you can stop people conflating the marginal analysis with the non-marginal analysis.
80,000 Hours’ issue with promoting career paths [01:24:12]
Arden Koehler: Yeah. So one problem that sometimes I think we’ve had at 80,000 Hours is we’ll promote some career path as really promising, but people are sometimes, they’ll object to the idea of really large numbers of people doing that. And we’re like, “No, we were just talking about on the current margin”, and maybe if we did more thinking about, “Okay, but what’s the big, final distribution of what people are going to be doing that we think would be really awesome”, we would be able to head off that misunderstanding more?
Owen Cotton-Barratt: I would love it if you guys did that. I think that it could be really good to talk about, “Well, here is the picture of where we think we should be going and here’s our current picture”. And of course, there’s going to be a lot of uncertainty about it. And we expect to improve our understanding with time, partly, just as if we write down something explicit, people will tell us where we’re wrong and that’ll be great. And then connect the pushes at the margin to “Here is where we think we should be. Here’s why we think we’re particularly underinvested on this one at the moment”. And then that could make it easier for people to really get the sense of that.
Arden Koehler: Okay. Yeah. Well, I’ll put that on the to-do list, figure out what the final picture is supposed to be.
Owen Cotton-Barratt: I mean, of course it’s not going to be the final picture, but we could be trying to project out where we would like to be in 15 or 20 years if we had a lot of people on this stuff. And I don’t want to encourage hubris of say, “Well, 80,000 Hours, you should really imagine everyone in the world will listen to you and–
Arden Koehler: Once the podcast gets to seven billion listeners. Yeah. But we could maybe imagine, “Well, what if many more people listened?” And that is something that we think about. But I mean, oftentimes, it comes hand in hand with trying to figure out what the marginal thing is. Because it’s like, “Well, how do you figure out what is best on the margin?” Part of it is going to be, “Well, what helps you get to this picture that you think is better than where we are right now?”
Arden Koehler: Going back to this proposal that people should basically try to exhibit these virtues, maybe try to encourage them in others, and that that might be, at least at a high level of abstraction, a good answer to the question: what should everybody do if longtermism is true? One worry I have about this is that it seems like in order for this to be a good answer, we have to be pretty uncertain about stuff in general because if we really understood the world really well, the answer would just be do the action that has the best consequences in the long term.
Arden Koehler: But, if we’re really uncertain, why think that it’s going to be easier to come up with good answers to the question “What is a virtue as opposed to a vice?”, than it is to come up with the specific actions? We say promoting honesty is a virtue, but I don’t know, maybe we’re not thinking through the consequences of what it really would be like if everybody was super honest. Is there a reason to think that we’re going to be better at coming up with what the virtues are than we are at the other thing?
Owen Cotton-Barratt: I don’t think that we should be trying to have a final set in stone answer of what the virtues should be. I see this as an attempt to pass the buck to our future selves or a broader group of future people. And who do I want to pass the buck to? I’d like them to care about the truth, I’d like them to care about doing good things, and I’d like them to be thinking clearly and sensibly about stuff.
Arden Koehler: And maybe be scope sensitive.
Owen Cotton-Barratt: Yeah. Maybe I count scope sensitive as thinking clearly and sensibly. These are pretty blobby…
Arden Koehler: Groups of virtues.
Owen Cotton-Barratt: Groups of virtues I’m pointing to, and then I’m like, can we get more people like that? What could we do to get more people like that? Thinking of this as a locally good direction, and there’s some kind of self-correction in the system that I would hope that we’d be setting up there because if the… This is the best guess I’ve got at the moment, and it’s a somewhat easier problem than what’s the best thing for the world as a whole, because I don’t need to engage with lots of the detail about things in the world. And then if there are people who are like, “Oh, now we’ve thought about this more and we’ve realized there’s an argument that actually we should be doing something else here”, then I’m hoping that the good decision-making properties, which I’m wanting to foster here, should enable them to properly follow that and pivot.
Arden Koehler: Part of what I hear you saying is that you are more confident that thinking clearly or being truth seeking is just going to be robustly good because it’ll allow you to come up with the right answers to lots and lots of questions, or something like that?
Owen Cotton-Barratt: Yeah. Generally I feel better about my ideas when I get more information and when I get to spend more time thinking about stuff, and when I think more, “Oh yeah, I’m being clear headed,” not, “Oh, I’m super confused about this.” That feels like one of the most robust things I can come back to; I’m like, “Is more clear thinking better than less clear thinking?” Yes, yes.
The existential risk of making a lot of really bad decisions [01:29:27]
Arden Koehler: We’ve talked a bunch about existential risk generally, and mostly about extinction risks, but one existential risk that I haven’t at least heard talked about as much is the risk that we just make a lot of really bad decisions or something. We don’t blow ourselves up, we don’t even go extinct, we just fail to take really great opportunities to make the world a much better place on an ongoing basis. Sometimes people talk about like, how could you possibly help prevent that? Maybe doing something like this, trying to just improve people’s general values and how they think about things.
Owen Cotton-Barratt: I do think that this could help with that. I guess when you say that, I wonder about versions where there’s an acute version. There’s a particular decision, like setting a constitution for the rest of our civilization for the rest of time, and if that goes badly, maybe that’s bad, versus just an ongoing version where people continue making bad decisions. In the continuous making bad decisions, I think we have to ask what is happening? Why wouldn’t things improve? Ask about not just how people are making decisions, but what are the processes that are feeding into people’s decision-making algorithms?
Arden Koehler: I guess you could answer that question like, maybe people mostly care about social status. If people mostly cared about social status, you could imagine that that would create many bad decisions from the perspective of making everyone as well off as possible because if they make someone better off in some way, then they might be inadvertently hurting themselves, and so there would be some explanation. But that would be a theory about people that I’m not saying I hold, but if you had that theory, you might think this was possible.
Owen Cotton-Barratt: Right, and then there’s a question of, if that were true then I think the strategy which I’m recommending of encouraging people to try and identify what are important virtues and then set things up to promote that, would involve people noticing that people care a lot about social status, and often making decisions on the basis of that, and this is going wrong in this way, and this way. And then, hopefully doing some thinking about how can we change that? How can we either make them care about things other than social status, or how can we make it line up so that they get social status for making decisions that we would think are good on other grounds?
Arden Koehler: It seems like this idea of promoting virtue is an answer to the question that I asked about, “What if we just all make a lot of really bad decisions over a long period of time”, but also maybe an answer to the question, “What if there’s some acute point at which we might do something bad without realizing it”, or something like that, because the whole point is that the better this works, the more it’s the case that everyone ends up with these virtues and then whoever’s writing the constitution does the best job they can.
Owen Cotton-Barratt: Yeah, that’s right. It’s trying to be a robust strategy, it’s trying to say, “Look, this is a good thing to do because it’s passing the buck to the future people who have more context on what the problems that they’re actually facing are,” hopefully they have more information. Some of the things which I expect will be encouraged are seeking out more information, and trying to build a better understanding of things. I also do think that sometimes people will have built clear enough models and have a certain enough sense of what is important to do that they should be chasing that directly rather than just trying to improve decision-making across the board.
Arden Koehler: Yeah, that actually leads me to the next question which is, are you thinking that this is actually something we should be doing more of right now? Or are you thinking, trying to promote virtues, trying to exhibit virtues, so that we get other people to exhibit them maybe by example, or is this an answer to the question, “What should we do once we’re done doing all the things we think are really good ideas from a longtermist perspective and we’re out of good…” I shouldn’t say out of good ideas, but out of specific ideas about how we can make things go much better in the long-term. Is this something we should be doing now, or is this a thing to do later?
Owen Cotton-Barratt: I think it’s a little of both. I’m not totally settled on my thinking about this, but it seems to me often some of the exhibiting and promoting virtues is a thing we can do in the course of whatever work we’re doing. We can look for easy opportunities to cultivate good thinking in others, and I feel good about that. We can try to do well ourselves, we can try and practice good decision making. Practicing good decision making would involve lots of making decisions, and that’s how we learn to do things.
Owen Cotton-Barratt: I think that a lot of the people whose work I think looks really good at the moment are often doing a bunch of this, and I’m not sure how much they’re explicitly conceiving of it like this, but they have intuitions of, “Oh yeah, it’s good, one should try and exhibit really clear thinking, good decision making”, and that just feels good to do, rather than they have an explicit model about why that’s going to be a good thing to do. I think that I would feel good about some people doing slightly more targeted things saying, “Okay, this is a major thing that I want to be working on among clear ideas for helping things in the future”.
Owen Cotton-Barratt: I also think that people have differing degrees of a sense of clarity about what would actually be good for helping, and I think that that’s appropriate, but it’s often that people who don’t currently have clear ideas should be more in the mode of, “Okay, well, I will continue to try and understand more about the future, and continue to try and embody good decision-making, and cultivate that and other virtues in people and in institutions around me.” I’d feel very good about that.
Arden Koehler: Do you think there are any trade-offs between cultivating these virtues in yourself and others, and taking more direct action to try to improve the long-term future? Or is this mostly about noticing that when we do that second thing, we also do the first thing, and the first thing is also valuable?
Arden Koehler: I’m especially interested in the question of trade-offs between trying to cultivate this web of virtue that propagates itself, versus doing something where the reason you’re doing it is just that you think, “Oh, I have this great idea for how to make the world a better place going forward into the far future. I’m going to try to execute on that idea right now”.
Owen Cotton-Barratt: Can you give me an example? Because I think part of my thing is my guess that for most of the examples that you’d be thinking of, I’d be like, “Oh, but actually that’s secretly kind of this thing”.
Arden Koehler: Okay. One thing that seems intuitive to me is that if I have a chance to get on some committee that’s going to set policy for AI, and I think AI is going to be really impactful and the policy will make a big difference, I should try to get on that committee and try to impact policy in ways that are helpful. Thinking about virtue doesn’t seem to come into it, and there might be certain times when I could try to focus more on being virtuous versus just trying to think about the end goal of which policy is actually going to be good. My intuition is I should do the second thing. Does that seem consistent with what you’re saying? Do you disagree?
Owen Cotton-Barratt: That seems consistent with what I’m saying. I think that actually in the case of this committee, presumably the reason you have intuitions of, “Oh, it’d be good to be on the committee”, is that you think the committee will be making better decisions if you’re on it than if you’re not on it. You think that you’d be better than the marginal replacement, right?
Arden Koehler: Mm-hmm (affirmative).
Owen Cotton-Barratt: And so maybe you think this committee is the kind of body in the world that you would like to have good decision-making properties, and you’re like, “Oh, I have some of those that I can bring here.” But then also when you’re on the committee, the thing that we’re often wanting to do is be… Probably for most committees which are deciding things around AI, I wouldn’t imagine that the committee’s decision directly affects how the future unfolds. Maybe in some cases there will be some things, but I think there’ll also be a bunch of things which either constrain other people’s future decisions, and maybe we’d like that to be in a way which encourages better decision-making, or just inspire people.
Arden Koehler: This is just to say that if I’m doing the right thing from the perspective of thinking in this way that doesn’t include virtues, I’m also going to be doing the right thing from the perspective of promoting virtue.
Owen Cotton-Barratt: I think that is normally correct. I think that, again, it’s looking at it from a different angle. I think that if we were really happy that we’d got to a right answer, then it should probably be the right answer from another angle as well, and then the other angle gives us a sanity check. If we’re like, “Whoa, this doesn’t make sense,” then actually there’s something to unpack there, and then we can go and check was it our original thing that didn’t make sense, or was it the new angle that didn’t make sense? I think that that’s often a way that we make progress in our understanding of things.
What should longtermists do differently today [01:39:08]
Arden Koehler: Okay, yeah. That made a lot of sense. Do you think that the community of people who are relatively convinced that longtermism is true, or at least that it might be true and want to do things to make the world a better place in the long-run future should be doing anything differently than what they are? Maybe another way of putting this is, if we were more mindful, and I count myself among these people, if we were more mindful of the ways in which our actions had these effects through influencing other people’s properties, would we be doing anything different?
Owen Cotton-Barratt: I think that there’s just a lot of texture to that question. I think that often actually people have instincts coming from their sense of common sense morality towards some of these things and like, “Oh yeah, I want to do this,” and they don’t have necessarily explicit models for it. I worry a little bit about that because then if people put their common sense morality sense of what’s good up against their explicit models, and they’re like, “Well, this explicit model seems to say this doesn’t matter,” then I think that could lead to them doing things which were mistakes.
Owen Cotton-Barratt: For instance, if they did something underhanded to get onto a committee, like this AI committee, and then… It feels like the classical naive utilitarian where it’s not tracking all of the indirect and hard-to-track effects of the actions, and maybe in this case it would be like if this came out, even if it wasn’t very explicit, other people there might get a sense of this person seems sketchy, and then they might associate concerns with outcomes from AI with sketchiness and think, “Well, I don’t want to go anywhere near that”. And that could be bad for improving decision-making.
Arden Koehler: It sounded like right there you were associating common sense morality with the thought that we should exemplify these virtues, and saying common sense morality versus the explicit models, but this doesn’t have to be common sense morality, right? It could be that–
Owen Cotton-Barratt: Right, it doesn’t have to be common sense morality. I think that I want to understand more actually about exactly which things will be captured by it. I think that some things which are baked into common sense morality are probably also important components of this, and I think that in some cases, the things are important for the long-term, but we don’t have good explicit stories about why they’re important.
Arden Koehler: I think we should also expect the set of virtues to be pretty different from the common sense morality virtues. A lot of people say at least that it’s a counterintuitive thing that the long-term future could be an extremely important thing, and if what we think of as the virtues commonsensically weren’t optimized for making the long-term future go much better, and there’s not that much reason to think that they were? There was supposed to be a question mark at that claim.
Owen Cotton-Barratt: Right. I don’t think that the common sense morality virtues were optimized for making the long-term future go well, but I do think that they were probably optimized for making something which is a bit smaller scale go well, and that smaller scale is the society that is functioning over more short/medium term timescales. I think that there will be a lot of points of agreement on what makes things go well in the short/medium term, and what makes things go well in the longer term. I also think there will be significant points of disagreement, but I guess I have the impression that I expect more concordance between these things than you do.
Arden Koehler: Well, I’m not sure that we actually disagree, but I want to push on basically how much we should be worried about our common sense intuitions guiding us in the wrong direction when thinking about what these virtues should be.
Owen Cotton-Barratt: Yeah. I think that I would like to take common sense intuitions as clues and be like, “This seems like it’s the output of a long process,” which is refined down to get things which are good for people to do for societal reasons, let’s try and understand a bit about that.
Owen Cotton-Barratt: I think that’s some pretty hard analysis to do, and to get to the point where we’re confident that we’ve wrapped it up. I think there’s some caution which says, “Look, maybe we haven’t understood why this property is important yet, but it might be, and much more than among the baseline rate of properties that we could pick.” And so, when it’s cheap let’s keep doing this. If it’s strongly in conflict with something else where we’re like, “This would be good to do”, then maybe we say, “We’re going to try gently putting it down and see if anything goes wrong.”
Arden Koehler: Okay. I want to return for a second to the framing of this as a question of what should everybody do if longtermism is true. It seemed like you were saying there aren’t that many, at least really obvious, departures from what people are doing right now if we take seriously this picture because we’re at least trying to exemplify these virtues in many cases. Even just trying to figure out what the best thing to do is, is exemplifying one of these virtues. Does that seem like a satisfying answer to the question of what should everybody do? People might think if they were dissatisfied before with, “If longtermism is true, I don’t know what to do with myself because I don’t know what will make the world a better place”, how does this help them?
Owen Cotton-Barratt: I think if people had a fairly clear sense of what to do before, then this often won’t change much. I think in some cases it might make them more conscious of some factors which are important, and they were like, “Oh yeah, I wasn’t tracking that.” It might adjust the way that they do the thing, but I am not thinking of this as a radical overhaul.
Owen Cotton-Barratt: If people have been thinking, “Gee, I don’t know what to do. I feel stuck and confused, and I want to do something helpful, but I don’t know what that is”, I think this gives a bit more of an answer because I think it’s an encouragement to just go about doing some normal engagement with society, and nudging towards trying to have sensible decision-making, trying to cultivate that in others, trying to keep on increasing awareness and understanding of what we can say about the future, such that if down the line they come up with a sense of, “Ah, now I see what to do. I should go and do this”, they’ll be in a position to do that.
Arden Koehler: Okay. I’m picturing in my head somebody who’s like, “I’m a teacher, I read about this longtermism stuff. I got worried that the things that I was doing weren’t really making a huge difference for the long-term future, and I was thinking, what should I do?” Instead, this answer suggests actually you can do a lot in the role that you are already in by exemplifying these virtues, cultivating them in others, because this is actually a mechanism by which the long-term future might be improved.
Owen Cotton-Barratt: Yeah. I think that’s right, and I would love for many people to start by doing things like that. I’m not saying that it won’t make sense for people sometimes to switch jobs or switch even the industry that they’re working in, but I think that it’s a robustly good thing to start doing. I also think that getting a bit of texture on how it goes can improve people’s learning, and can give people more understanding of things which might be helpful for them later down the line choosing things which would be particularly good for them to do.
Arden Koehler: What do you mean by texture? Sorry.
Owen Cotton-Barratt: I mean if something that a lot of us should be doing is helping others to make good decisions and have good decision-making processes, then that’s maybe a hard thing to do, and if you’ve practiced trying to do that, maybe you get more of a sense of how. How do you actually do it? How does it go well? That seems like a good thing for people to practice for lots of different routes and paths that people could end up on down the line. I definitely don’t want to claim that people just reorienting within whatever they were already doing is the best thing for them to be doing, but I think it’s often a good first step, and then after they’ve done it and they have more time to think, they will have more of a sense of, “Is this the path to continue down? Is there something more radical that they should be doing instead?”
Arden Koehler: Okay, so it seems like this is most useful if we have a lot of time to figure stuff out as a civilization. So we’re not going to get into a scenario anytime soon where we can’t reverse course if we think it’s a bad idea, we’re not going to go extinct anytime soon. And the reason it seems more important in those scenarios is because it takes a long time for these virtues to percolate throughout the world.
Owen Cotton-Barratt: I mean, you can try and be more targeted in where you’re promoting the virtues. And I do think that particularly on broad notions of virtue like understanding this particular technical aspect of how to build virus defense systems or something, most things that we’re going to want to do which are good in some sense like if I’m promoting some virtues. And I do think most people should probably not be totally indiscriminate about where they’re hoping to cultivate this. I think that it’s more useful to cultivate in people or institutions which are more likely to be closer to particularly important points.
Arden Koehler: Or maybe more likely to spread virtues themselves, like a celebrity as an example. It’d be really good to get them to exemplify these virtues, because then they have a lot of influence, or something.
Owen Cotton-Barratt: Yeah. I mean actually, maybe the school teacher also will have more impact on that than a receptionist or something. I do think that everyone in their life will have, at a micro level, lots of opportunities that come along to do things which are, in some sense, in service of promoting these things, and I think it’s good to develop the taste for what are good versions of that.
Arden Koehler: So if somebody was going to try to figure out… It seems like one of the big questions here is to try to figure out what are the virtues? What are the things that if everybody did them, or if everyone had these properties, things would be most likely to go well? Do you have any ideas about who might be best placed to do that kind of work? Are we talking psychologists or are we talking sociologists? Do you have anyone in mind?
Owen Cotton-Barratt: Yeah. I mean, I’m trying to do a preliminary version of this analysis with a colleague at the moment. And I think in the process of doing that, maybe we’ll have a slightly better answer to this question. I think that sociologists and economists feel like they’ll probably have more to add than psychologists, although I’m not super confident about that. I also think in some cases, one can try and do analysis of what are places or times when decision-making seems to be particularly good or bad? One could look at accidents, and I think there must be a rich study of accidents, although I haven’t got into reading the literature yet, and say, what features tended to lead to there being mistakes being made, which would… And so forth.
Arden Koehler: Okay. And the virtues would be the opposite of those features?
Owen Cotton-Barratt: Yeah, exactly.
Arden Koehler: Why economists?
Owen Cotton-Barratt: Often economists study incentives created by institutions and I think that that’s one of the pieces of the puzzle here. It’s people make decisions reacting to the incentives that the institutions impose upon them. And so I’m like, “Well, how do we want to design the institutions such that they create good incentives?” And then there’s a question of, “Well, how do we get to change what our institutions are?” And you can keep on going and I think those will blur a bit together.
Biggest concerns with this framework [01:51:28]
Arden Koehler: So what are your biggest concerns about this framework, this way of thinking about everyday longtermism?
Owen Cotton-Barratt: I mean, I’m not sure that it is complete or coherent at the moment. I guess I think that I’m getting at something important, but it’s not all the way that it will give people clear guidance for actions on what to do in every situation. And I think that there is maybe useful work to be done in thinking about, “Okay, well, how does one take the set of ideas which is trying to be a bit more action guiding” than saying, “Just work out what the best thing to do is, and then go do it”.
Arden Koehler: So one thing would be like doing the work to try to figure out what the virtues really are. Is there another kind of thing you have in mind too?
Owen Cotton-Barratt: Yeah, I think that there are also pretty interesting questions about when to try and be strategic about what can I achieve here? And like, “Okay, I would like to create this change in the world. Now I’ll come up with a plan for that.” Versus trying to be helpful and say, “Look, it just seems like it would be good if someone was providing this kind of information in this context, and then I don’t know how people will use it”, but trusting that other people or information processing systems, which might be bigger than people, it might be communities will make use of the things which one is feeding into that.
Owen Cotton-Barratt: And I think that both of those modes are often pretty good, but they do feel a little bit different. And I don’t know… In lots of particular situations, I feel like I have guesses about which mode is more appropriate to use, but I don’t think I have a good communicable version of, “Oh well, here’s how to decide when to be strategically aiming at a change,” versus when to be saying, “Okay, now I’ve found a system and I just want to be a good citizen in that system, whatever that means at the time”.
Arden Koehler: Just relating this back to the web of virtue idea, are you thinking that the first thing is not engaging in trying to create the web of virtue? And the second thing is? Or are these embodying different virtues? One is being strategic and ambitious maybe in trying to really pursue something that you think is going to be the most high impact in an explicit way. And then the second is the virtues of cooperation.
Owen Cotton-Barratt: I guess that they are, in fact, something like embodying different virtues, but sometimes I would think, “Well, the value of embodying virtues is because then you’ll help spread that virtue”. And I was actually thinking of something else here, even if you said, “Well, I definitely would like to…” You might want to spread the virtue of scope sensitivity and you could come up with strategic plans about, “Okay, well, who are the people who would most benefit from having more of this virtue? And then what do they read? How can we get them to read things which will promote this? Can we go and do testing of how different things we could write will be received and how much people will pick up on that?”
Owen Cotton-Barratt: And I think that is a pretty useful mode to be in where you say, “Okay, we’ve adopted, as a local goal, trying to spread this virtue. And now we want to be strategic about that.” But there’s another mode of saying, I guess even to have too much locally, too much fog of uncertainty about what is going to be the best way of doing things on this and saying, “Look, let’s just try and play a role”; I’m trying to think of an example now.
Arden Koehler: Was it just like, you’re working at an organization that you think does good and you are primarily just… You adopt the sort of local goal of the organization with the hope that, well, that organization is something that’s going to do good? I’m just going to try to make it work better.
Owen Cotton-Barratt: Right. So I think that there’s a kind of easy mode of just wanting to be cooperative when you’re working closely with people that you think have very similar goals to you. I think there’s a broader thing where you think, well, as part of society, maybe if we see someone doing something wrong and we write to them or write publicly explaining why this is wrong. We don’t know exactly how people will respond to that, but you might trust that it will be better to have pointed this out than not to have pointed this out. And the choice of action is not predicated on having a clear sense of exactly how this is going to play out.
Owen Cotton-Barratt: It’s predicated on, well, this feels like a helpful role to play, is to point out the mistake that’s going on and then trusting the system and maybe you don’t think, well, everybody in this is fully aligned with your goals, but that there’s enough that people are trying to do good things and will respond to reasonable arguments that if you do your bit of local cognitive work and then put it out there, it’ll lead to something better.
Arden Koehler: Okay. So we started this section with me asking about weaknesses or worries that you have about the framework. And then… I’m sorry, I’m just trying to get clear on how this is answering that question. So it’s something like, you don’t think the framework can give people guidance about when they should do that kind of thing versus the more straightforwardly strategic kind of thing?
Owen Cotton-Barratt: Yeah. Or at least I don’t feel like at the moment I know how to give people advice on that. It’s something I want to be paying attention to going forwards, maybe in a couple of years I’ll have more of a sense of what to say on that. I also think I can imagine somebody listening to this and thinking, “Oh great. I can just keep on doing the thing I’m doing and I’ll just help good decision making around me,” and not trying to do some of the hard work of peering into the future and seeing what they can see and, if they were going to be strategic, what the best versions of that would look like.
Owen Cotton-Barratt: And I do think it’s actually pretty useful and important for people to be practicing that motion and looking at, well, if we were trying to be strategic, maybe even a bit beyond where we guess is sensible, what would that look like? And trying to stretch that, and maybe you don’t always go and then act on this because you think, well, but we don’t know enough yet to know if that is the right action. But if you’re not in the habit of forming your best guesses about what the more locally strategic actions would be, it’s going to be hard for you ever to notice, “Oh, well, now is actually the right time to go and pursue one of these.”
Arden Koehler: I see. So it’s possible that people might hear this and think, “Okay, well, I’m going to just cultivate these virtues that are going to be useful in lots and lots of cases,” and not think about cultivating the ones that are going to allow them to actually realize that they’re in a high-stakes situation where they can actually have a predictable effect on the future, and go for it then?
Owen Cotton-Barratt: Yeah. That’s at least something which I can imagine happening in a failure mode. And I’m sure that there are a bunch of other things that I’m missing with this, and I’m looking forward to developing a richer sense of what seems to be good.
Arden Koehler: Another worry that’s related is that we might just not be as critical as we should be about what the virtues are. So we might think, “Well, I feel like I have some sense of what’s relatively good ways to be, so we’re just going to sort of go with those and maybe default to common sense morality where we shouldn’t”.
Owen Cotton-Barratt: Yeah. So theoretically, this doesn’t feel like it’s a concern which attaches particularly to anything like this framework. You might just think, “Well, whatever your approach for trying to work out what to do is, there’s a failure mode where you might not be critical enough and you might just go with whatever your first guess is”. And maybe you think it’s particularly likely to be a concern here because there’s this class of things where we’re like, “Oh, we’ve already got some reasonably trained intuitions and we think it’s something in that vicinity”. And so it’s particularly likely that we might want to just rest on our existing guesses.
Arden Koehler: Yeah. Why am I worried about this in this case in particular? I guess it’s something like, I think of the idea that maybe we should cultivate the web of virtue and I get a warm, fuzzy feeling of, “Oh, yay. I get to go and just try to be virtuous by the lights that I grew up with or at least it sort of feels closer to common sense morality, makes it feel less demanding”. I feel like there’s some sort of temptation quality to the way that I hear about this proposal. That makes me think that I might, if I were to try to implement it, end up being like, “Oh, I really want to hold on to the idea of being really friendly as a virtue and not think too hard about whether it really is”, or something like that.
Owen Cotton-Barratt: Yeah. That makes a lot of sense. And I do feel like trying to have serious conversations about which are important virtues and which aren’t so much, is part of the best version of this project. I do also think that people’s trained intuitions about, “Oh, what seems good?” is probably capturing something. And if they have a feeling of, “Oh, this seems good,” there’s part of them, which is tracking reasons why it feels good. And if they comprehended things and they were like, “No, it isn’t good,” then I think that they would have a relatively easy time letting go of the pull towards it.
Owen Cotton-Barratt: But there’s a question about, how do we get enough reflection on, “Okay, which are the important directions to be pulling in?” And I do think that that is a challenge. It feels like a surmountable challenge, but I agree that there’s a failure mode there. It’s kind of like a version of the failure mode I was just saying, of people not trying to look seriously at the future and where things are going and looking at being strategic there. The working out which virtues, which local directions are good to be pushing in, is just one of the elements of trying to be strategic.
Arden Koehler: Yeah, that seems right. I guess, just on the question of whether people would be able to let go of the feelings of something being good if they were to get convinced that it wasn’t. One reason I’m pessimistic about that is that there’s so much uncertainty here that, yeah, if I could get completely convinced that something wasn’t a virtue, then I think maybe I could let go of the feeling that it was, but there’s always going to be this uncertainty and in the uncertainty some… I don’t know, there’s some sense that it might default to like, “Well, I had this strong intuition and that’s going to play a bigger role because I can’t prove that it’s wrong.”
Owen Cotton-Barratt: I mean, I do think that there could be failure modes of the type that you’re talking about. Although I get worried sometimes about failure modes of the other type where somebody says, “Well, I don’t have an explicit argument that this is valuable, so I’m going to throw it out.”
Arden Koehler: Is that part of what this is supposed to help guard against? People think in terms of virtues and that will make them sort of temper their going with the conclusions of explicit arguments in a way that you think is good?
Owen Cotton-Barratt: That’s a pretty interesting question. I haven’t been explicitly thinking about it as one of the purposes of this, but I do think that it sounds like an advantage in a way where I wonder, “Oh yeah, maybe that was implicitly part of what was feeding into my framing here.” Because I think I have seen a pattern of, sometimes people have an implicit feeling of, “Oh, this would be good,” but their explicit arguments don’t contain anything which tracks that. And then later they discover another explicit argument which brings in like, “Oh, there actually was something that the original feeling was conveying and that mattered.” And I think that among people who think, “Well, we just want to follow the arguments to wherever they lead,” there can be a systematic failure mode there.
Research careers [02:04:04]
Arden Koehler: So let’s move onto talking about research careers. You are running the Future of Humanity Institute’s Research Scholars Program, which you talked about a little bit at the beginning of the interview. Do you want to just say more about what that program is and what it involves?
Owen Cotton-Barratt: Yeah. So roughly speaking, we hire people as researchers or hire people to, I think, research in a kind of broad sense here. I think there’s a narrow notion of people… What does research involve? It involves figuring out an answer and writing a paper about the answer to the question that you had at the start. And I think that that can be pretty valuable and important and that there is an important role there. But the broader thing that I want is to give people space to spend significant amounts of time thinking about, “Okay, what is important to do? How can I usefully contribute to that and practice some of that?” And I think that often that will mean interfacing with these epistemic systems across our society, I mean systems which are processing information and directing people, directing people’s attention, finding new ideas, propagating ideas–
Arden Koehler: For example, journals, scientific journals, newspapers?
Owen Cotton-Barratt: Yeah. I think you could think of academia as one big epistemic system. You could think of society as an even bigger one. You could think of the traditional print media as another thing. You can think of little communities on Facebook as… And all of these systems, they have bits of people thinking out there and then ideas bubble up and something happens with them. But the result of all of these systems stitched together in whatever way they are in the world, that leads to the world that we get tomorrow. And this process of unfolding the future. And…
Arden Koehler: The epistemic systems, plus presumably some other systems that mostly do stuff?
Owen Cotton-Barratt: Right. There are physical systems attached to this, but if you look at things that humanity is doing, they’re often being driven by choices to do things. And if you want to look, “Okay, well, what feeds into those choices?” There’s lots of bits of information and incentives. And so I think that at this broad level, the information processing is affecting a lot of where things are going.
Owen Cotton-Barratt: So we are hiring people as researchers and we’re trying to give them space to understand. So it’s a two year program, we hire people on two year contracts. And we have them come and sit with us here in the Future of Humanity Institute in Oxford. And one of the things which I think is useful about this is just, they have other people to talk to who are trying to think seriously about where, as a civilization, are we going and what are the important parts of that? And that will both be other people on the Research Scholars Program and I’ve been really happy and delighted by how the people that we’ve been able to get on this program–
Arden Koehler: And they’re in their first year, right now?
Owen Cotton-Barratt: So we are coming up to a year and a half into the program. So there are people in a first cohort who have been here, something like 16 months. And then there are people in the second cohort who have just been here a few months.
Arden Koehler: Okay. Got it. And do you take new scholars every year or every two years?
Owen Cotton-Barratt: We’ll take new scholars every year, or at least that’s the current model, and aiming to start in the autumn.
Arden Koehler: And that would just be through FHI’s website, they could find out how to apply?
Owen Cotton-Barratt: That’s right.
Arden Koehler: So it seems like you said there was a great variety of people in this program. Can you say anything about who’s the kind of person who could really benefit from this?
Owen Cotton-Barratt: Yeah. I think there are a couple of different profiles that I can point to. I think there are some people who feel like, “Oh, I’ve gone and I’ve…” Maybe they’ve been working for a few years and they think, “I have been learning about some system that feels important. I’ve been learning some useful things, but I’m not sure if this is where I should be. Or maybe I could be taking some of this knowledge or other things and finding different, more important domains to be trying to slot into.” And I think that this can be a useful space there for people looking to do that kind of pivot. And we’ve had some people join who have had several years of experience, such as someone who did a Physics PhD and postdoc, or somebody who had several years’ experience in policy, or somebody who was a medical doctor.
Owen Cotton-Barratt: And then another class of person that I think could benefit is somebody who just has a lot of ideas and they’re like, “Well, I want to go and do a PhD in something,” and lots of interesting ideas, but they’re not yet settled on which thread feels best to pursue. And if they went directly into a PhD program, they might have to narrow down too much before they were confident about whether that was the right narrowing. And then this would be an opportunity for them to take a bit more time and explore, what do they think about the different possibilities they’re considering? And perhaps to do that in collaboration with other people working on things that they care about and get more of a flavor of, is there a topic that they would like to go and spend several years of their life just working on that topic?
Arden Koehler: So what are some differences between the Research Scholars Program and a Master’s program where somebody might go to learn about whether they want to do a PhD or what they want to focus on in a PhD and where people use them to pivot?
Owen Cotton-Barratt: Yeah. So I think that Master’s programs typically have more taught content, which I think can be pretty great. We aren’t well set up to be providing a lot of that. I think that they are often more focused on a particular area. And so some of the significant differences about the Research Scholars Program are that it gives more freedom to participants. I think it has more contact with other people who are trying to think seriously about, “Okay, what is important? How should I be working that out?” And I think that that kind of community benefit and other people who have similar goals and can push back and say, “No, I think you’re making a mistake here,” can be pretty important.
Arden Koehler: And in particular, people are going to be interested in coming up with projects that are going to be beneficial for humanity?
Owen Cotton-Barratt: I hope so. Yeah.
Arden Koehler: Okay. But that’s the focus of–
Owen Cotton-Barratt: Empirically, yes.
Arden Koehler: So if you’re not interested in that, then you probably won’t want to be part of this program?
Owen Cotton-Barratt: Right. Right. Right. And then another thing which I think is not typical is we have quite a lot of bringing attention to the meta-level of things and saying, “Okay, well, why would we want to do things like this and permission to change it?” And one of the things that I’ve been really happy with is inviting the participants on the program to have some ownership over the program. And if they are like, “Oh wait, this doesn’t make sense to me, we should do it like this instead”, I would be like, “Great. Okay, well, let’s have a conversation about this.” And sometimes I think I’ve persuaded people that, “Oh no. Actually the thing we were thinking of already was sensible.” And there’s been a bunch of other cases where people have complained and said, “We really would like more of a structure to do this,” and ended up thinking, yeah, they were right and they were tracking something important about their experience and this let us work out that we could change it in that way.
Owen Cotton-Barratt: So I think that that has been useful for us in helping to develop the structure of the program. I mean, I’m sure that there’s still a lot of improvements that we could make there. But I also think that there’s something that’s systematically useful about habits of asking about the why’s for systems and trying to look locally. “Okay. What would the best way of setting this up be?”
Arden Koehler: So people are going to be getting practice with research skills and looking for topics, but also maybe with thinking about research as a process, and about what good research communities look like and how they function?
Owen Cotton-Barratt: Right. Right. Exactly.
Arden Koehler: Okay. So I guess just to be as practical as possible for one moment, one other salient difference between the Research Scholars Program and the Master’s is that you pay the research scholars?
Owen Cotton-Barratt: That’s right.
Arden Koehler: Instead of them paying you. So just to clear that up for the audience in case it wasn’t clear.
Being a mathematician [02:13:33]
Arden Koehler: Okay. You’re a mathematician. I have some questions from the audience specifically about being a mathematician. Do you have a general view of the utility of mathematical modeling in thinking about existential risk or thinking about other research areas that you’ve done? I guess what an answer to this question might look like is, “I think it’s underapplied. I think there’s so much you can do here that people aren’t doing”. Or, “I think it’s often misleading because of X, Y, and Z”.
Owen Cotton-Barratt: Yeah. I think that mathematical modeling is a pretty interesting tool. I think that when we’re trying to understand things about the long-term future, it is hard to know how to approach it at all. And so I wouldn’t want to throw out anything that looks kind of useful from our tool kit. And I think that mathematical modeling feels like one of the things in the tool kit. I’m not so excited about it that I think, “Oh yeah, everybody should just use this all of the time.” But I do think it is worthwhile looking for opportunities where it could provide something helpful.
Owen Cotton-Barratt: I also have a sense of good ways and less good ways to apply it. I think that one of the things which is pretty useful about mathematical modeling is forcing people to make assumptions explicit. And I think that then that can help have more productive conversations about the assumptions that you were having and surface disagreements that are going on. I also think that one failure mode of mathematical modeling is thinking, well, let’s just try and capture everything and let us put in a term for each of the different effects that we can think about because, well, if there’s something we can think about and we don’t have a term for, our model will definitely be wrong. And I see the temptation to go down that route, but I think it is a mistake.
Arden Koehler: Because the model won’t necessarily be wrong, it’ll just be more simplified, it’ll be missing something, but can still be useful?
Owen Cotton-Barratt: It’s because the model will be wrong but can still be useful. Even if you add the things in the model it’s still going to be wrong.
Arden Koehler: Where wrong means something like, it is not a true description of the world.
Owen Cotton-Barratt: Yeah, exactly, exactly. I can’t remember who said, “All models are wrong, some models are useful.”
Arden Koehler: I have definitely heard that, but I don’t know who said it.
Owen Cotton-Barratt: But I think it captures something pretty important. And if you build a model which has 150 variables, you’re not going to spend that much attention on each of the different variables. And so probably you’ll have guesses for some of them, and the guesses will be doing something stupid, and then your model actually isn’t necessarily tracking anything helpful. One of the things which can be useful for modeling is doing sensitivity analysis and saying, “Well, how sensitive is it to these different variables?” And understanding what’s coming out of that. If you have too many variables, this is going to become impractical to do.
Arden Koehler: Well, also being, maybe more and more important, if you have more variables, it’s more likely that your model will be too sensitive to what happens to one of them.
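The sensitivity analysis Owen mentions can be sketched in a few lines. What follows is a minimal illustration rather than anything used in the research discussed here: the model, its three input names, and all the numbers are hypothetical, chosen only to show the mechanics of varying one input at a time.

```python
def toy_model(params):
    """Hypothetical toy model: expected net value of a project."""
    return params["p_success"] * params["value_if_success"] - params["cost"]


def one_at_a_time_sensitivity(model, baseline, relative_change=0.1):
    """Vary each input by +/- relative_change, holding the others at baseline.

    Returns the baseline output and, for each input, the swing in the output
    between that input's high and low settings.
    """
    base_output = model(baseline)
    swings = {}
    for name, value in baseline.items():
        low = {**baseline, name: value * (1 - relative_change)}
        high = {**baseline, name: value * (1 + relative_change)}
        swings[name] = model(high) - model(low)
    return base_output, swings


# Illustrative numbers only; none of these come from the conversation.
baseline = {"p_success": 0.2, "value_if_success": 100.0, "cost": 5.0}
base_output, swings = one_at_a_time_sensitivity(toy_model, baseline)

# Rank inputs by how much the output moves when each one is varied.
ranked = sorted(swings, key=lambda name: abs(swings[name]), reverse=True)
for name in ranked:
    print(f"{name}: swing of {swings[name]:+.2f} around a baseline of {base_output:.2f}")
```

With three inputs this takes six model runs and the ranking is easy to read. With 150 inputs the same loop needs 300 runs and still says nothing about interactions between inputs, which is one way of seeing why the analysis becomes impractical as models grow.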
Owen Cotton-Barratt: I think another move which can be pretty helpful in model building is trying to have the simplest model, which captures some important dynamic. And so I would think often, well, if I’m trying to model how this will go, maybe I would think about taking some extreme cases that are a bit easy to reason about to provide boundary conditions, which constrain how the model should be set up.
Arden Koehler: It has to get the same answers as our intuitions in those cases.
Owen Cotton-Barratt: Yeah. I wouldn’t want to add in special cases, though: adding special cases to the model means you know that it isn’t trying to generalize across the whole thing. I think trying to construct the model such that it doesn’t need that kind of special case behavior, but just gives you the outputs that you’re seeing as a natural consequence of some simple functions underlying it, is a good move to try and make. And then subject to that, I would try and keep the model as simple as possible and say, “What is the simplest model that can give the kinds of dynamics that we’re seeing?” And maybe you’d work out that, “Oh, it seems unimportant to track how much money is being spent on experiments.”
Owen Cotton-Barratt: And then you’re like, “Great. Well, let’s see what comes out of that”. And the more that you’re finding interesting dynamics with very simple models, the more I think those might be robust and tell us something useful if we try and project them out into different domains. There is also actually some research which shows that simple models are more likely to give accurate answers than complex models.
Arden Koehler: This is just empirical research.
Owen Cotton-Barratt: Empirical research. In my head, this is now, this is folklore. I can’t provide the references.
Arden Koehler: Okay. Maybe we’ll look them up.
Owen Cotton-Barratt: I bet Rob knows the references, actually.
Arden Koehler: We’ll put them in the show notes, if they exist.
Owen Cotton-Barratt: Yeah. If they’re findable.
Arden Koehler: Yeah. And the explanation for that is just that you’re not going to be handling all of these other variables that well. Theoretically, the more complex model would be the more accurate model and give you the right answer more of the time. It’s just that we are incapable of making it.
Owen Cotton-Barratt: That is my guess as to what the explanation is. I think this research was just empirical research on which ends up being better. I think the research doesn’t necessarily tell you why that is, but that’s why I think it is.
Arden Koehler: It seems like one weakness of explicit modeling is, I guess, I feel like I’ve had some experiences where there’s a model and then it gives one answer and you’re like, “Well, my intuition says it’s the other thing”. And then you just sort of go with your intuition because you don’t trust the model that much. And then my question is, what was the point of having the explicit model in the first place?
Owen Cotton-Barratt: Yeah. I think that sometimes building an explicit model can be helpful for helping your intuition notice an effect that it wasn’t noticing. I think that explicit models are most helpful actually when they’re also simple enough that you can understand them. And so that you can understand the dynamics and get the outputs, the thing that the model was telling you, to interface properly with the rest of your intuitions. That’s maybe actually another disadvantage of complex models: if it spits out an answer at the end, you just have to shrug your shoulders and say, “Look, either we’re just going to ignore this and go with the intuition,” or, “We’re going to ignore the intuition and go with this.” There’s no good way to integrate them and to work out which side of things is wrong. If you have a very simple model and it is conflicting with your intuition, then you have the ability to go and tinker with it and say, “Huh, that’s surprising. What is my intuition tracking that this model isn’t? Or vice versa?”
Owen Cotton-Barratt: And so you can go and try and step through the different parts of the model and say, “Do I intuitively agree with that? Do I intuitively agree with that? Oh, and there’s the difference.” And then you can say, “Okay, what’s going on there?” Maybe you can try and look something up or just spend longer thinking about it. And then you might realize, “Oh no, I was just missing this important effect and this is why my intuition thinks that wouldn’t happen”. Or you might think, “Oh yeah, now I get it”. And then you update your intuition and then you are like, “Okay, I learned something useful from this model”.
Arden Koehler: Yeah. I guess it can help you decompose your intuition and look at the different parts and actually interrogate them separately, and then be like, “Okay now, which one needs to give?”
Owen Cotton-Barratt: I’ve definitely had the experience of building a model and ending up with a qualitative conclusion, which afterwards I don’t need the model for. I can just remember it and I’m like, “Oh yeah, kind of it feels like I could have just come up with this without the model”. But in fact–
Arden Koehler: Do you have an example in mind?
Owen Cotton-Barratt: Yeah, I do. I wrote an FHI technical report a few years ago on “Allocating Risk Mitigation Across Time”, I think is the title of the report. And the thing I was doing was modeling, “Okay, well, if you have a certain amount of people trying to work on different existential risks or existential risks that might occur at different points in time, then when do you want people to be doing different kinds of work?” And it was making some assumptions about field building and about diminishing returns to different styles of work.
Owen Cotton-Barratt: And the qualitative conclusion was that there is extra reason for people to focus on risks which might occur unexpectedly soon, because the people around over the few years until they happen are the only ones who can work on them. Whereas if you are looking at things which might occur much later, there’s a much longer period of time, and particularly if the community of people paying attention to them is growing, there’ll be more people in the future who are able to focus on doing things which will be useful for reducing those. And so, as a kind of natural comparative advantage across these people at different times, it makes sense for people now to be focusing on risks which might eventuate early, even if they are fairly low probability.
Arden Koehler: Would a way of putting this conclusion in terms of the scale, neglectedness, tractability framework be that risks that might materialize earlier are going to be more neglected because even if there are more people working on them right now, there are going to be fewer people over time because there’s less time for people to work on them?
Owen Cotton-Barratt: That’s right. That’s right. And there’s a bunch of different ways of expressing this conclusion. But empirically, I came up with the conclusion by building a model and tinkering around with it. It wasn’t that complex a model, but it did lead me to something where now I don’t need to show the model to communicate the conclusion. I don’t need to remember it and go, “Oh yeah, there’s this effect”. But I didn’t think of it before I built the model.
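The qualitative effect can be reproduced in a deliberately tiny toy model. This is not the model from the FHI report; it is a sketch under assumptions made up for illustration: a community that grows by a fixed factor each period, work on a risk that only counts if done before the risk occurs, logarithmic diminishing returns to total work on each risk, and invented probabilities and deadlines.

```python
def community_size(period, initial_size=10.0, growth_rate=1.5):
    """Assumed number of people available in a given period (grows over time)."""
    return initial_size * growth_rate ** period


def cumulative_labor(deadline, share_in_period):
    """Total labor a risk receives before its deadline under a default allocation."""
    return sum(share_in_period(t) * community_size(t) for t in range(deadline))


def marginal_value(labor_already_planned):
    """Value of one extra unit of labor, assuming risk reduction ~ log(1 + labor)."""
    return 1.0 / (1.0 + labor_already_planned)


# Hypothetical deadlines: the early risk could occur at period 3, the late one at 10.
EARLY_DEADLINE = 3
LATE_DEADLINE = 10


def early_share(period):
    """Default allocation: half of the community works on the early risk."""
    return 0.5


def late_share(period):
    """Half the community until the early deadline passes, everyone afterwards."""
    return 0.5 if period < EARLY_DEADLINE else 1.0


early_labor = cumulative_labor(EARLY_DEADLINE, early_share)
late_labor = cumulative_labor(LATE_DEADLINE, late_share)

# Hypothetical probabilities: the early risk is assumed much less likely.
P_EARLY, P_LATE = 0.1, 0.5
weighted_early = P_EARLY * marginal_value(early_labor)
weighted_late = P_LATE * marginal_value(late_labor)

print(f"labor before early risk: {early_labor:.2f}, before late risk: {late_labor:.2f}")
print(f"probability-weighted marginal value, early: {weighted_early:.5f}")
print(f"probability-weighted marginal value, late:  {weighted_late:.5f}")
```

Under these assumptions an extra unit of work today is worth more on the early risk even though it is five times less likely, because the growing future community will pile far more total labor onto the late risk. Changing the growth rate, deadlines, or returns function can weaken or reverse the result, which is the kind of tinkering a simple model allows.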
Arden Koehler: We definitely have this intuition already that urgency is a reason to work on something. It’s an extra reason to work on something. Or the fact that something will be a risk sooner is an extra reason to work on something. This model helps show why that intuition is not misguided.
Owen Cotton-Barratt: Yeah. There were other reasons why we might also have an intuition towards that. One reason that my colleague Toby Ord has pointed out in an essay called “The Timing of Labor Aimed at Reducing Existential Risk”, is that when you get closer to something that might cause a risk, you probably have a better sense of what’s going on. And if you were many years in advance, then you’re more working in the fog.
Arden Koehler: Right. I think he talks about this in The Precipice a little bit in one of those boxes.
Owen Cotton-Barratt: Oh that sounds right.
Arden Koehler: Yeah. I guess the thing on the other side is just that when you’re farther away, you can invest more and you can get higher leverage from your actions.
Advice for mathematically minded people [02:24:30]
Arden Koehler: Okay. Before we close, do you have any advice for mathematicians or other mathematically minded people who want to figure out how to apply their skills to important problems?
Owen Cotton-Barratt: Yeah. I think there are some pretty interesting things here. I think that learning a bunch of mathematics can be useful for giving better intuitions about some kinds of systems which might be important. This can be for the types of things we want to build mathematical models of, which comes up a bunch in bits of economics. It can also be for understanding the details of technologies which are being developed and what’s going on there. I do think that mathematics can provide a pretty important toolkit. I also think actually mathematical training can provide some other useful habits of thought. And this is something which I feel like I got out of doing mathematics degrees.
Owen Cotton-Barratt: I think that in a mathematics degree, you get trained to be very careful about your arguments and notice, “Okay, which bits are solid and which bits do we need to go and dig in on something there?” And if we’re trying to get new knowledge in a domain where it’s easy to go out and test things, then that isn’t that important because at the end of the day, we’re like, “Well, okay, let’s come back to testing this”. But if we’re wanting to try and understand things where we don’t have good access to empirics, and I think understanding the effects of actions on the long-term future is a pretty important case like this–
Arden Koehler: Yeah. Seems like hard to get feedback on what the effects are.
Owen Cotton-Barratt: We can wait, but then it’s too late. Then we want to be careful about the reasoning and know how solid each of the different steps is and what can probably be rested on it. And I think that the habits of trying to be mentally careful about this and labeling that are quite helpful when getting into this kind of domain. That’s something which I think can matter. And I don’t think that a mathematics degree is the only way to get that kind of approach of thinking about things. In fact, I think analytic philosophy often tries to train a somewhat similar skill. In some ways it seems better and in some ways it seems worse as a training domain. Compared to mathematics, it’s much harder to verify when you’ve got things right or when you were making mistakes, which means that maybe mathematics is a particularly good place to really get going. And on the other hand, analytic philosophy does well at engaging with a broader range of types of questions that you might want to be applying careful reasoning to.
Owen Cotton-Barratt: And then I think there are other bits of mathematical style thinking, or thinking which is trained in doing mathematics, which I’ve found to be pretty useful. In fact, which I feel like I use in a bunch of the work that I’m doing, even when I’m not applying mathematics. One of these approaches is, when being presented with a new definition or a claim, trying to test that out mentally and say, “Well, what would that mean for this case? Or this case?” And trying to push it to the types of cases which are easy to reason about, as a way of more properly integrating a concept into my understanding of the world.
Arden Koehler: Why does that come from your mathematics training? What’s the connection there?
Owen Cotton-Barratt: Oh, I think that this is something which is really useful. In mathematics, you get a lot of definitions thrown at you and you get a lot of theorems thrown at you and you can try and just deal with the symbolic representation of these things but I think that that isn’t the most productive way of understanding things and actually going and doing stuff with them. I think that you go better if you more internalize the definition or the theorem and you really understand what it’s doing and why. For that process of trying to really understand it I think that testing out what it’s supposed to do in different situations can be really useful.
Owen Cotton-Barratt: This is also why, if you meet a definition of things, often it’s given with examples, but I think that that is a habit which has led to, in fact some of the things we’ve talked about in the conversation here, both on the how to build good, simple mathematical models but also some of the earlier topics we talked about. I was saying, “Well, I was just saying there’s something important here. Can we look at it from a slightly more generalized angle and what would that tell us about it?”
Arden Koehler: And that might be one case of putting this thing, this concept, trying to see what happens if it’s in this particular context. What do we think about people trying to build these good epistemic habits in a world where they don’t really know what they’re doing or they don’t really know what has good long-term effects.
Owen Cotton-Barratt: Right. And maybe this is using a different kind of skill, which again has overlap with mathematics, but there’s something about just trying to say, “Can we look at everything kind of clearly and find the right perspective to have a better understanding of things as a whole?” And I think that that is a pretty central skill in mathematics. I think a lot of mathematics involves that kind of stepping back and saying, “Okay, really what’s going on here?” And then many of what may seem kind of hard technical results end up looking trivial when you find the right perspective, and the hard work is how do you find the perspectives which make things easy to understand? And I think that I end up using that kind of skillset in the work I’m doing now and I also think I was doing it with mathematics.
Owen Cotton-Barratt: I’m actually kind of unsure how much mathematics trained that and how much I was drawn to mathematics in the first place because that was kind of compelling to me. But if people are mathematically minded and they’re asking, “What can I do?” I think it’s maybe useful for them to think, “Okay, why am I mathematically minded? Which bit of this is appealing to me?” And maybe I just like playing with technical things and then maybe the useful thing for them to do is something where they’re really interfacing with the technical system and they can use their mathematical knowledge to have better senses of what are the useful steps there. Or maybe it’s something broader and they can take some of these virtues of mathematical thinking rather than just the mathematics itself into other domains.
Arden Koehler: I like this point. Being mathematically minded is maybe not just one thing. There are multiple things that that involves. And if you’re more attracted to the playing with technical systems aspect of it, then that might mean you should, well, you can probably find really valuable projects that involve being immersed in technical systems. But if you find the aspect where it’s you’re going to try to step back and look at something from a totally different angle and see what that tells you about it, then you might work on different sorts of problems. What’s an example of a problem like that?
Owen Cotton-Barratt: I think it is particularly useful for questions where we still have some confusion about what the question is, or where we just think this is hard to think about and we have noticed that there’s something important there, but the correct next stage is to seek better framings and frameworks for understanding it. I think that this may be true actually for quite a lot of things around the future of AI. I think that the arguments that, “Oh, there’s something pretty important here and this could be a big deal for the trajectory of the future,” are pretty robust, and this justifies a lot of attention going on the area.
Owen Cotton-Barratt: But I think that the arguments about, “Okay, well, what exactly are we worried about?” And our answers to that feel kind of flimsy to me. And I would love more careful thinking about what can we say? How can we understand what are the important things to be aiming for here? As well as work which is saying, “Okay well, our best guess as to what’s important is this. What can we do about that?” I actually think that at the moment, we may well not have found the correct questions, but the work in trying to answer the questions we have found can shed light on ways in which the questions we have aren’t the right ones, and so that may usefully feed into getting to where we need to be.
Arden Koehler: Yeah. It feels like, just on this particular example, there’s a lot of desire right now for people to think in this more careful, looking-at-things-from-multiple-different-angles sort of way about risks from advanced artificial intelligence. And I know we’re going to have an episode with Ben Garfinkel out and some of his thoughts on this, but yeah, so that would be an example of an area where things are kind of hard to think about, so we might be able to do this kind of reframing in a way that will make things more clear. Okay, cool. Final question, do you have any advice on how to nerd snipe your friends into working on really important problems? Where that means, make it seem super interesting, but it turns out also it’s really important. Or are you against nerd sniping?
Owen Cotton-Barratt: I’m not against nerd sniping. I think this is a pretty good kind of question. I think maybe there’s some approach to nerd sniping where it’s, “Oh, we’re going to trick you into doing useful work” or something, which I don’t feel excited about.
Arden Koehler: Not very virtuous.
Owen Cotton-Barratt: There’s a kind of better interpretation of it where it’s looking for ways to give people a way in to thinking about something. And I think that if you are starting with a sense of, “Oh, I have a feeling of there’s some big, important things here”, it can be hard to communicate to others exactly why you think that and it can be hard to properly pull their attention. And they might say, “Look, well even if I did think it was important, so what? How could I do anything?” Whereas I think if you’re able to give a more precise question where they can get a sense of, “Oh, this is interesting”. Like enough context that they can see this could matter because so and so, and a precise enough question that they can start tinkering with it and thinking about it for themselves without needing to understand everything in the broader context, then that can be helpful for them to see, “Oh yeah, there’s something here that I can get a bit of traction with”.
Owen Cotton-Barratt: And so I think that that’s actually one of the reasons that seeking crisp questions can be pretty helpful because if we’re able to express crisp questions, we can go and talk to people and say, “Hey, what do you think about this?” And if there’s a question there, then people can say, “Yeah, there’s a puzzle here. There’s something which is kind of tangible”. They can see there’s definitely a problem. If you have something where it’s just a vague, “Oh, what do you think about AI? Is it going to be important?” I think people sometimes are like–
Arden Koehler: They don’t even know how to go about answering it.
Owen Cotton-Barratt: Yeah. They both don’t necessarily know how to go about answering it, but also they don’t necessarily know that there’s even something substantive there. It can sound like somebody who’s just like, “Oh, what about”, and they just mention a bunch of topics.
Arden Koehler: Yeah, whereas if you give somebody a crisp question, it might feel like a puzzle, grab their attention and also make them feel like, “Well, look, if I came up with an answer to this question, that would be a real piece of knowledge” or something like that. And that’s very enticing. Yeah. That makes sense. The answer to this question is something like “Do the kind of clarifying work that is actually just super useful in research”.
Owen Cotton-Barratt: Or find versions that other people have done and find the things which are the somewhat condensed crisp questions. And you can have a motivation which is attached to it, but try and give people, “Okay, what do you think about this?” I don’t know, maybe organize a session with an hour or two on the whiteboard and say, “Let’s chat about this question and see if we can get anywhere.”
Arden Koehler: All right. That’s a good answer. Okay. Well, thank you for talking with me and taking the time to come again on the 80,000 Hours podcast, Owen.
Owen Cotton-Barratt: Delighted to. Thanks for talking for so long.
Rob’s outro [02:37:32]
Robert Wiblin: If you made it to the end of this episode, it might be worth taking the time to investigate the Research Scholars Programme – you can find the link in the show notes.
And if you enjoyed Arden’s hosting skills, you should go back and listen to episode number 78 – Danny Hernandez on forecasting and measuring some of the most important drivers of AI progress, as well as 80k team chats numbers 2 through 4 with Benjamin Todd – on varieties of longtermism and things 80,000 Hours might be getting wrong; the core of effective altruism and how to argue for it; and what the effective altruism community most needs.
The 80,000 Hours Podcast is produced by Keiran Harris.
Sound editing is by Ben Cordell.
The full transcript is available on our site, compiled by Zakee Ulhaq.
Thanks for joining, talk again soon!