Ajeya Cotra on worldview diversification and how big the future could be

By Robert Wiblin and Keiran Harris · Published January 19th, 2021

You wake up in a mysterious box, and hear the booming voice of God:

“I just flipped a coin. If it came up heads, I made ten boxes, labeled 1 through 10 — each of which has a human in it.

If it came up tails, I made ten billion boxes, labeled 1 through 10 billion — also with one human in each box.

To get into heaven, you have to answer this correctly: Which way did the coin land?”

You think briefly, and decide you should bet your eternal soul on tails. The fact that you woke up at all seems like pretty good evidence that you’re in the big world — if the coin landed tails, way more people should be having an experience just like yours.

But then you get up, walk outside, and look at the number on your box.

‘3’. Huh. Now you don’t know what to believe.

If God made 10 billion boxes, surely it’s much more likely that you would have seen a number like 7,346,678,928?
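One way to make the puzzle concrete is to run the two updates as a small Bayes calculation. The sketch below is an illustration under stated assumptions, not something taken from the interview: it assumes a fair coin, assumes your box label is equally likely to be any label that exists in your world, and models "waking up at all is evidence for the big world" as weighting each outcome by its number of boxes. The name `posterior_heads` and its `weight_by_observers` flag are made up for this example.

```python
from fractions import Fraction

HEADS_BOXES = 10               # boxes God makes if the coin lands heads
TAILS_BOXES = 10_000_000_000   # boxes God makes if the coin lands tails
BOX_SEEN = 3                   # the label you find on your own box


def posterior_heads(weight_by_observers: bool) -> Fraction:
    """Probability the coin landed heads after seeing BOX_SEEN (illustrative model)."""
    # Fair coin: both outcomes start at 1/2.
    weight_heads = Fraction(1, 2)
    weight_tails = Fraction(1, 2)

    if weight_by_observers:
        # Assumption: existing at all favours worlds in proportion to
        # how many people they contain (the "bet on tails" step).
        weight_heads *= HEADS_BOXES
        weight_tails *= TAILS_BOXES

    # Assumption: your label is uniformly random among the labels in your world.
    like_heads = Fraction(1, HEADS_BOXES) if BOX_SEEN <= HEADS_BOXES else Fraction(0)
    like_tails = Fraction(1, TAILS_BOXES) if BOX_SEEN <= TAILS_BOXES else Fraction(0)

    joint_heads = weight_heads * like_heads
    joint_tails = weight_tails * like_tails
    return joint_heads / (joint_heads + joint_tails)


print(posterior_heads(weight_by_observers=True))   # 1/2
print(posterior_heads(weight_by_observers=False))  # 1000000000/1000000001
```

Under these assumptions the two updates cancel exactly when worlds are weighted by their number of observers, leaving you back at 50/50. Without that weighting, seeing '3' makes heads overwhelmingly likely. Which treatment is correct is exactly what is contested in anthropic reasoning.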

In today’s interview, Ajeya Cotra — a senior research analyst at Open Philanthropy — explains why this thought experiment from the niche of philosophy known as ‘anthropic reasoning’ could be relevant for figuring out where we should direct our charitable giving.

Some thinkers both inside and outside Open Philanthropy believe that philanthropic giving should be guided by ‘longtermism’ — the idea that we can do the most good if we focus primarily on the impact our actions will have on the long-term future.

Ajeya thinks that for that notion to make sense, there needs to be a good chance we can settle other planets and solar systems and build a society that’s both very large relative to what’s possible on Earth and, by virtue of being so spread out, able to protect itself from extinction for a very long time.

But imagine that humanity has two possible futures ahead of it: Either we’re going to have a huge future like that, in which trillions of people ultimately exist, or we’re going to wipe ourselves out quite soon, thereby ensuring that only around 100 billion people ever get to live.

If there are eventually going to be 1,000 trillion humans, what should we think of the fact that we seemingly find ourselves so early in history? Being among the first 100 billion humans, as we are, is equivalent to walking outside and seeing a three on your box. Suspicious! If the future will have many trillions of people, the odds of us appearing so strangely early are very low indeed.
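The arithmetic behind "very low indeed" can be sketched in a few lines. This is a hedged illustration rather than a calculation from the interview: it assumes you treat yourself as a random draw from all humans who will ever live, it does not weight futures by how many people they contain, and the prior probabilities passed in are arbitrary example values. The function name `posterior_big_future` is hypothetical.

```python
from fractions import Fraction

SMALL_TOTAL = 100 * 10**9       # about 100 billion humans ever, if we die out soon
BIG_TOTAL = 1_000 * 10**12      # 1,000 trillion humans, if the future is huge
EARLY_CUTOFF = 100 * 10**9      # we are among roughly the first 100 billion


def posterior_big_future(prior_big: Fraction) -> Fraction:
    """Probability of the huge future after noticing how early we are."""
    # Chance of landing in the first 100 billion under each future,
    # assuming your birth rank is uniformly random among all humans.
    like_big = Fraction(EARLY_CUTOFF, BIG_TOTAL)      # 1 in 10,000
    like_small = Fraction(EARLY_CUTOFF, SMALL_TOTAL)  # certain: everyone is early

    joint_big = prior_big * like_big
    joint_small = (1 - prior_big) * like_small
    return joint_big / (joint_big + joint_small)


print(posterior_big_future(Fraction(1, 2)))    # 1/10001
print(posterior_big_future(Fraction(9, 10)))   # 9/10009
```

On these assumptions, even someone who starts out 90% confident in the huge future ends up at less than 0.1%. Changing the sampling assumption, for example by weighting futures by their population, changes the result, which is one reason the argument has critics.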

If we accept the analogy, maybe we can be confident that humanity is at a high risk of extinction based on this so-called ‘doomsday argument’ alone.

If that’s true, maybe we should put more of our resources into avoiding apparent extinction threats like nuclear war and pandemics. But on the other hand, maybe the argument shows we’re incredibly unlikely to achieve a long and stable future no matter what we do, and we should forget the long term and just focus on the here and now instead.

There are many critics of this theoretical ‘doomsday argument’, and it may be the case that it logically doesn’t work. This is why Ajeya spent time investigating it, with the goal of ultimately making better philanthropic grants.

In this conversation, Ajeya and Rob discuss both the doomsday argument and the challenge Open Phil faces striking a balance between taking big ideas seriously, and not going all in on philosophical arguments that may turn out to be barking up the wrong tree entirely.

They also discuss:

Which worldviews Open Phil finds most plausible, and how it balances them
Which worldviews Ajeya doesn’t embrace but almost does
How hard it is to get to other solar systems
The famous ‘simulation argument’
When transformative AI might actually arrive
The biggest challenges involved in working on big research reports
What it’s like working at Open Phil
And much more

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type 80,000 Hours into your podcasting app. Or read the transcript below.

Producer: Keiran Harris
Audio mastering: Ben Cordell
Transcriptions: Sofia Davis-Fogel

Highlights

Worldview diversification

So, Open Phil currently splits its giving across three big buckets, or worldviews… The longtermism versus near-termism split is where the longtermism camp is trying to lean into the implication of total utilitarianism — that because it’s good to cause there to be more people living lives worth living than there were before, you should be focused on existential risk reduction, to preserve this large long-term future where most of the moral value is, on the total utilitarian view.
And then the near-termist perspective, I wouldn’t say it’s a perspective that doesn’t care about the future, or has some sort of hard-line commitment to “It’s only the people that exist today that matter, and we count it as zero if we do anything that helps the future”. I think it’s a little bit more like this perspective is sceptical of going down that rabbit hole that gets you to “The only thing that matters is existential risk reduction”, and it’s sort of regressing back to normality a little bit.
This might come from scepticism of a total view population ethics…it might come from scepticism about the tractability of trying to affect existential risk, or about trying to do things that don’t have great feedback loops. So there’s this tangle of considerations that make you want to go “Okay, let me take a step back and let me try and be quantitative, and rigorous, and broadly utilitarian about pursuing a broader set of ends that are more recognized as charity or doing good for others, and that isn’t super strongly privileging this one philosophical argument”. That’s how I put that split, the longtermism versus near-termism split.
And then within the near-termism camp, there’s a very analogous question of, are we inclusive of animals or not? Where the animal-inclusive view — similar to the longtermism view — says, okay, there are many more animals in this world than there are humans, and many of them are facing conditions much worse than the conditions faced by any human, and we could potentially help them very cheaply. So even if you don’t think it’s very likely that animals are morally valuable roughly comparable to humans, if you think that they’re 1% as valuable, or 10% as valuable, or even 0.001% as valuable, then the vast majority of your efforts on this near-termism worldview should be focused on helping animals.
And so this is another instance of this dynamic where the animal-inclusive worldview does care about humans but sort of ends up focusing all of its energy on this larger population of beneficiaries. And so it’s this same thing where there’s this claim that there’s more at stake in the animal-inclusive worldview than in the human-centric worldview, and then there’s a further claim that there’s more at stake in the longtermist worldview versus the near-termist worldview.
And so, essentially, there’s two reasonable-seeming things to do. One is to allocate according to a credence between these three worldviews, and potentially other worldviews. And then the other is try to find some way to treat the things that each of these worldviews care about as comparable, and then multiply through to find the expected amount of moral stuff at stake in each of these worldviews — and then allocate all your money to the worldview that has the most stuff at stake. Which in this case, most reasonable ways of doing this would say that would be the longtermist worldview.
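The two options described above can be compared with a small numerical sketch. Everything numeric here is a made-up placeholder chosen only to show the mechanics: the credences, the "stakes" figures and the budget are not Open Phil's numbers, and the function names `allocate_by_credence` and `allocate_by_expected_stakes` are hypothetical.

```python
from fractions import Fraction


def allocate_by_credence(budget, credences):
    """Split the budget in proportion to the credence placed in each worldview."""
    total = sum(credences.values())
    return {name: budget * c / total for name, c in credences.items()}


def allocate_by_expected_stakes(budget, credences, stakes):
    """Give the whole budget to the worldview with the highest credence * stakes."""
    expected = {name: credences[name] * stakes[name] for name in credences}
    winner = max(expected, key=expected.get)
    return {name: (budget if name == winner else Fraction(0)) for name in credences}


# Illustrative placeholders only, not real estimates.
credences = {
    "near-termist, human-centric": Fraction(1, 2),
    "near-termist, animal-inclusive": Fraction(3, 10),
    "longtermist": Fraction(1, 5),
}
# "Moral stuff at stake" per worldview, in arbitrary comparable units (assumed).
stakes = {
    "near-termist, human-centric": 1,
    "near-termist, animal-inclusive": 100,
    "longtermist": 10_000,
}
budget = Fraction(100)

print(allocate_by_credence(budget, credences))
print(allocate_by_expected_stakes(budget, credences, stakes))
```

With these placeholder inputs, the credence split gives 50, 30 and 20 units to the three worldviews, while the expected-stakes rule gives all 100 units to the longtermist worldview even though it has the lowest credence. That is the dynamic the passage describes: a worldview that claims far more is at stake can dominate the allocation once everything is treated as comparable.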

Effective size of the far future

So the basic astronomical waste argument (Astronomical Waste by Nick Bostrom is the seminal paper of this longtermist worldview) essentially says that there’s a very good chance that we could colonise space and create a society that’s not only very large relative to what could be sustained on Earth, but also very robust, and having a very low risk of extinction once you cross that barrier.
We actually think that’s a pretty important part of the case for longtermism. So, if we were imagining longtermism as just living in the world, where humanity will continue on Earth and things will happen, and it’ll be kind of like it is now, but it might last for a long time, so there may be many future generations… We’re not convinced that’s enough to get you to reducing existential risk as your primary priority.
Because in a world where there isn’t a period where we’re much more technologically mature, and much more able to defend against existential risks, the impact of reducing existential risk today is much more washed out, and doesn’t necessarily echo through all of the future generations, even if there are many of them on Earth.

Why Ajeya wrote her AI timelines report

In 2016, Holden wrote a blog post saying that, based on discussions with technical advisors who are AI experts — who are also within the EA community and used to thinking about things from an EA perspective — based on discussions with those technical advisors, Holden felt that it was reasonable to expect a 10% probability of transformative AI within 20 years. That was in 2016, so that would have been 2036. And that was a kind of important plank in the case for making potential risks from advanced AI not only a focus area, but also a focus area which got a particular amount of attention from senior generalist staff.
And then in 2018/early 2019, we were in the middle of this question of we’re hoping to expand to peak giving — consistent with Cari and Dustin’s goals to give away their fortune within their lifetime — and we want to know which broad worldviews and also which focus areas within the worldviews would be seeing most of that expansion. And so then the question became more live again, and more something we wanted to really nail down, as opposed to kind of relying a bit more on deference and the earlier conversations Holden had.
And so digging into AI timelines felt like basically the most urgent question on a list of empirical questions that could impact where the budget went.

Biggest challenges with writing big reports

One thing that’s really tough is that academic fields that have been around for a while have an intuition or an aesthetic that they pass on to new members about, what’s a unit of publishable work? It’s sometimes called a ‘publon’. What kind of result is big enough? What kind of argument is compelling enough and complete enough that you can package it into a paper and publish it? And I think with the work that we’re trying to do — partly because it’s new, and partly because of the nature of the work itself — it’s much less clear what a publishable unit is, or when you’re done. And you almost always find yourself in a situation where there’s a lot more research you could do than you assumed naively, going in. And it’s not always a bad thing.
It’s not always you’re being inefficient or you’re going down rabbit holes, if you choose to do that research and just end up doing a much bigger project than you thought you were going to do. I think this was the case with all of the timelines work that we did at Open Phil. My report and then other reports. It was always the case that we came in, we thought, I thought I would do a more simple evaluation of arguments made by our technical advisors, but then complications came up. And then it just became a much longer project. And I don’t regret most of that. So it’s not as simple as saying, just really force yourself to guess at the outset how much time you want to spend on it and just spend that time. But at the same time, there definitely are rabbit holes, and there definitely are things you can do that eat up a bunch of time without giving you much epistemic value. So standards for that seemed like a big, difficult issue with this work.

What it's like working at Open Phil

We started off, I would say, on a trajectory of being much more collaborative, and then COVID happened. The recent wave of hiring was a lot of generalist hires, and I think that now there’s more of a critical mass of generalists at Open Phil than there was before. Before I think there were only a few, now there are more like 10-ish people. And it’s nice because there’s a lot more fluidity on what those people work on. And so there are a lot more opportunities for casual one-off collaboration than there are between the program staff with each other or the generalists with the program staff.
So a lot of the feeling of collaboration and teamyness and collegiality is partly driven by like, does each part of this super siloed organisation have its own critical mass. And I feel like the answer is no for most parts of the organisation, but recently the generalist group of people — both on the longtermist and near-termist side together — have more people, more opportunities for ideas to bounce, and collaborations that make sense, than there were before. And I’m hoping as we get bigger and as each part gets bigger, that’ll be more and more true.

Articles, books, and other media discussed in the show

Ajeya’s work

Forecasting Transformative AI With Biological Anchors by Ajeya Cotra
When will the hardware needed to train transformative AI become affordable? by Ajeya Cotra

Open Philanthropy blog posts, reports, and their general application

Open Philanthropy’s 2017 cause prioritisation update
Some Background on Our Views Regarding Advanced Artificial Intelligence by Holden Karnofsky
How Much Computational Power Does it Take to Match the Human Brain? by Joe Carlsmith
Open Philanthropy’s general application

Nick Bostrom’s work

Astronomical Waste: The Opportunity Cost of Delayed Technological Development by Nick Bostrom
Are you living in a computer simulation? by Nick Bostrom

Other papers

Patience and Philanthropy by Phil Trammell
In defence of fanaticism by Hayden Wilkinson
Eternity in six hours: intergalactic spreading of intelligent life and sharpening the Fermi paradox by Stuart Armstrong and Anders Sandberg
Algorithmic Progress in Six Domains by Katja Grace
Neural network scaling laws across multiple modalities by Rohin Shah
How the Simulation Argument Dampens Future Fanaticism by Brian Tomasik

OpenAI blog posts

AI and Efficiency (OpenAI blog post)
AI and Compute (OpenAI blog post)

Additional links

Phil Trammell on patient philanthropy and waiting to do good
Target Malaria
AlphaFold: a solution to a 50-year-old grand challenge in biology
Robert Miles’ YouTube channel (“Videos about artificial intelligence safety research for everyone”)
The Precipice: Existential Risk and the Future of Humanity by Toby Ord
Should Altruists Focus on Reducing Short-Term or Far-Future Suffering by Brian Tomasik
The Life You Can Save by Peter Singer

TV and movies

Netflix’s The Queen’s Gambit (miniseries)
Girlfriends’ Guide to Divorce (TV show)
Knives Out (film)

Transcript

Table of Contents

1 Rob’s intro [00:00:00]
2 The interview begins [00:03:23]
3 Worldview diversification [00:08:45]
4 Science and policy funding [00:23:10]
5 Fairness agreements [00:27:50]
6 Next best worldviews [00:41:05]
7 Pragmatic reasons to spread across different areas [00:47:39]
8 Effective size of the long-term future [00:57:19]
9 The doomsday argument [01:09:37]
10 The simulation argument [01:16:58]
11 AI timelines report [01:29:24]
12 Recent AI developments [01:39:28]
13 Four key probability distributions [01:46:54]
14 Most likely ways to be wrong [02:11:43]
15 Biggest challenges with writing big reports [02:17:09]
16 Last dollar project [02:25:28]
17 What it’s like working at Open Phil [02:45:18]
18 Rob’s outro [02:57:57]

Rob’s intro [00:00:00]

Robert Wiblin: Hi listeners, this is the 80,000 Hours Podcast, where we have unusually in-depth conversations about the world’s most pressing problems and what you can do to solve them. I’m Rob Wiblin, Head of Research at 80,000 Hours.

A couple of years ago, I asked Ajeya Cotra to come on the show. Unfortunately, at the time she was too busy. But I’m very glad that last year she found time for us, because she is killing it as a grantmaker at Open Philanthropy, where she tries to figure out the highest-impact ways to make philanthropic grants.

In this conversation, we go into some of my favourite juicy topics, like the simulation argument and how likely it is that humans will actually be able to settle other planets or even other solar systems.

But we also cover some more down-to-earth topics, like how Ajeya — and Open Phil as a whole — does research to try to figure out how they can have the biggest impact with their philanthropic spending, and some of the challenges (both personal and intellectual) in doing big research projects, and trying to get them written down on paper in some form before going crazy.

Before the interview, a little bit of personal news, though. As most of you will know, late last year — in October and November — we did a user survey at 80,000 Hours where we got feedback from many of you about how our products had helped you or hurt you, or just generally what you liked about what we did and how you thought we could do better.

It turns out that this show was just incredibly popular. We were really heartened to find out just how many of you thought that it had changed your life, or was one of the most interesting and engaging things, or pieces of content that you enjoy in your life.

As a result of that feedback, and a couple of other things, we’re going to be rejigging our roles at 80,000 Hours a little bit such that Arden Koehler is going to be taking over some of my responsibilities on the written work and the website, thereby freeing me up to spend a bunch more time on these interviews and producing this show, and maybe taking it to the next level — or at the very least, hopefully just producing more content.

Arden is going to be having her hands full with lots of things this year. But hopefully she’ll still make some appearances on the show. I know a lot of you have really enjoyed some of the interviews she did last year, so you certainly haven’t heard the last of her.

If you’d like to do us a solid favour, in exchange for us deciding to double down on the show and make more of it, we could really use your help getting the word out there to people who would value hearing these conversations, and might even be influenced into changing what they decide to do with their life or with their work.

We already have a pretty big audience, and Keiran and I both find that extremely gratifying. But when we do the numbers, we think there’s got to be at least 10 or maybe 100 times as many people out there who would really enjoy listening to the show based on how many subscribers some other similar interview shows have. But we can’t reach all of them and tell them that it exists on our own. So we could really use your help.

If you have a friend out there who you think would also enjoy the 80,000 Hours Podcast, maybe try to think of who they might be and think of an episode that they would particularly enjoy given their interests. Drop them a message now on WhatsApp or Signal or Facebook Messenger or whatever, letting them know that this show exists and suggesting that they check out whatever episode you think is a particularly good fit for them.

We really do appreciate your support in getting the word out there. With that little bit of ado, and hopefully good news out of the way, here is my interview with Ajeya Cotra.

The interview begins [00:03:23]

Robert Wiblin: Today I’m speaking with Ajeya Cotra. Ajeya is a senior research analyst at Open Philanthropy, a large effective altruist-flavored foundation that expects to give away billions of dollars over the course of its existence, and which is 80,000 Hours’ largest donor. Since joining Open Philanthropy in 2016, Ajeya has worked on a framework for estimating when transformative AI may be developed, estimates of the empirical returns to funding solutions to different problems, and how worldview diversification could be implemented in Open Phil’s budget allocations. She studied electrical engineering and computer science at UC Berkeley, where she also founded the Effective Altruists of Berkeley student group and taught a course on effective altruism. Thanks for coming on the podcast, Ajeya.

Ajeya Cotra: Thanks so much for having me. Excited to be here.

Robert Wiblin: I hope to get to talk about your work on when transformative AI might show up and humanity’s prospects for settling space, but first: What are you doing at the moment, and why do you think it’s important work?

Ajeya Cotra: I’m a senior research analyst at Open Phil, and like you said, Open Phil is trying to give away billions of dollars. We’re aiming to do it in the most cost-effective way possible according to effective altruist principles, and we put significant amounts of money behind a number of plausible ways of cashing out what it means to be trying to do good — whether that’s trying to help the poorest people alive today, or trying to reduce factory farming, or trying to preserve a flourishing long-term future. We call these big-picture schools of thought ‘worldviews’, because they’re kind of like a mash-up of philosophical commitments and heuristics about how to go about achieving things in the world, and empirical views. I’m looking into questions that help Open Phil decide how much money should go behind each of these worldviews, and occasionally, within one worldview, what kind of big-picture strategy that worldview should pursue. We call these ‘worldview investigations’.

Ajeya Cotra: This is closely related to what 80,000 Hours calls ‘global priorities research’, but it’s on the applied end of that — compared with the Global Priorities Institute, which is more on the academic end of that.

Robert Wiblin: We’ll get to that in just a minute, but how did you end up doing this work at Open Phil?

Ajeya Cotra: I found out about effective altruism 10 years ago or 11 years ago now, whenever Peter Singer’s book The Life You Can Save came out. I was in high school at the time, and the book mentioned GiveWell, so I started following GiveWell. I also started following some of the blogs popping up at the time that were written by effective altruist folks, including you, Jeff Kaufman, Julia Wise, and a bunch of others. I was pretty sold on the whole deal before coming to college, so I really wanted to do something EA-oriented with my time in college and with my career. So I co-founded EA Berkeley, and was working on that for a couple of years, still following all these organisations. I ended up doing an internship at GiveWell, and at the time, Open Phil was budding off of GiveWell, when it was called ‘GiveWell Labs’. So I was able to work on both sides of GiveWell and Open Phil. And then I got a return offer, and the next year I came back.

Ajeya Cotra: I was actually the first research employee hired specifically for Open Phil, as opposed to sort of generically GiveWell/Open Phil/everything. So I got in there right as Open Phil was starting to conceptually separate itself from GiveWell. This was in July 2016.

Robert Wiblin: Had you been studying stuff that was relevant at college, or did they choose you just because of general intelligence and a big overlap of interests?

Ajeya Cotra: I mean, I had been, in my own time, ‘studying’ all the EA material I could find. I was a big fan of LessWrong, reading various blogs. One thing I did that put me on Open Phil’s/GiveWell’s radar before I joined was that I was co-running this class on effective altruism. UC Berkeley has this cool thing where undergrads can teach classes for credits — like one or two credits, normal classes are like four credits — so having to put together that class on effective altruism was a good impetus to do a deep dive into stuff, and they gave us a grant. Our class was going to give away $5,000; they were going to vote on the best charity to give it to. We got that money from GiveWell.

Ajeya Cotra: But in terms of the actual subject matter I was focused on in university, not really. It was computer science — technically like an electrical engineering and computer science degree — but I didn’t really do anything practical, so it was kind of a math degree. Being quantitatively fluent I think is good for the work that I’m doing now, but I’m not doing any fancy math in the work that I’m doing now. We have people from all sorts of backgrounds at Open Phil: something quant-y is pretty common, philosophy is pretty common, economics is pretty common.

Robert Wiblin: Yeah, there’s a funny phenomenon where people study very advanced maths, and then on a day-to-day basis, it really actually does seem to make a huge contribution to their ability to think clearly, just by their willingness to multiply two numbers together on a regular basis.

Ajeya Cotra: Yeah, totally, totally.

Robert Wiblin: That’s the level of analysis you’re doing. But for some reason, it seems like maybe in order to be comfortable enough to do that constantly, you need to actually train up to a higher level.

Ajeya Cotra: That’s my line. I tell people it’s probably good to study something quantitative because it gives you these vague habits of thought. I’m not sure exactly how much I believe it. I think philosophy does a lot of the same thing for people in a kind of different flavor. It’s more logic and argument construction, which is also super important for this kind of work.

Worldview diversification [00:08:45]

Robert Wiblin: We’ll come back to working at Open Phil and how people end up in those roles and how listeners might be able to do that, if that’s the kind of thing they’re interested in. But first, let’s dive into this really interesting topic of cause prioritisation and worldview diversification, which I guess is a component of what we talk about, as you said, in global priorities research.

Robert Wiblin: At a big picture level, what is the problem of prioritisation between causes and diversifying across worldviews that Open Phil faces?

Ajeya Cotra: So, Open Phil currently splits its giving across three big buckets, or worldviews, which we wrote about in the 2017 cause prioritisation update. So, there’s one big split, which is between longtermism and near-termism. I should caveat everything I’m about to say by saying that this is my perspective on this stuff, and these are really fuzzy concepts to pin down and they’re kind of in flux, so I’m sure that somebody else coming in here who’s done cause prioritisation work would put it slightly differently and sometimes disagree with me. But broadly speaking, there are these two big splits that produce three worldviews.

Ajeya Cotra: The first split is the longtermism versus near-termism split, and this has been discussed as the difference between the person-affecting view of population ethics versus the total view of population ethics — where, roughly speaking, the person-affecting view says that it doesn’t count as good to create additional people who are living lives worth living, where the total view says that creating an additional person who is living a life worth living is in the same ballpark as saving a life, in terms of the moral good that’s being done.

Ajeya Cotra: There’s something to expressing the split in those terms, but I actually think that the distinction is not a purely philosophical one, and I actually think the longtermist camp is more into philosophy than the near-termist camp. So, I think I would characterise the longtermist camp as the camp that wants to go all the way with buying into the total view — which says that creating new people is good — and then take that to its logical conclusion, which says that bigger worlds are better, bigger worlds full of people living happy lives are better — and then take that to its logical conclusion, which basically says that because the potential for really huge populations is so much greater in the future — particularly with the opportunity for space colonisation — we should focus almost all of our energies on preserving the option of having that large future. So, we should be focusing on reducing existential risks.

Robert Wiblin: Setting the scene a little bit, the main problem is that you’ve got this big pile of money and you want to do as much good as possible with it, and you’ve got to figure out how to divide it between the many different problems in the world. And also, I guess, try to figure out when to dispense it — whether it should be now, or whether you should save it and use it later. And basically, different attitudes or perspectives — philosophical or practical — would suggest focusing on very different problems.

Robert Wiblin: And then you’ve got the question of, well, do we go all in on the one that we think is best, or do we split across a bunch of them? And then how would you split it? That’s the kind of big-picture problem that you’re trying to solve with this investigation. And then those are examples of some of the most plausible worldviews on which you might focus.

Ajeya Cotra: Yeah, that’s right. In fact, the question is maybe even thornier than ‘do we go all in on the perspective that we think is most plausible’ — because we could potentially end up in a situation where we want to go all in on a perspective that we actually think gets a minority of our credence, but it’s a perspective like the longtermist view that says there’s so much opportunity, there’s so much effectible opportunity to do good out there, more goodness. So, if you consider a perspective that’s mostly focused on helping people in this generation or the next couple generations — versus a perspective that’s trying to be more ambitious and bring in this opportunity of permanently affecting the entire long-run trajectory — you might say that even if you only have like a 1% or a 10% probability on the second perspective, you should actually put all of your money there, because it’s positing that the world is bigger or there’s more goodness in the world. So, that’s one key question that we wrestle with.

Robert Wiblin: Okay. So, let’s dive more into this kind of normative uncertainty or moral uncertainty aspect. What are some approaches that you could take to decide how much weight to give to these different philosophical positions? I guess you’re pointing out that you might think you want to allocate them in proportion to their relative likelihood, but that runs into the problem that some of the views suggest that there’s a whole lot more good that can be done than others, and then maybe that’s the key issue that you have to find some way to work around.

Ajeya Cotra: Yeah. Let me quickly lay out the three worldviews I was alluding to before. So, the longtermism versus near-termism split is where the longtermism camp is trying to lean into the implication of total utilitarianism — that because it’s good to cause there to be more people living lives worth living than there were before, you should be focused on existential risk reduction, to preserve this large long-term future where most of the moral value is, on the total utilitarian view.

Ajeya Cotra: And then the near-termist perspective, I wouldn’t say it’s a perspective that doesn’t care about the future, or has some sort of hard-line commitment to “It’s only the people that exist today that matter, and we count it as zero if we do anything that helps the future”. I think it’s a little bit more like this perspective is sceptical of going down that rabbit hole that gets you to “The only thing that matters is existential risk reduction”, and it’s sort of regressing back to normality a little bit.

Ajeya Cotra: This might come from scepticism of a total view population ethics…it might come from scepticism about the tractability of trying to affect existential risk, or about trying to do things that don’t have great feedback loops. So there’s this tangle of considerations that make you want to go “Okay, let me take a step back and let me try and be quantitative, and rigorous, and broadly utilitarian about pursuing a broader set of ends that are more recognized as charity or doing good for others, and that isn’t super strongly privileging this one philosophical argument”. That’s how I put that split, the longtermism versus near-termism split.

Ajeya Cotra: And then within the near-termism camp, there’s a very analogous question of, are we inclusive of animals or not? Where the animal-inclusive view — similar to the longtermism view — says, okay, there are many more animals in this world than there are humans, and many of them are facing conditions much worse than the conditions faced by any human, and we could potentially help them very cheaply. So even if you don’t think it’s very likely that animals are morally valuable roughly comparable to humans, if you think that they’re 1% as valuable, or 10% as valuable, or even 0.001% as valuable, then the vast majority of your efforts on this near-termism worldview should be focused on helping animals.

Ajeya Cotra: And so this is another instance of this dynamic where the animal-inclusive worldview cares about humans but sort of ends up focusing all of its energy on this larger population of beneficiaries. And so it’s this same thing where there’s this claim that there’s more at stake in the animal-inclusive worldview than in the human-centric worldview, and then there’s a further claim that there’s more at stake in the longtermist worldview versus the near-termist worldview.

Ajeya Cotra: And so, essentially, like you said, there’s two reasonable-seeming things to do. One is to allocate according to a credence between these three worldviews, and potentially other worldviews. And then the other is try to find some way to treat the things that each of these worldviews care about as comparable, and then multiply through to find the expected amount of moral stuff at stake in each of these worldviews — and then allocate all your money to the worldview that has the most stuff at stake. Which in this case, most reasonable ways of doing this would say that would be the longtermist worldview.
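
As a rough illustration of the two options Ajeya describes here, the sketch below contrasts splitting a budget by credence with giving everything to the worldview that has the most expected value at stake. The credences, the "stakes" figures, and the function names are all hypothetical placeholders chosen for the example; none of them come from Open Philanthropy.

```python
# Illustrative only: every number below is a made-up placeholder.
credences = {"longtermist": 0.2, "animal_inclusive": 0.3, "human_centric": 0.5}

# Hypothetical "moral value at stake" per worldview, in one shared invented unit.
# Putting the worldviews on a common scale like this is itself a contested step.
stakes = {"longtermist": 1_000_000.0, "animal_inclusive": 1_000.0, "human_centric": 1.0}


def allocate_by_credence(budget, credences):
    """Split the budget in proportion to the credence in each worldview."""
    total = sum(credences.values())
    return {w: budget * c / total for w, c in credences.items()}


def allocate_by_expected_stakes(budget, credences, stakes):
    """Give the whole budget to the worldview with the largest credence * stakes."""
    expected = {w: credences[w] * stakes[w] for w in credences}
    winner = max(expected, key=expected.get)
    return {w: (budget if w == winner else 0.0) for w in credences}


if __name__ == "__main__":
    print(allocate_by_credence(100.0, credences))
    print(allocate_by_expected_stakes(100.0, credences, stakes))
```

With these placeholder inputs the second rule sends the entire budget to the longtermist worldview even though it has the lowest credence, which is the dynamic the conversation goes on to discuss.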

Robert Wiblin: So, that’s one way of dividing things up according to, I guess, two different potential disagreements within moral philosophy. But I saw in your notes that maybe Open Phil is leaning more towards thinking about this not just in terms of moral philosophy, but also just thinking about it in terms of dispositions, or attitudes, or these worldviews as not just representing formal positions that one may take on core philosophy questions. Can you expand on that a bit?

Ajeya Cotra: Yeah. I mean, I think that the longtermist versus near-termist split is a good illustration of this. This is a super important split, and it comes up again and again, more so than the animal-inclusive versus human-centric side of things. But it’s not the case that everyone on the near-termist team doesn’t care about the long-term future or wouldn’t do things that would help people that don’t exist yet. So, a lot of the poverty and disease reduction work that they’re funding ends up helping people that aren’t yet born, because it reduces the incidence of malaria in a region or something like that. Target Malaria is a great example of something we funded on the near-termist side of things, that’s this very ambitious plan of trying to eradicate malaria entirely. And it’s sort of commonsensically part of the case for that thing, that future generations who might have faced malaria won’t face malaria anymore. And that’s sort of the way we think about it quantitatively.

Ajeya Cotra: The difference, I would say, is that the near-termist side of the organisation cares about the future in a kind of atheoretical commonsensical way, a way that broadly altruistic people tend to care about the future. So, they place value on reducing climate change, and they place value on eradicating diseases, in part because future generations will be helped too, but they don’t tend to go in for this ‘bigger world is better’ thesis. And then they also just kind of feel uncomfortable with basically throwing out all of the goals that seemed like good goals from a commonsensical perspective of helping others selflessly, in order to focus on this one goal that… You know, reducing risks of a pandemic, or reducing risks of nuclear war, that was part of a portfolio of things people cared about from a common-sense values perspective, but they weren’t nearly so dominant. And it wasn’t for this reason of “Space colonisation might allow us to have such a huge population in the future”.

Ajeya Cotra: I would characterise the distinction as the near-termist side is less into doing this kind of philosophy and biting that particular bullet, not so much that it has a philosophical commitment that the future doesn’t matter, or creating new people is always morally zero or whatever.

Robert Wiblin: Yeah. So, I guess one view would be that having these atheoretical commitments to doing stuff is just a total mistake, and that those views should be ignored because someone just simply hasn’t really thought clearly about what they’re accomplishing and what they value. But it sounds like you’re a bit more sympathetic to the atheoretical approach, and maybe you think that there’s something to be said for it, maybe even on a rigorously philosophical point of view.

Ajeya Cotra: Yeah. I mean, I don’t know that there’s necessarily something to be said for it on a rigorously philosophical point of view, but I think there’s something to be said for not going all in on what you believe a rigorously philosophical accounting would say to value. So, I think one way you could put it is that Open Phil is — as an institution — trying to place a big bet on this idea of doing utilitarian-ish, thoughtful, deep intellectual philanthropy, which has never been done before, and we want to give that bet its best chance. And we don’t necessarily want to tie that bet — like Open Phil’s value as an institution to the world — to a really hyper-specific notion of what that means.

Ajeya Cotra: So, you can think about the longtermist team as trying to be the best utilitarian philosophers they can be, and trying to philosophy their way into the best goals, and win that way. Where at least moderately good execution on these goals that were identified as good (with a lot of philosophical work) is the bet they’re making, the way they’re trying to win and make their mark on the world. And then the near-termist team is trying to be the best utilitarian economists they can be, trying to be rigorous, and empirical, and quantitative, and smart. And trying to moneyball regular philanthropy, sort of. And they see their competitive advantage as being the economist-y thinking as opposed to the philosopher-y thinking.

Ajeya Cotra: And so when the philosopher takes you to a very weird unintuitive place — and, furthermore, wants you to give up all of the other goals that on other ways of thinking about the world that aren’t philosophical seem like they’re worth pursuing — they’re just like, stop… I sometimes think of it as a train going to crazy town, and the near-termist side is like, I’m going to get off the train before we get to the point where all we’re focusing on is existential risk because of the astronomical waste argument. And then the longtermist side stays on the train, and there may be further stops.

Robert Wiblin: Yeah, interesting. I like the idea that rather than thinking about this as exclusively a philosophical disagreement, think about it as a disagreement on the strategy question of, what’s our edge? What’s our edge over everyone else who’s trying to do good? And one of them is, “Well, we’ll be better at philosophy, and we’ll reach more philosophically rigorous conclusions”. And the other people are like, “We’ll be better in some other way. We’ll be more empirical, or be more careful about thinking about…”

Ajeya Cotra: More quantitative, yeah.

Robert Wiblin: More quantitative, exactly.

Ajeya Cotra: I mean, I actually think the near-termist side of the organisation empirically uses quantitative estimates way, way more than the longtermist side of the organisation does. So, on the longtermist side, we’ve talked ourselves into highly prioritising causes where there are only like 10 people working on them. And so most of our effort is trying to convince potential grantees — potential people who could be helpful in this mission — that it’s reasonable to work on at all. And trying to fund people who are trying to do the basic thing that we want to do — for example, reducing global catastrophic biorisks as opposed to focusing on biorisks in general. And that is where almost all of our selection pressure has to go. But on the near-termist side of things, they’re looking at lists of hundreds of things they could focus on, like air pollution in India, or migration from low-income countries to middle-income countries. And they have a huge list of causes and they’re just doing the math on the number of lives that get better per dollar with each of these options.

Ajeya Cotra: So, the feel of doing near-termist work at Open Phil is definitely much more quantitative and rigorous, and in some sense it feels more like what you would have thought a cartoon EA foundation would feel like, because they have more opportunity to map things out.

Science and policy funding [00:23:10]

Robert Wiblin: So, I guess we’ve listed three cluster worldviews. One is helping people now, another one is helping animals now, and the other one is helping people and animals in the longer term. Are there any others that we should have in mind that you have on the shortlist of different hats that you put on?

Ajeya Cotra: Yeah, there are a couple of smaller ones. So this always goes back to the fact that we want to be a strong foundation that’s making the most diversified bet we can make on deeply rigorous, thoughtful philanthropy that’s truly about helping others, rather than our particular personal values on causes. And so within that, these really feel like the big three to us, I would say. But there are also other things we would like to get experience with.

Ajeya Cotra: When we were starting out, it was important to us that we put some money in science funding and some money in policy funding. Most of that is coming through our other causes that we already identified, but we also want to get experience with those things.

Ajeya Cotra: We also want to gain experience in just funding basic science, and doing that well and having a world-class team at that. So, some of our money in science goes there as well.

Ajeya Cotra: That’s coming much less from a philosophy point of view and much more from a track record… Philanthropy has done great things in the area of science and in the area of policy. We want to have an apparatus and an infrastructure that lets us capitalise on that kind of opportunity to do good as philanthropists.

Robert Wiblin: Yeah. I guess the science funding reminds me of this kind of ‘progress studies’ perspective, which has been generating a bit of buzz on blogs and on Twitter, and I guess their thinking is something along the lines of, I don’t want to just think about all of this moral philosophy and theoretical stuff, but if I look back over the last 1,000 years, what has made things better? It’s science research, technology research, and economic growth. I don’t necessarily have to have a theory of how that’s made things better, I just want to keep pushing on this thing that seems to be the fundamental driver of the world becoming less barbaric. And so they have the whole story about how they want to speed up scientific research, improve funding to direct it to better people and better projects, and increase it, and so on.

Ajeya Cotra: Yeah. And I think there’s really something to that. So, I feel like this isn’t Open Phil’s primary bet, but I could imagine in a world where there was a lot less funding going to basic science — like Howard Hughes Medical Institute didn’t exist — then we would be bigger on it. Going back to the bet of trying to do deeply thoughtful intellectual philanthropy to help others, we could have looked back and seen wow, basic science has been a really big deal for humanity. And then we could have looked around and seen that basically nobody’s acting on this, and wanted to go in much bigger on the bet.

Ajeya Cotra: And so it’s really responsive, also, to this thing you were saying about what we think our competitive advantage is. And we do think our competitive advantage is more of a top-down kind of thinking, both on the near-termist side and on the longtermist side, where the near-termist side is kind of surveying this large array of possibilities to help others in the world today and is picking the one that quantitatively seems most efficient, and the longtermist side is sort of stepping back even further and thinking about, at the root, what kinds of things even plausibly could be the most valuable thing to do on a total utilitarian perspective?

Ajeya Cotra: So, we’re mostly very top-down, but part of the reason we have the basic science program is this kind of bottom-up, very atheoretical argument that, look, this has been a huge driver of human progress and human flourishing.

Robert Wiblin: Yeah. I feel like this has been a banner week for the progress studies worldview. Because you look at politics and like, oh my God, this is just… But every day, there’s some amazing scientific breakthrough coming out. I guess on Monday we had AlphaFold… I’m just trying to remember all of them off the top of my head.

Ajeya Cotra: I mean, the mRNA vaccine, right?

Robert Wiblin: Oh, the mRNA vaccine, yeah. So, the vaccine stuff is coming along. It seems like the scientific community has really been killing it on COVID in the big picture.

Ajeya Cotra: Totally, totally.

Robert Wiblin: We have made massive progress now settling the protein folding issue, which has been around for many, many decades. On a gut level, I find the ‘let’s just improve wisdom, let’s just improve science’ thing to be quite appealing, then maybe on a more philosophical reflection, it seems a bit more questionable. Do you want to comment on that?

Ajeya Cotra: Oh, I was just going to say, I mean, the other thing you mentioned as kind of part of the same thing, but I think it could really be broken off into a different thing, is the economic growth worldview. So, the Tyler Cowen thing you were alluding to is very much… I don’t know how much he leans on science so much as growth has been really good. Like growth has been so much better for human welfare in the history of the last few hundred years than redistribution has been. And, interestingly, it’s not clear exactly whether that fits in the longtermist camp or the near-termist camp, and could potentially become something that we take seriously enough and think is neglected enough that it might be another worldview that we want to put some weight behind.

Fairness agreements [00:27:50]

Robert Wiblin: Okay. Let’s head back to the question of how you would split all of the money that you have between these different mental buckets. You’ve come up with a couple of different ways of thinking about this, and one you called ‘fairness agreements’. What’s that one?

Ajeya Cotra: Yeah. So, just to set the scene a little bit, the question here is: What happens when you have two worldviews, one of which is trying to help a certain set of beneficiaries, say, and then the other values those beneficiaries, but also cares even more about a much larger set of beneficiaries? And so you see this dynamic with animal-inclusive versus human-centric and with longtermism and near-termism. And I see basically three things that we could do, and I would personally want to do a mix of those things. One is to allocate all the money to the worldview that has the most at stake — that says the world is biggest and it contains the most moral value — which would be the longtermist worldview in this case. Two is to allocate according to credence, like you mentioned before.

Ajeya Cotra: And then three is the thing you were saying, which is called fairness agreements. The idea is that if you imagine these worldviews as people who were each given a third of the money before they knew any pertinent facts about the world, and then you woke up and you discovered — in an extreme case, if you woke up and discovered, actually, we seem extremely safe in terms of existential risk, the biggest risk we can think of is asteroids, and there’s a one in 10 million chance that they kill us, and we have all these detection programs…and we’ve really determined we don’t think AI is a big risk, we don’t think biorisk is a big risk, we don’t think nukes are a big risk — then it would feel kind of unfair to keep having a third of the money on the longtermist side, because the longtermist side probably would have made the deal before knowing anything that it would give away its money in these very, very low existential risk worlds in exchange for having more of the money in these very high existential risk worlds.
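
One way to make the shape of such a deal concrete is the toy model below. It assumes just two possible worlds (low and high existential risk), a default one-third share for the longtermist worldview, and a deal that keeps the longtermist's expected share, taken before the world is observed, equal to that default. That "constant expected share" condition, the probabilities, and the function name are assumptions made for this sketch, not something stated in the interview.

```python
from fractions import Fraction


def high_risk_share(default_share, p_low, share_in_low):
    """Share the longtermist gets in the high-risk world under the toy deal.

    Assumed condition: the expected share before observing the world stays
    equal to default_share, i.e.
        p_low * share_in_low + (1 - p_low) * share_in_high == default_share
    """
    p_high = 1 - p_low
    if p_high <= 0:
        raise ValueError("the high-risk world must have non-zero prior probability")
    share = (default_share - p_low * share_in_low) / p_high
    if not 0 <= share <= 1:
        raise ValueError("this deal would need a share outside [0, 1]")
    return share


if __name__ == "__main__":
    third = Fraction(1, 3)
    # Hypothetical prior: the two worlds are equally likely, and the longtermist
    # gives up its entire share if the world turns out to be low risk.
    print(high_risk_share(third, Fraction(1, 2), Fraction(0)))  # prints 2/3
```

Under these assumptions, giving up everything in the low-risk world buys the longtermist worldview two thirds of the money in the high-risk world. Different priors over the two worlds give very different trades, which is the difficulty about choosing the prior raised later in the conversation.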

Robert Wiblin: Yeah. I guess a really stark example might be the animal-inclusive person and the human-centric near-termist person trying to negotiate ahead of time what will happen, and then they wake up in a world in which it turns out that there are no nonhuman animals. In that scenario, presumably, they would bargain ahead of time to pass the money on to the human-focused person, because that’s just very efficient from both of their perspectives before they know what world they’re actually going to end up in.

Ajeya Cotra: Yeah, exactly. And so I think the basic idea here is quite compelling, but the tricky bit that makes me not want to put too much of the capital into these fairness agreements is: What is that prior? So, I said something fuzzy like these three worldviews are talking ‘before they know the pertinent facts’, but how do we approximate what they would have thought? So, if we look around at the world today, a very interesting question that is really tricky to determine is like, are there more animals suffering on factory farms than we should have expected, or fewer? And that’s just super dependent on what you thought your prior was. So, on an intuitive level, it seems like there’s a horrifyingly large amount of animal suffering going on. But on some sort of prior, is that the median trajectory we should have expected, say, knowing what we did at the start of the Industrial Revolution, about incentives to create these kinds of systems?

Ajeya Cotra: And then an even trickier one is, there are so many stars in the future. Is that more or fewer than we should have expected there would be? There’s a lot of them, but there could have been more…there could have been infinity stars. So, it depends on where you place that veil of ignorance in order to determine how to do your fairness trades. And we have a few ideas, but I don’t think any of them are super knockdown.

Ajeya Cotra: One idea is just to look back on history and think, let’s take these three worldviews, maybe since philanthropy was a thing, starting in the 1800s or something. Where in time would these worldviews have wanted to allocate their money? And so you might have thought, “Okay, well, the global poverty worldview, it might have been better to transport that money back in time to the 1960s, or the 1950s, at a time when there was even more extreme global poverty, but still the means to try and address it, because there was still international communication”, or something. And then maybe you think that the animal-inclusive worldview and the existential risk reduction worldview would have wanted to have their money roughly now. So, if you imagine making the trade back when philanthropy started to be a real force on the scene, then you might say, “Okay, global poverty had its moment, more so than now, it’s still very much its moment now, but more so than now it was global poverty reduction’s moment in the ’60s”. So, that’s something you might say.

Robert Wiblin: How is this different from the same issue that you would face just doing the Rawlsian veil of ignorance thing to think about what moral principles we should follow? Because this obviously is a kind of old idea. I suppose, in that case… So you’ve got everyone debating behind this veil of ignorance, and maybe in that world it’s okay for them to see how the world is, it’s just that they don’t know which person they’re going to be. And so we just get rid of that one piece of information, and then they go away and debate it, and that feels less arbitrary. Whereas with this case, it’s like, we can’t tell them that much about the world because then they would know what corner to back more. Or I guess maybe you have to think, well, they don’t know what moral views they have. Was that the thing to…?

Ajeya Cotra: Yeah, sort of. I think that the way to translate it into the clean Rawlsian framework is, you don’t know which worldview you are, but you just see the world as it is today, and then you try to think about which worldview you would rather be on its own terms. So, with the Rawlsian veil of ignorance, you don’t know what person you’re going to be, but you sort of assume these people all have the same values, in terms of like, they want to have better lives rather than worse lives. But the reason we can’t do that so cleanly with the worldviews thing is you don’t know how to translate a standard deviation of goodness, say, across these different metrics that you’re using, because that’s kind of the whole question of worldview diversification.

Ajeya Cotra: The thing we’re trying to replicate is if you had a probability distribution over how much good you could do per dollar before observing the world — before observing something, like taking away some information — you want to generate these probability distributions over what is the chance of saving the world per dollar on the longtermist worldview, and how many hen lives you can save per dollar on the animal-inclusive worldview, and how many human life-years can you preserve per dollar on the human-centric worldview?

Ajeya Cotra: So, one potentially natural way to do this — that I think is actually kind of my favourite — is just think about what GiveWell and Open Phil people themselves thought as they were getting into this business in 2010, or something, and when you just actually notice yourself feeling surprised that it’s so easy to help chickens, say, put more into that worldview. And that’s something that’s kind of been done organically, and it’s also something we could roll forward, right? Like, we could agree now that if AI risk is either more tractable to address or bigger than we currently think it is, then the longtermist worldview gets more money relative to a world where we’re more fine than we think we are.

Robert Wiblin: Yeah. So, I guess with this, you’ve got something that’s very theoretically appealing, but then it feels like the point from which you’re starting — or the views that you had and you’re updating from — just feels kind of arbitrary. And so it’s like why should we privilege the views of Open Phil staff in 2010? That feels a bit strange. It has a less philosophically pure aspect to it. Feels very messy at that stage.

Ajeya Cotra: Yeah. I mean, one thing that I think is very trippy about this is that if you are rolling back all the way to the start of the universe or something, then the longtermist worldview should basically give away all of its influence in all of the smaller worlds that we could end up in, in exchange for getting maximum influence in the world where there’s the biggest infinity number of stars. So, I kind of feel like the longtermist worldview wants to be like a super weird, hardcore philosopher. So, it’s kind of fair to take a slice from it on this basis, because it would have made that agreement, I think, at the beginning, because it’s a very hardcore sort of beast.

Robert Wiblin: Just maximising, yeah.

Ajeya Cotra: But I don’t want to take it to zero because of that. So, I just threw out, in my notes, that maybe like a quarter of the money should be allocated according to these fairness thingies, and only some of that quarter should be the super hardcore, ‘try and think about the beginning of time’ thing, and ‘try and think about if there are more stars or fewer stars than we expected’. And then most of it should be these kind of less principled versions.

Robert Wiblin: Yeah, interesting. Okay. So, just to explain that, you’re thinking that if we went all the way back to the beginning of the universe and we don’t know how much matter and energy there is in the universe, the longtermist mindset would say, “Well, if the universe is 10 times as big, then I want to have 10 times as many resources, because everything is 10 times as important”. And so you kind of want to do it in proportion to the size. I guess then we’re left with like, the universe seems pretty big, but it could be a whole lot bigger. Or I suppose also, we could be earlier in the universe, when we can access more of it. But it has this issue of like… I have no sense of what the scale is, because it feels like it’s just unlimited how large it could be, right? So, I guess it reminds me of that paradox that all numbers are small, because you can just continue adding numbers forever, and even a million, million, billion, trillion is still like smaller than… There are far more numbers that are bigger than that than are smaller?

Ajeya Cotra: No, no, totally. I mean, it’s exactly the St. Petersburg paradox, which is… The probability distribution you probably had going in — of what the size of the universe is and how much matter and energy there is — is you probably had most of your probability mass on smallish amounts. And in fact, our universe as we see it now probably is within the bulk of your probability distribution, but you still assign this long tail to ever bigger numbers, and they dominate the expected value because of the shape of things. Because if you’re pretty uncertain, then you’re just not going to have a sharply decaying tail.
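
The tail-dominance point can be checked with the textbook St. Petersburg game, in which the payoff is 2^k with probability 1/2^k. The sketch below truncates the game at a maximum number of rounds, treating the small leftover probability as paying nothing, and uses exact fractions. The function names are just labels for this example, and the game stands in for "size of the universe" only by analogy.

```python
from fractions import Fraction


def truncated_expected_value(max_rounds):
    """Expected payoff when the game is capped at max_rounds.

    Round k pays 2**k with probability 1 / 2**k, so every round adds exactly 1
    to the expectation. The leftover probability 1 / 2**max_rounds pays nothing.
    """
    return sum(Fraction(1, 2**k) * 2**k for k in range(1, max_rounds + 1))


def probability_within(rounds):
    """Probability that the game ends within the first `rounds` rounds."""
    return sum(Fraction(1, 2**k) for k in range(1, rounds + 1))


if __name__ == "__main__":
    # Most of the probability mass sits on small payoffs...
    print(probability_within(3))          # prints 7/8
    # ...yet the expectation keeps growing as larger payoffs are allowed.
    print(truncated_expected_value(10))   # prints 10
    print(truncated_expected_value(100))  # prints 100
```

Seven eighths of the probability sits on payoffs of 8 or less, but each extra round of tail adds as much to the expected value as the first round did, so the expectation is driven by outcomes that almost never happen.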

Robert Wiblin: I love this one. If I’d been presenting longtermism at the pub and someone had managed to respond with this one as an objection, I would have incredibly admired that. Is there a difference between fairness agreements and the veil of ignorance approach, or are those just two terms for the same general idea?

Ajeya Cotra: Yeah, I think they are two terms for the same idea.

Robert Wiblin: Nice. Okay. So, what about the ‘outlier opportunities principle’? What’s going on there?

Ajeya Cotra: This is something where Holden might put it differently from me, but I conceptually think of it as the same idea as fairness or veil of ignorance. That whole cluster of considerations is like, if something is doing surprisingly well on its own terms, whatever that means, then it should get some sort of bonus. Because if you imagine they’re sort of like business partners, or like a family, these worldviews — they care about each other. If one of them is in surprisingly great need, then the others would pitch in, is kind of how I think about it. And then the whole question is what is ‘surprisingly great need’ to these different worldviews? And I threw out all these different ideas for how to think about what ‘surprisingly’ means.

Ajeya Cotra: And the outlier opportunities I think is just like a particularly easy version of that, where you’re seeing the empirical distribution of opportunities in each of these worldviews as a philanthropist, since you got into the business, and if something just looks like you’re purchasing so many points of x-risk reduction per dollar versus anything you’ve seen before, then you just kind of want to seize on it. And that’s kind of coming from this impulse that it seems like some of this money should be going to helping out the worldviews that have surprisingly great opportunities, surprisingly great need.

Robert Wiblin: I guess it seems like this would be tricky if you evaluate the mindset on its own terms, because then you could have like, what about a worldview that says, “Oh, if you help just one person, then that’s fantastically good”. And then it does successfully manage to help one person. And so it’s like, “Oh, I’m massively flourishing. This view is kicking ass”. And you say “Well, should we give money to do that?” But it seems like something’s maybe going wrong there. It’s like you have to evaluate it on some slightly higher level.

Ajeya Cotra: I agree. So, I think there’s pretty different types of thinking that sort of determine whether you let something into the family of worldviews that are trying to be nice to each other and cooperating with each other. That’s like one gate, and then the other gate is this fairness stuff that I was talking about.

Ajeya Cotra: So, there are many perspectives on how to do good that Open Phil doesn’t really let in the door. The most salient one is like ‘charity starts at home’, that you should be trying to help people you personally know, or your local community or Americans or something. And so we sort of start off with this goal of trying to be very other-centered and sensitive to scale, relative to whatever else is out there, and just really give it your best shot to be this impartial effective altruist while noticing when there are bridges where you don’t want to go, you don’t want to ride the train all the way to crazy town with all of your capital — although you would want to put some of your capital behind that, but trying very hard to be impartial. And then you let in some set of worldviews, given that, and then they do these more complicated fairness agreements and stuff.

Next best worldviews [00:41:05]

Robert Wiblin: Yeah, that’s interesting. Are there any worldviews that are kind of on the borderline, like you’re not sure whether to invite them to Thanksgiving or not, whether they are part of the family? What would maybe be the next best worldview that’s currently not in the family?

Ajeya Cotra: Yeah. So, I think it’ll be very different for different people. My personal one that I struggle with… I mean, there are two. So the thing I mentioned about improving economic growth as a worldview, I have a lot of sympathy for. And I could pretty seriously imagine, at least for myself, wanting to let it sit at the table. The other one that I think is more borderline and probably no, is something about improving civic institutions. There’s something that attracts a part of me to trying to clean up your own house and be like ‘the city upon the hill’, like some dude said about America back in the day. It’s just kind of like… A part of me feels pulled to improving democracy, and sort of shoring up our self governance, and all this stuff that I could maybe like pencil out through to either the near-termist human-centric worldview, or maybe the longtermist worldview, but the pull I feel isn’t really coming from expecting those to pencil excellently or something.

Robert Wiblin: Yeah, that’s interesting. I suppose one practical argument is that the location you can affect the most is the one that you’re from; that you’re already really embedded in. And then maybe some really important thing is kind of demonstration effects, like showing other places how they can be great. And so if that’s what really matters, then you want to focus on making the place that you are the very best that it can be, so that other places can learn from it. I think that makes some intuitive sense. And then maybe you’re also adding in some kind of contractarian thing where you feel like you’re embedded in some relationship with…

Ajeya Cotra: Yeah, there’s some flavor where it’s kind of like… Yeah. When the George Floyd protests were happening and there were just reams and reams of videos of police brutality in the United States, I was very affected by that, and I was just kind of like… There’s something I feel when something heinous is happening in your backyard. Like, if the EA community were facing a thorny situation with a bad actor, then I would want to put a lot of my own energies that could have gone into doing my job or whatever to try to help with that, if I were in a good position to do so. And that feels kind of similar with the heinous things that are happening in the United States. And so that’s maybe another kind of frame on it, which is kind of contractarian. It’s kind of like, these are my people. I’m kind of responsible for this bad stuff.

Robert Wiblin: Yeah. It’s possible another intuition that’s firing there, and I guess, yeah, maybe this is one way to think about it, it’s like…guiding intuitions that kind of push you to want to have a conclusion. Yeah, the components of worldviews would be like looking at police just beating up peaceful protesters, and you think, “Oh, wow. The military services are the armed forces that are supposed to represent this country, and they’re kind of out of control, and they’re no longer under civilian control, and that historically is incredibly alarming, that tends to end really badly”. And so you’re like, “This is a fire. I have to put out the fire in my house before I continue improving the…”

Ajeya Cotra: Yeah. I mean, I guess anti-authoritarianism is maybe an umbrella you could put on this kind of worldview, sort of like the freedom worldview. We don’t have a lot behind that stuff, like human rights, and freedom of speech, and anti-censorship, and anti-police brutality. There’s something attractive to me from a health-of-the-nation, health-of-the-world point of view to all of those things.

Robert Wiblin: Yeah. One thing that’s attractive about the kind of pro-science or pro-learning worldview is that it allows you to just kind of punt to people in the future who hopefully will be more informed than you. So, you say like, “Look, I don’t know what utopia we want to build, and I don’t really know how to get there, but one thing I can do is add my brick to the wall of just making humanity as a whole wiser and more informed, and that’s really the best that I can hope to do”. And I can definitely see the intuition behind that kind of worldview.

Ajeya Cotra: Yeah.

Robert Wiblin: I suppose all of these other approaches have kind of been trying to avoid the extremism or the fanaticism issue — that you could have one worldview that just dominates all of them. Is there anyone who speaks up in Open Phil for just like, being fanatical?

Ajeya Cotra: There’s definitely a spectrum in terms of how much… If each of us were doing the sort of complicated calculus of thinking about how much to put into each of these worldviews, we would definitely differ in terms of the share that would go to longtermism. I don’t know if we would differ by a ton, I don’t know if the sort of most pro-longtermism and least pro-longtermism have more than a factor of two or something. So, I don’t think there’s somebody that’s like really planting their flag on super, super pro-fanaticism, maybe there is, I’m not sure.

Ajeya Cotra: I think for me personally, I was more pro going all in on the astronomical waste argument before thinking about some of the further weird things that come up as the train keeps moving to crazy town. One of which is the thing that I mentioned about how at the beginning of time, the longtermist worldview probably would have traded off almost all of its influence in almost all of the worlds, ‘almost all’ there being the mathematical definition of almost all, which is like, all but one, or something. And then another one being the various philosophical arguments that suggest that if you believe there is going to be a long-term future, you run into various confusing questions. So, one confusing question is captured in the ‘doomsday argument’, which is like, you should be very surprised to find yourself super early in the history of a very long world, but perhaps much less surprised to find yourself super early in the history of a relatively short world. So, maybe you should think that existential risk is much larger and much more inescapable than you currently think it is.

Ajeya Cotra: And then another is the ‘simulation argument’, which is, if there’s a giant future world and they’re running all sorts of computations, some smallish fraction of their computations might be simulating a world like ours, namely a world that’s potentially on the cusp of space colonisation becoming very large. And so there’s a lot of stuff you run into where you’re like, “Wow, the world is maybe really not at all what it seems like”. And I think after marinating in all of that, I didn’t end up with any particular conclusions that I wanted to plant my flag on or anything, but I sort of was like, “Okay, actually, this line of thinking takes me to a place weirder than I am comfortable with”. And I sort of therefore have sympathy for people for whom the immediately previous stop was weirder than they were comfortable with, and I was more able to listen to the parts of myself that found that uncomfortable.

Pragmatic reasons to spread across different areas [00:47:39]

Robert Wiblin: Let’s come back to that in just a second. But first, what are some other non-philosophical/just purely pragmatic reasons to want to hedge your bets a bit more and spread across different areas? To me, it seems like they’re more persuasive, maybe, than these worldview diversification considerations.

Ajeya Cotra: Yeah. So, I mean, the thing I was saying earlier about how Open Phil as an institution wants to bet on scope-sensitive, deeply thoughtful philanthropy…that hasn’t been done. And we don’t want to have that whole bet ruined and make Open Phil look like a failure because we chose to put all of the capital that could be going to a broad array of thoughtful scope-sensitive philanthropy into one type of philanthropy that has a number of practical disadvantages — such as not being able to learn and correct as you go, or not seeing impact in your lifetime, or having causal attribution be really hard.

Robert Wiblin: There’s also just declining returns, right? So, that’s one reason that you would always want to spread out across different areas, is that you might just find that you’re beginning to struggle to find really good opportunities within any one particular area.

Ajeya Cotra: I think the declining returns reason for diversification applies much more within a worldview than across worldviews.

Robert Wiblin: Explain that.

Ajeya Cotra: So, the longtermist worldview doesn’t ever think that its next dollar should go to something aimed at helping humans in the present, because it thinks the future is e.g. 10–30 orders of magnitude larger than the present. So, even super lottery ticket/bank shot opportunities to help the future — including just saving to wait for something to come up, like in the Phil Trammell paper that was recently released — will pretty much always beat, from within the worldview, giving it to near-termist causes. But the declining returns thing is certainly the reason why we have more than one focus area within each worldview, like why we’re working on both AI and biosecurity. We think that AI risk is probably the bigger problem, but we think that the final dollar that we spend is probably going to be less cost effective than whatever we can do in biosecurity, so we should be doing both AI and biosecurity.

Robert Wiblin: Yeah. So, the way I think about the declining returns thing is, let’s imagine that I have my factory farming/animal-inclusive hat on, and I’m thinking, do I want to go and recommend that Open Phil spend more time trying to do worldview diversification research and try to figure out how it should shift around the fractions that it’s giving to each of these different problem areas?

Robert Wiblin: And I guess with that hat on, I think I’ve got the best arguments. I’m right, and so on average, if they think about it more, they’re going to end up agreeing with me. You might think, well, in the very best case, where they just went all in on this worldview, maybe say I’m 25% of the budget now, I could go up to 100%. But in fact, that’s not even really that beneficial to me, because I already can’t find a way to spend the money that I have now. I’m already running out of opportunities. And so even quadrupling your budget might only accomplish 100% more. And so you’ll be like, “No, let’s just stop thinking about it. I’m happy with the fraction that I’ve got, because that’s actually plenty to do most of what I want to do, and I don’t really want to risk the possibility of losing some of my share in order to have an expected increase”.

Ajeya Cotra: Yeah. I think that makes sense from the perspective of a worldview. I feel like that seems right, but it seems a little bit weird, because it’s basically the worldview being afraid that further thoughtful reflection (which we sort of assume will lead to better conclusions) is going to lose it money. So, from the perspective of an individual worldview that’s kind of ‘selfish’ within its worldview, then I agree that declining returns means that that worldview is probably more afraid to lose money than it is happy to gain money.

Ajeya Cotra: But that’s a different notion of declining returns than people usually mean when they say that declining returns leads to diversification, because people are usually talking about declining returns from the perspective of the decider. Where you’re sort of like, you can put $100 million into bed nets, but then the marginal bed net is going into this area that has very low malaria incidence. So, at that point, rather than buying that next bed net, you’d be better off funding deworming or you’d be better off doing cash transfers. That’s kind of what I think of as the ‘normal’ version of diversification due to diminishing marginal returns.
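The 'normal' version of diminishing marginal returns described here can be sketched in a few lines of Python. This is a toy illustration only: the three programme names come from the conversation, but the return curves, the starting values, and the helper names `marginal_value` and `allocate` are invented for the example and are not Open Phil or GiveWell figures.

```python
# Toy model of diversification driven by diminishing marginal returns.
# All numbers below are hypothetical and chosen only for illustration.

def marginal_value(base, spent, half_point):
    """Value of the next dollar: starts at `base` and halves once `spent` reaches `half_point`."""
    return base / (1.0 + spent / half_point)

def allocate(budget, step, programs):
    """Greedily give each `step` dollars to whichever program has the highest marginal value."""
    spent = {name: 0.0 for name in programs}
    for _ in range(int(budget // step)):
        best = max(
            programs,
            key=lambda n: marginal_value(programs[n][0], spent[n], programs[n][1]),
        )
        spent[best] += step
    return spent

# name: (value of the first dollar, spending level at which that value halves)
programs = {
    "bed_nets": (10.0, 20e6),
    "deworming": (6.0, 30e6),
    "cash_transfers": (3.0, 200e6),
}

allocation = allocate(budget=100e6, step=1e6, programs=programs)
for name, dollars in allocation.items():
    print(f"{name}: ${dollars / 1e6:.0f}M")
```

Under these made-up curves the decider starts with bed nets, then moves to deworming and cash transfers once the marginal bed net is worth less than the first dollar elsewhere. The decider diversifies even though bed nets look best at the start.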

Robert Wiblin: Yeah. I guess I could buy it across theories if one worldview really thought that it just had nothing that was positive — it had funded everything that generated good, and anything else would actually in fact be counterproductive. But maybe that’s just too extreme and peculiar a view to take.

Ajeya Cotra: I think it’s very unlikely you land in that place. It seems more likely across the two near-termist worldviews, because the ratio of what seems at stake is less extreme. So, I could imagine the animal-inclusive worldview getting to a place where it spent so much money on animals and made things so much better, that — because it also cares about humans — the next dollar it would spend would actually be aiming to help humans. But I really don’t see it for the longtermist versus near-termist worldview, because of the massive differences in scale posited, the massive differences in scale of the world of moral things posited, and because the longtermist worldview could always just sit and wait.

Robert Wiblin: Yeah, that makes sense. I guess I’ll just note that philosophy PhD student Hayden Wilkinson recently wrote this paper called In defence of fanaticism, which unfortunately we haven’t had time to read closely because it’s a little bit technical. But we’ll stick up a link to that for readers, if they’re curious to see. I guess he claims, in the introduction to the paper, that almost nobody has ever defended fanaticism, which in moral philosophy is just going all in on one perspective and ignoring other considerations. He claims that almost no one’s defending it, but he wants to stake out the territory of defending it and saying, “Actually, this is more reasonable than hedging your bets”. Maybe we’ll be able to talk about that paper at some point in the future once I’ve actually read it.

Robert Wiblin: Has Open Phil as an organisation made any big strategic shifts in the worldviews and relative weights that it gives since… I guess you had a couple of different blog posts about this back in 2017, I think. People might be curious, what is the upshot of things you’ve been learning since then for what Open Phil is actually going to fund?

Ajeya Cotra: Yeah. I mean, I laid out this worldview diversification question in terms of these three main worldviews and these sort of high-level philosophical considerations for how much to give to each of these worldviews — do you weight by credence, or do you give all of it to the worldview that says it has the most at stake, or do you do fairness agreements to make trades between worldviews? Since then, I think we’ve moved into a bit more of an atheoretical perspective, where there may be a larger number of buckets of giving, and each of those buckets of giving will have some estimate of cost effectiveness in terms of its own kind of native units.

Ajeya Cotra: So, there may be many buckets of giving in the near-termist human-centric side, like giving to GiveWell top charities, or giving to Target Malaria, or things like that in the zone of ambitious sort of science projects to eradicate disease. Or giving to improve policy in developing countries. And we would try to math all of those out in terms of disability-adjusted life years per dollar, or maybe effective cash transfers per dollar or something. But they would each have different properties, in terms of these intangible properties that were partly driving our desire to do worldview diversification in the first place — such as like, being subjectively weirder and more speculative and less commonly considered to be a good goal, or having worse feedback loops, or having a higher risk of self delusion, or feeling more like a Pascal’s mugging. These are all kind of tags you could put on different buckets of giving. And even within one worldview, these tags might be different for different buckets.

Ajeya Cotra: So, you can imagine that improving vaccine manufacturing capability in a way that would help us prevent future COVIDs — or help us make future COVIDs better, and also, potentially, help with something much worse — could be one sort of bucket of longtermist-oriented giving that seems pretty good, in terms of seeming like a subjectively good goal to most humans, having pretty good feedback loops because you can see what’s happening with the vaccines as you’re going, and having a pretty low risk of self delusion because you’re deferring a lot to experts in how it’s actually done and so on. Versus something like funding really unproven EAs to attempt to get a machine learning degree to try and do AI safety research. That seems like it has a higher risk of potentially self delusion because there are no experts in this area to defer to, and it has worse feedback loops, potentially. And there are things you could imagine that have even worse feedback loops, like maybe creating bunkers for people to hide out in if there’s a bio disaster or a nuclear disaster.

Ajeya Cotra: And so basically, the rough thought here is that there are some things in each of the three big worldviews that perform really well on these subjective intangible things, that make it so that they’re an easy sell to ourselves. And then there are others that perform worse on these intangibles. And just empirically speaking, we feel more comfortable doing the things that perform worse on intangibles when they pencil out to be better on their own terms, within their worldview bucket.

Ajeya Cotra: So, we’re kind of moving into something where we sort of talk tranche by tranche, bucket by bucket, and there might be 10 or 20 of these buckets as opposed to these three big worldviews, and we try to do our best to do the math on them in terms of the units of value purchased per dollar, and then also think about these other intangibles and argue really hard to come to a decision about each bucket.

Effective size of the long-term future [00:57:19]

Robert Wiblin: Let’s move on to another cluster of research that you did while generally thinking about how Open Phil should allocate its money among different issues.

Robert Wiblin: Earlier we were talking about the longtermist worldview versus other worldviews. I guess one key part of figuring out how much to weight the longtermist worldview is to think, well, how big could the future be? How much benefit could you create in the long term?

Robert Wiblin: I suppose on a more simple longtermist view you could think, well, the size of the future might be that humans continue to live on Earth as they have for another billion years until the sun ultimately expands and kills everyone. But I suppose the future potentially might be a whole lot bigger than that, which is one reason that potentially you want to give the longtermist view a lot of weight. Now, do you want to go into that?

Ajeya Cotra: So the basic astronomical waste argument (Astronomical Waste by Nick Bostrom is the seminal paper of this longtermist worldview) essentially says that there’s a very good chance that we could colonise space and create a society that’s not only very large relative to what could be sustained on Earth, but also very robust, and having a very low risk of extinction once you cross that barrier.

Ajeya Cotra: We actually think that’s a pretty important part of the case for longtermism. So, if we were imagining longtermism as just living in the world, where humanity will continue on Earth and things will happen, and it’ll be kind of like it is now, but it might last for a long time, so there may be many future generations… We’re not convinced that’s enough to get you to reducing existential risk as your primary priority.

Ajeya Cotra: Because in a world where there isn’t a period where we’re much more technologically mature, and much more able to defend against existential risks, the impact of reducing existential risk today is much more washed out, and doesn’t necessarily echo through all of the future generations, even if there are many of them on Earth.

Ajeya Cotra: So I was looking into that general question of whether we will have a large and robust low x-risk future in space.

Robert Wiblin: And what are the component questions of that?

Ajeya Cotra: There are basically two parts to this project. The first part was a brief literature review/interviews about the kind of technical feasibility of space colonisation, plus much reduced existential risk worlds. And I was looking into all sorts of things, like how many stars are there? And how fast could our spaceships be? And how much mass could they carry?

Ajeya Cotra: I was trying to find the most defensible or most conservative assumptions that would still lead to this low x-risk space colonisation world, in terms of could we have biological humans colonise a small number of planets, and could that be a sort of stable low x-risk world? I ended up deciding that’s actually fairly dicey to defend.

Ajeya Cotra: The most conservative assumptions that robustly lead to this big, safe world in space go through humans being uploaded into computers, and those computers being taken to space — as opposed to the biological bodies being taken to space and trying to alter planets by terraforming to make them habitable for biological humans. So I thought that was an interesting upshot.

Robert Wiblin: What’s the reason for that? Why is it that the whole ‘humans go to other planets and live there’ thing either isn’t possible or wouldn’t be sustainable for very long time periods?

Ajeya Cotra: I don’t know that I strongly think it isn’t possible. It just seems like there are a lot of questions that go away if you make the one assumption about uploaded humans.

Ajeya Cotra: The people I talked to were kind of like, “Maybe we could swing that. Maybe we could have biological humans in big spaceships traveling to these other planets”. But there are a lot more questions about sustaining life on the spaceship and finding planets that are suitable for terraforming than there are about preserving these computers and finding planets that are suitable for building computers on.

Ajeya Cotra: A big one is just that the spaceships need to be huge to support these human colonies, and feed them and everything. And these huge spaceships, first of all, require a lot more materials to build, and you might not be able to build as many of them. They might be much more fragile in terms of, you have this huge surface area you have to protect from stuff like space debris and things, especially if you’re going very fast.

Ajeya Cotra: And you might, with these smaller spaceships, be able to send redundantly many, many of them, so that if some of them get destroyed, it’s okay with a fraction of the material you would have spent on the big spaceship. Stuff like that.

Robert Wiblin: I mean, this isn’t something I’m an expert in, but I love to speculate about it. I guess my…

Ajeya Cotra: I don’t consider myself an expert either. I wrote down things that people said, I read some papers.

Robert Wiblin: Yeah. I guess, in that spirit, my impression is that I haven’t heard a good reason to think that in the fullness of time, it wouldn’t be possible for humans to settle Mars, and make Mars habitable. And potentially some other places within the solar system, places where actual flesh-and-blood humans could live and continue to procreate and be self-sustaining.

Robert Wiblin: But yeah, once you’re talking about going to other stars and finding planets there, it gets a lot more dicey whether it’s possible. I guess, just because humans are really not designed for space travel, that is not what we evolved to be capable of doing. We need lots of space and lots of resources.

Robert Wiblin: Yeah, so if you can get all of those materials into some big spaceship, now you’ve got to go a very long way. And the amount of energy required to move something that would be such a big ship, that would be large enough to have a self-sustaining group of humans for, I guess, thousands of years…that’s a lot of material.

Robert Wiblin: And then you’ve got this trade-off between… You could try to go really fast. That’s very energy-requiring. And you also have this issue that you run into dust on the way, and dust would eventually pelt and potentially break down the ship. It’s actually a very big issue then. Because you’d want to go as close to…

Ajeya Cotra: It’s a huge issue.

Robert Wiblin: …you’d want to go as close to light speed as possible. That’s hard enough in the first place. And that obviously makes it a bit easier, because you won’t be there for as long, and you don’t have to keep the ship turning over so many generations of humans on the trip. And I guess also time slows down when you go especially fast.

Robert Wiblin: But then the ship just gets pelted and disintegrates because of dust. Which is one reason that people thought, well, if you want to go to other stars, what you want to do is send 1,000 tiny little ships, because they have a chance of, just by good fortune, not hitting dust in the intervening space, even though they’re going incredibly fast and dust would blow them up.

Robert Wiblin: So you not only have to have enough material that you’d have a huge ship where you can have a self-sustaining population of I guess 1,000 people — because otherwise they’ll become inbred and non-functional — you also then have to have all the resources to terraform a planet once you get there and make it viable for humans. And it could be completely different kinds of planets. So, it seems hard.

Ajeya Cotra: I mean, I think the thing that clinches it for me, in terms of why I really don’t want the longtermist case resting on biological colonisation, is that I’m really not sure the economics of it would work out. It seems like colonisation with biological humans would require much more motivation on the part of the home planet to make it happen than the smaller space probe colonisation, or colonisation with computers, that could be done with one motivated company potentially, once we have the technology.

Ajeya Cotra: So I think that’s also part of it for me. Not only does it seem technologically dicier, but partly because of that, I’m also not sure that I can really tell someone, “Hey, this is probably going to happen,” if this is the only way it can happen.

Robert Wiblin: So your point there is that settling of a star system is not like Europeans going to the Americas. No one’s bringing back any silver or gold anytime soon, on an economically plausible timescale. So you have to want to do it just for its own sake. And then who’s going to fund this if it’s costing several years of global GDP to do it?

Ajeya Cotra: Yeah. I mean, we might mine asteroids and stuff. But in terms of actually diversifying off of Earth, and getting a big dose of the more robust future/permanently reduced x-risk part of the story, then I think you’d have to be much more motivated as a civilisation if it had to be biological versus if it could be computers.

Robert Wiblin: Yeah. Okay, tell us about the computer future. Can we send self-replicating computers to other star systems? Is that likely?

Ajeya Cotra: Yeah. I mean, there’s not a lot of literature on this and not a lot of people who’ve thought about it.

Robert Wiblin: I’m shocked to hear that.

Ajeya Cotra: I love this paper out of FHI called Eternity in Six Hours. It’s just very fun, sci-fi almost.

Ajeya Cotra: I certainly think that people in the effective altruist community who have thought about this seem quite bullish about it — like about small ships that are traveling meaningful fractions of the speed of light, and have these onboard computers. And these computers are able to land on planets and do what they need to do to build more computers on that planet, and build more ships and send them out.

Ajeya Cotra: And it’s not like I found any particular devastating counterargument to that or anything. I think I am more uncertain than the people who are most into this, like the authors of the Eternity in Six Hours paper.

Ajeya Cotra: But it seems to me like there’s a broad range of technological sophistication levels that (once you assume the ability to upload humans into computers) feel like the big interesting uncertainty. After that point, it feels like you don’t need massively more sophisticated technology than we have now to get the lower end of the ability to colonise other planets, other stars.

Ajeya Cotra: To get the shock wave expanding over the entire observable universe, you have to assume these more intense technological capabilities. Which seems possible, but I wouldn’t blame you if you were more sceptical of that. But I think the lower end doesn’t involve a ton of innovation. And that was something that I learned from this project.

Robert Wiblin: That’s interesting. You’re saying that we’re within firing distance of potentially being able to send computers to other stars, if we were really willing to do what it took.

Ajeya Cotra: Yeah.

Robert Wiblin: But then, would they be able to do what it takes? I guess I’ve heard that, rather than go to planets, which is kind of hard, they would probably want to go to asteroids and then grab resources from the asteroids and turn those into copies of themselves?

Ajeya Cotra: Yeah.

Robert Wiblin: Yeah, I guess that’s the thing that I’m most unsure about. It feels to me like the kind of technological problem that, if humanity had thousands of years, it could figure out. But it seems like it’s a slightly heavy lift, given the fact that it’s a bit hard to send that much.

Robert Wiblin: It’s hard to send a full industrial base to another star system, because there’s so much material and so much dust in the way. So it’s like, can you squeeze through just enough that it can get off the ground at the other end?

Ajeya Cotra: One update I made from this is that — and this was a long time ago, people since have looked into space colonisation at Open Phil, and I haven’t dug into their work on it — but my impression is that most of the technological uncertainty is on the software side of things, versus the spacecraft side of things or the energy side of things. For what I’m calling the ‘conservative astronomical waste story’, where the spacecraft don’t actually need to be that fast or that tiny.

Ajeya Cotra: I think that the big question is, can we create these computers on moderate amounts of computing hardware that could fit on, I don’t know, a golf-ball-sized craft, or maybe a soccer-ball-sized craft? Can you basically embed artificial intelligence on there and robotics that’s flexible enough to be able to make do with what it happens to find on another surface?

Ajeya Cotra: And that feels like more of the uncertainty to me than the question of can we make some kind of spacecraft work to colonise some stars. Not the big colonise-the-universe thing, but still enough to dramatically reduce existential risk, because you’re spread out more.

Robert Wiblin: Okay. Let’s wrap up on this empirical bit. I guess the bottom line is, having looked into this you thought, if you’re willing to buy that sending artificial intelligence or uploaded humans in some form to other star systems to do their thing would be valuable, the possibility that we can’t get to other star systems doesn’t reduce the value of the far future or the long-term future that much. Because it’s 50/50 likely that we can do it, and so that only halves the value or something.

Robert Wiblin: So it’s not going to be the big reduction factor that you would need to say, “Oh no, actually, the future’s not that big in expectation”.

Ajeya Cotra: Yeah.
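The arithmetic behind "that only halves the value" can be made concrete. This minimal sketch combines two figures from the conversation: the 50/50 chance mentioned just above, and the earlier statement that the longtermist worldview takes the future to be 10–30 orders of magnitude larger than the present. The function name `orders_of_magnitude_lost` is made up for the example.

```python
import math

def orders_of_magnitude_lost(probability):
    """How many powers of ten an expected value shrinks by when multiplied by `probability`."""
    return -math.log10(probability)

# A 50/50 chance that interstellar settlement is feasible
lost = orders_of_magnitude_lost(0.5)

# Compare against the 10-30 orders of magnitude range quoted in the interview
for exponent in (10, 30):
    print(f"10^{exponent} discounted by 50% is about 10^{exponent - lost:.1f}")
```

A factor of one half removes roughly 0.3 of an order of magnitude, which is small next to a gap of 10 to 30 orders. That is why feasibility doubts at this level do not act as the big reduction factor.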

The doomsday argument [01:09:37]

Robert Wiblin: But there’s other arguments that potentially could do more heavy lifting. And you mentioned them earlier, the ‘doomsday argument’ and the ‘simulation argument’. Maybe just lay out the doomsday argument, and, I suppose, how persuasive you find that line of reasoning? How much can it do to shrink the expected size of future life?

Ajeya Cotra: Yeah. So the doomsday argument is basically that if you find yourself on apparently the cusp of the ability to colonise space according to the previous research, you should be very surprised if there seems to be a very long future ahead of you as a civilisation, and you find yourself at the very earliest bit.

Ajeya Cotra: So in other words, let’s say God flips a coin at the beginning of the universe. And He either makes 10 boxes labeled one through 10, each of which has a human in it, or He makes 10 billion boxes labeled one through 10 billion, each of which has a human in it.

Ajeya Cotra: After He does this, you wake up in a box, and you walk outside to see that your box is labeled ‘Three’. So the intuition that it’s trying to elicit is “Oh, probably we landed on the 10 boxes side instead of the 10 billion boxes side”. Because if it were the 10 billion boxes, I should have seen e.g. 7 billion as my number, rather than three.

Ajeya Cotra: The argument is, that’s what you should believe in the case of the boxes. And the question of whether our world has a big, bright future or will be wiped out quite early in its history is like God flipping that coin and creating either 10 boxes or 10 billion boxes.

Ajeya Cotra: So finding yourself early in history should make you think there’s actually a lot more existential risk that’s a lot more intractable than you thought. And humanity isn’t going to have a long future, and there’s not much you can do about it.

Robert Wiblin: Yeah. So I guess depending on how far back you think humans go…which I guess is a bit of a messy question, because it was just a continuous gradual evolution into being the humans that we are now… But maybe there’s been like 100 billion humans ever at any point, and so we think we’re at about 100 billion.

Robert Wiblin: And the question is, if there’s going to be 1 trillion humans, then it’s not that surprising that we would find ourselves in the first 100 billion. But if there’s going to be 1,000 trillion, then it’s starting to look as though it’s a bit odd, it’s a bit like drawing the box labeled ‘three’.

Robert Wiblin: So maybe that’s a reason to think that there won’t be that many people in the future. Because if there are going to be so many, then what a coincidence that we should find ourselves at this incredibly early stage.
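To put rough numbers on the "coincidence" being described, here is a small sketch using the figures from this exchange: about 100 billion humans so far, and two candidate totals of 1 trillion and 1,000 trillion. It assumes, as the doomsday argument does, that you treat yourself as a random draw from everyone who ever lives. The variable names are illustrative.

```python
from fractions import Fraction

born_so_far = 100 * 10**9  # roughly 100 billion humans to date, per the conversation

# Two candidate totals for how many humans will ever exist
totals = {"1 trillion": 10**12, "1,000 trillion": 10**15}

# Chance that a randomly drawn human falls among the first 100 billion
chance_early = {label: Fraction(born_so_far, total) for label, total in totals.items()}

# How much more strongly our early position favours the smaller total
ratio = chance_early["1 trillion"] / chance_early["1,000 trillion"]

for label, chance in chance_early.items():
    print(f"total of {label}: chance of being this early = {float(chance)}")
print(f"likelihood ratio in favour of the smaller total: {ratio}")
```

On this way of counting, being among the first 100 billion is a one-in-ten event if there will be 1 trillion people, and a one-in-ten-thousand event if there will be 1,000 trillion.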

Robert Wiblin: Is this sound reasoning? What do philosophers make of this? Because it feels like it’s proving too much just by… I mean, you haven’t looked at almost anything, and you’re managing to prove that we’ll destroy ourselves on the basis of pure theory.

Ajeya Cotra: Yeah. I mean, I’m definitely suspicious of things that have strong conclusions about what kind of world we’re living in from pure philosophy. But I actually think both sides of this debate end up having something like that.

Ajeya Cotra: So this doomsday argument relies on what’s called the ‘self-sampling assumption’. These are extremely confusingly named. But basically, the doomsday argument gets its weight from the assumption that before you look at your box, you should have been 50/50 on whether God flipped the ‘small world’ coin or the ‘big world’ coin.

Ajeya Cotra: And because before you looked you were 50/50, then you make a massive update towards the ‘small world’, because you’re sort of an early number. And that intuition is coming from, well, God flipped a coin. So it’s 50/50.

Ajeya Cotra: The other perspective you could take on this is that before you look at your number, there should be a massive update already in favor of being in the ‘big world’, just because you exist. There are more people existing and having experiences in the world with 10 billion boxes than in the world with 10 boxes. So then when you look, you are back to 50/50.

Ajeya Cotra: That’s roughly the two assumptions. ‘Self-sampling’ is the first one, and ‘self-indication’ is the second one. But I never remember that.
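
The two views Ajeya contrasts can be put side by side in a short calculation. This is a minimal sketch under stated assumptions: within a world, every box number is taken to be equally likely to be yours, and the second view is modelled by weighting each world by the number of people in it before looking at the label. The function name and the `weight_by_observers` flag are hypothetical.

```python
from fractions import Fraction

def posterior_small_world(n_small, n_big, label, weight_by_observers):
    """Probability of the small world after seeing box number `label`.

    weight_by_observers=False: start at 50/50 from the coin flip (first view).
    weight_by_observers=True: first weight each world by how many people it
    contains (second view), then update on the label.
    """
    prior_small = prior_big = Fraction(1, 2)
    if weight_by_observers:
        prior_small *= n_small
        prior_big *= n_big
    # Within a world, each box number is assumed equally likely to be yours.
    like_small = Fraction(1, n_small) if label <= n_small else Fraction(0)
    like_big = Fraction(1, n_big) if label <= n_big else Fraction(0)
    joint_small = prior_small * like_small
    joint_big = prior_big * like_big
    return joint_small / (joint_small + joint_big)

first_view = posterior_small_world(10, 10**10, 3, weight_by_observers=False)
second_view = posterior_small_world(10, 10**10, 3, weight_by_observers=True)
print(first_view)   # 1000000000/1000000001
print(second_view)  # 1/2
```

On the first view, seeing '3' makes the small world about a billion times more likely than the big one. On the second view, the billion-fold boost the big world gets from containing more people exactly cancels that update, leaving 50/50.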

Robert Wiblin: Yeah. Okay, so basically the idea is that before you get out of your box, if it’s the second world, there’s 10 billion people?

Ajeya Cotra: 10 billion boxes. Yeah.

Robert Wiblin: Yeah, okay. So there’s that many boxes, and so you’re like, “Well, there’s way more people in this world. So I’m far more likely to be there”. And I guess it’s doing something like not taking for granted that you would exist. It’s imagining 10 billion people versus 10 people and 9,999,999,990 empty slots, where there is no person.

Robert Wiblin: I guess it fits my intuitions that I would think I would be more likely to be in the world where there’s more people. What do you make of that?

Ajeya Cotra: So, I think that is pretty reasonable, but it also leads to this kind of presumptuous… The thing you were saying about how you come up with this really confident theory about what kind of world you’re living in based on pure philosophy applies to this approach, but in the other direction.

Ajeya Cotra: So it’s less weird because you know about God flipping the coin. And you kind of want to end up at 50/50 after looking at your box in this thought experiment. But the thing where you massively update in favor of being in the 10 billion boxes world over the 10 boxes world, when you take it to the real world, basically implies that you don’t need to look at any physics to know that our world is spatially infinite.

Ajeya Cotra: You just know that, because you exist. Since there was some chance that the world is spatially infinite, there’s an infinite update in favor of being in the world with infinite people. And so you’re just like, “I don’t care what the physicists say. And I basically don’t care how much evidence they find that the universe is finite”.

Robert Wiblin: I see. So you can either be presumptuous in the first case, or you accept this other presumption that’s like, I can deduce from pure theory that the universe must be enormous, indeed infinitely large, because that’s a world with so many more people in it, and so I’m far more likely to be in that one. And so you’ve got to bite one of these bullets, or accept one of these unpleasant conclusions.

Ajeya Cotra: You could even take it further, because this thing about being a person is kind of under-defined, right? So you could be like, “Actually I know with extreme confidence that there’s infinite Rob Wiblins experiencing this exact thing, because that was physically possible”.

Ajeya Cotra: “And I have an infinity-to-one update that the world is just tiled with Rob Wiblins having Skype conversations with Ajeya right now. Because I would be most likely to be experiencing what I’m experiencing in that world”.

Robert Wiblin: Does that end up being just the same as saying that the universe is infinite? I suppose it has to be infinite and not obviously repeating just some identical pattern that doesn’t include me in it.

Ajeya Cotra: Well, but there’s infinite universes with different densities of Rob, right? Some of them have more Rob, they happen to have more Rob, some of them. Like the physics is arranged such that Rob is a really common pattern.

Robert Wiblin: So I look out at the night sky and I see all these stars, but really I should be very overwhelmingly confident that that’s an illusion. And in fact, all of that space is actually full of me having conversations with you.

Ajeya Cotra: Yeah.

Robert Wiblin: I see. That does seem counterintuitive. I was going to say that the presumptuous philosopher who thinks that the universe must be infinite… I guess we’ve got some semi-confirmation of that by looking out at a universe that seems like it could be infinite, or at least we don’t have strong reason to think that it is finite, based on the evidence that we have. So maybe they might get a pass on that.

Robert Wiblin: But yeah, the idea that it’s absolutely tiled very densely with us having this conversation is a far from appealing conclusion. So where do you fall on this?

Ajeya Cotra: I am more on the side of the presumption towards big. So I’m more on the side where when you wake up you think that you’re in the 10 billion boxes world, and then you update back to 50/50 or whatever. But I do think that’s because I kind of want to end up with a normal conclusion, and I don’t love the thing that I just said about this solipsist conclusion, basically.

The simulation argument [01:16:58]

Ajeya Cotra: So that leads into the second weird argument about how the future might be small, or why the ratio between the future and the present might be smallish, which is the simulation argument.

Robert Wiblin: Right, so I guess we have a bunch of things that all fit together a bit here. So we’ve got this idea that, oh, we could influence a whole lot of people. But then we’re like, it doesn’t feel right. It feels like this has to be overconfident somehow.

Robert Wiblin: And then the doomsday argument is one reason to think that, well, the future can’t be big. And then it was like, well, how would that be? And then I guess the simulation argument potentially offers an explanation for how that actually would fit with our observations.

Ajeya Cotra: Yeah. So the doomsday argument assumes that there’s really high, unavoidable x-risk or something. And that’s why the future is small. But the simulation argument takes it in another direction.

Ajeya Cotra: The simulation argument says: Grant that there’s a big future, with all these computations running, and a large number of flourishing humans in space, running on computers. In that world, then, some small fraction of their resources might be spent simulating worlds like ours — namely worlds where humans are on one planet, and they seem to be maybe on the cusp of colonising space in this way, in the next several decades.

Ajeya Cotra: Then in that case, if such simulations are even a pretty small fraction of the resources of this presumed giant future, like one in 1 million of the resources, or one in 100,000, then almost all of the people having experiences like ours are in simulations rather than in ‘the real world’.

Ajeya Cotra: And the ratio of the value of what we think of as the future and what we think of as the present is basically bounded by one over the fraction of the resources in the future spent on simulation. So if they spent 0.1% of their resources on simulations, then the ratio of the value of the future to the value of the present is at most 1,000.
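
The bound Ajeya states is simple enough to tabulate. The sketch below only restates her formula (one over the fraction of future resources spent on simulations) for the fractions mentioned in the conversation. The function name is hypothetical, and exact fractions are used only to avoid floating-point noise.

```python
from fractions import Fraction

def max_future_to_present_ratio(simulation_fraction):
    """Upper bound on (value of future) / (value of present) as stated in the
    interview: one over the fraction of future resources spent on simulations."""
    if simulation_fraction <= 0:
        raise ValueError("the bound only applies when some resources go to simulations")
    return 1 / Fraction(simulation_fraction)

for label, fraction in [("0.1%", Fraction(1, 1000)),
                        ("one in 100,000", Fraction(1, 100_000)),
                        ("one in 1 million", Fraction(1, 1_000_000))]:
    print(label, "->", max_future_to_present_ratio(fraction))
```

So even a one-in-a-million share of resources going to simulations caps the ratio at a million, far below figures like 10 to the 40 that come up later in the conversation.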

Robert Wiblin: Hmm. Okay. So you could have different arguments in favor of the simulation argument. So one would just be, say, you think it’s very intuitively plausible that we are going to go out and settle space and capture all this energy and have lots of very fast computers.

Robert Wiblin: And if we did that, it would be very plausible that we would simulate a time just like this very frequently, so that most of the people in a situation like the one in which we find ourselves are in simulations rather than the original — I guess people call it the ‘basement universe’. The one that’s originally not simulated.

Robert Wiblin: Another angle would be to say, “It’s so suspicious that we look up at the sky and there’s so many stars, so much space and matter that is being put to no use, and it seems like we could just go and use it. That would have the extreme implication that we’re at this very special time, and we have this potential to have enormous impact over lots of other beings. That can’t be right. So I want a debunking explanation that makes the world seem more sensible”.

Robert Wiblin: And that explanation is going to be that, well, for whatever reason, even if I didn’t think it was super plausible that future people would want to run a simulation of something as boring as this podcast, I’m going to think that anyway. Or I independently have a reason to think that they are.

Ajeya Cotra: Yeah, that seems right. I think people have gotten there through both forks. I think I’m a little bit more on the side of seeking a debunking explanation for what seems like this enormous amount of value lying on the table.

Ajeya Cotra: And I also have some of that, for myself, with the Fermi paradox. ‘Why aren’t there aliens’ would also be something that either the doomsday argument or the simulation argument could debunk.

Robert Wiblin: I see. Yeah, that makes sense. Okay, so you’re saying that if it’s so surprising that the universe seems barren of other life, even though we’re here, the doomsday argument would say, “Well, we’re not going to be here for long, and neither was anyone else”. And the simulation argument would say, “Oh, it’s because we’re in the Truman Show and it’s not a real sky. It’s a make-believe sky that they’ve put up there to entertain us”.

Robert Wiblin: Okay. I think some people get off the boat with this a little bit because people start explaining why it is that future super-civilisations — potentially harvesting the energy from suns — would want to run a simulation of what we’re doing. And that starts to sound pretty kooky. Did you spend very much time thinking about the different rationales that people provide for why we would be here?

Ajeya Cotra: Yeah. I mean, so it kind of comes back to fanaticism a little bit, and how probabilities are weird, and Pascal’s muggings are weird. Even if you think it’s really unlikely, are you going to think it’s like a one in 10 to the 40 chance, if you previously thought there were 10 to the 40 persons in the big long-term future?

Ajeya Cotra: Because if you’re not willing to go there, if you’re going to say it’s a one in 10 million chance, you’ve brought the ratio of the future to the present down from 10 to the 40 to 10 million, right? So I think it’s hard.

Ajeya Cotra: I kind of share your skepticism of, why would people be simulating us? And I kind of, in this project, went down a long rabbit hole about why that might be. But I think the ultimate thing is that we came up with this argument. We can imagine ourselves simulating our past, for whatever reason. We can imagine one in 1 billion of us being crazy historical replicationists or whatever, like Civil War reenactments. It just takes like a pretty small fraction.

Ajeya Cotra: It’s quite hard to say that my probability of non-trivial amounts of simulation is anywhere in the range of one over the size of the future. And that’s what it takes for the argument not to bite.

Robert Wiblin: I see. Okay, so some people might just get off the boat and say, “I just don’t want to engage in this sort of reasoning at all. This is too much for me and so I’m not going to use this style of reasoning”. But I guess we’re not those people. So we have to answer the questions, like how much does this deflate the expected size of the future? We have to do something about, well, how likely is this kind of reasoning to be right?

Robert Wiblin: Is there some other very similar-ish argument that we haven’t yet thought of that would also demonstrate that we’re in a simulation for some reason, or a reason that there would be lots of people in the situation that we’re in, that we haven’t yet thought of? Or maybe there’s a reason why they wouldn’t do it, and so there won’t be so many? But then it seems like your entire worldview hinges on this wild speculation about this universe that you can’t see and beings that you don’t know. And how many of you they would want.

Ajeya Cotra: Yeah. I mean, it dampens the ratio of what you believe to be the long-term future to what you believe to be the present by a lot. But it also implies that most of the impacts of all of your actions, longtermist and near-termist, are to do with how you change the resource intensity of the simulation you’re in, and what they would have used the resources for otherwise and stuff.

Ajeya Cotra: So, it’s kind of like the astronomical waste argument, in the sense that it re-frames all of your random actions as only mattering in that they impact the probability of getting to the long-term future. But it just replaces the probability of getting to the long-term future with whatever effects you have on the outside universe.

Robert Wiblin: Hmm. Okay. So how does this actually affect your estimates of what we ought to do? Is there any way of summarising that? Or is this an area where you just kind of threw up your hands a bit?

Robert Wiblin: I guess my take on this has been, it should affect it somewhat, but I don’t really understand exactly how. And it doesn’t seem like it’s such a strong argument that I’m going to stop doing what I was going to do before I encountered this argument.

Ajeya Cotra: I’m basically in a very similar place. I think I was at my peak astronomical waste fanaticism before doing this project. And then after doing this project, I was like, “Well, I’ve kind of discovered my limits in terms of where I get off the train to crazy town”.

Ajeya Cotra: I am going to be living at the stop where I take astronomical waste very seriously, I take the idea of us being in a critical period of existential risk very seriously, I’m probably going to spend most of my energies working on that. But I realise that there’s a lot more out there in terms of, if you really have the goal to take philosophy as far as it goes, there’s more stops past me. And I don’t know where they lead.

Ajeya Cotra: Infinite ethics is another good one here, in terms of finding a non-broken version of total utilitarianism that works in an infinitely-sized world that avoids these paradoxes. I don’t know if that’s possible.

Ajeya Cotra: So I think I was kind of humbled by it. I basically gave up on this project because I was like, I don’t want to get this to a publishable state and put in all the work it would take to go down the rabbit hole enough to come to particular conclusions on its basis.

Ajeya Cotra: But I do have more empathy for people who had the intuition that the astronomical waste argument is really weird. I sort of also had that intuition. And then I was like, “I’m being silly. It’s a really strong argument. I’m just being scope insensitive or something”.

Ajeya Cotra: But actually one lesson here is just… Assuming a big world allows for a lot of crazy things to happen, and allows for a lot of sort of trippy questions to be raised and stuff. And so it is, in fact, a very weird and bizarre argument that opens the door to a lot of other weird and bizarre arguments.

Robert Wiblin: Yeah. I suppose I don’t think of these arguments as crazy. I don’t want to be like, “This is crazy”. I feel like I maybe stopped thinking about it, not because it’s too weird for me, because I’m into plenty of weird stuff. But more just, I couldn’t really see how me analysing this was going to shift my behavior. Or how I was getting traction on this from a research point of view, or from a practical point of view.

Robert Wiblin: Because one difference I feel is… The astronomical waste argument, or the idea that, oh, the long term matters a lot because it could be very big, or it’s people having really good lives… I feel like that cashes out in something that I can understand about how it might affect what I want to do.

Robert Wiblin: Whereas the simulation argument — and I guess to some degree, the anthropics-related uncertainties around the doomsday argument — they just feel so slippery. And I’m just like, “I don’t know where this leads, and I’m not sure where I would even begin walking”.

Robert Wiblin: And so it’s very tempting, I guess, for me — and I suppose many other people have probably done this as well — to encounter these arguments and just be like, “I kind of give up, and I’m going to stop where the tractability ended for me”.

Robert Wiblin: But I guess I would love it if someone else much smarter than me could figure out what this actually does imply for me in my life. And then maybe I would take that seriously.

Ajeya Cotra: I would definitely be interested in funding people who want to think about this. I think it is really deeply neglected. It might be the most neglected global prioritisation question relative to its importance. There’s at least two people thinking about it on a timeline. So zero people, basically. Except for Paul in his spare time, I guess.

Robert Wiblin: Yeah. Paul Christiano, that is.

Ajeya Cotra: I think it could have implications… Yeah, Paul Christiano.

Ajeya Cotra: I think it could end up having implications for how we think about AI, and how worried we are about misaligned AI and stuff. I don’t know exactly.

Ajeya Cotra: If you imagine we’re being simulated for some purpose by an outside universe, then do we want to align artificial intelligence with our goals? Or are we mostly trying to think about why we were simulated, and use AI to help us figure out how to give the outside world whatever it is they wanted? Or should we even be cooperative with the outside world?

Ajeya Cotra: And if we should be cooperative with the outside world, then does that meaningfully change how upset we are about misaligned AI? It might be misaligned with us, but we want to think about what it means for it to be aligned with where all of the value actually lies, which is the outside basement universe, or whatever.

Ajeya Cotra: I think it could have implications. I’m not the person to think about them, but I would be very excited for other people to think about them.

Robert Wiblin: Yeah. I guess another line of argument that people have made is, well, let’s assume that we are in a simulation for the sake of a hypothetical. That means that there must be some reason why they’ve decided to run this simulation. And it means this world must be kind of interesting in some way. People write really funny tweets, and I guess Wikipedia’s good, and Netflix produces some good shows…but that seems like it’s not quite enough for them.

Robert Wiblin: So, what would be the reason? And then we’re like, “Well, maybe we should go look around. What actually would be of interest?” And then I suppose people would say, “Well, maybe we’re at the cusp of some really important moment in history. Something that really would be of interest to future generations”. And they’d be like, “Well, it could be a massive war or it could be development of new technology that’s super revolutionary”, of which I guess AI and some biotech stuff might be on the list.

Robert Wiblin: It seems like you’d be more likely to simulate something that was really historically epically important. And so that’s a reason to expect, if we are in a simulation, for it to be more interesting in some way. And then maybe it can be like, “Well, what does that imply?” And I don’t know.

Robert Wiblin: Alright. So I guess we both find this interesting and slightly exasperating. I would be very happy for someone else to write papers that then would get us off the hook for thinking about it anymore.

Ajeya Cotra: Yes. I’m very excited about people… I think it’s rare to find someone who has both the capacity and the stamina or patience for this kind of thinking, so I think it’s quite neglected and could be really high impact if you find yourself excited about it.

Robert Wiblin: Yeah. Alright. Well, we’ll stick up links to the simulation argument paper, and I guess a few blog posts that flesh out possible consequences of possible things that we might infer from it, for people who want to go and explore that.

AI timelines report [01:29:24]

Robert Wiblin: Let’s talk now about another whole area of research that you’ve been working on over the last two years, which has been this report called Forecasting Transformative AI With Biological Anchors. What was the goal of that project, and why did it matter?

Ajeya Cotra: Yeah, so this project came about because potential risks from advanced AI is a major Open Phil focus area. And one of the important considerations feeding into how much we should be allocating to risk from AI versus other longtermist goals — and also potentially how much we should be allocating to longtermism as a whole versus near-termism — is how urgent the problem of AI risk is, and how soon is it on the horizon, and how much can we anticipate now, and do things that we can expect will affect it without being washed out by a whole bunch of stuff that happens in between. So how soon really powerful AI systems are going to be developed is an important strategic question within the longtermist worldview, and sort of indirectly like an important question about how urgent the longtermist worldview as a whole is, and therefore how much weight it should get versus the near-termist worldview.

Robert Wiblin: Okay. So the goal is to try to figure out what’s the likelihood of us having transformative AI by different dates, I guess to figure out how urgent it is to try to make sure that that goes well? Because if it’s not going to come for another 100 years, then we can potentially punt that to another generation or another philanthropic organisation to figure out in the future?

Ajeya Cotra: Yeah.

Robert Wiblin: And what was the approach that you took, and I guess how did it evolve over time?

Ajeya Cotra: Yeah, so some quick background on this project on Open Phil’s side: In 2016, Holden wrote a blog post saying that, based on discussions with technical advisors (AI experts who are also within the EA community and used to thinking about things from an EA perspective), he felt that it was reasonable to expect at least a 10% probability of transformative AI within 20 years. That was in 2016, so that would have been 2036. And that was a kind of important plank in the case for making potential risks from advanced AI not only a focus area, but also a focus area which got a particular amount of attention from senior generalist staff. So there are a number of people thinking about aspects of AI at Open Phil.

Ajeya Cotra: That was roughly around when we opened up a focus area, and decided to make it a particular focus of senior staff, and we’ve been in that area for a few years now. And then in 2018/early 2019, we were in the middle of this question: we’re hoping to expand to peak giving, consistent with Cari and Dustin’s goals to give away their fortune within their lifetime, and we want to know which broad worldviews, and also which focus areas within the worldviews, would be seeing most of that expansion. And so then the question became more live again, and more something we wanted to really nail down, as opposed to kind of relying a bit more on deference and the earlier conversations Holden had.

Ajeya Cotra: And so digging into AI timelines felt like basically the most urgent question on a list of empirical questions that could impact where the budget went…

Robert Wiblin: …when and how much.

Ajeya Cotra: Because… Yeah, and potentially also how we should broadly strategise about what we do with the money within the AI focus area. Because it matters what tactics we take. What tactics pay off over what time scales matters for what we’d prioritise within that area.

Robert Wiblin: Alright. If I was a better person, we would eat our greens before we had our dessert, and we would walk methodically through all of the methodologies you’ve used — and the various pros and cons — before we got to any conclusions. But I’m not a patient person, so it’s not going to take forever to get there. What were your bottom line conclusions about timelines, in brief? Did you end up thinking transformative AI may come sooner or later, or maybe even that it’s just like maybe not even possible?

Ajeya Cotra: Yeah. I think the methodology I used is a little bit more robust for medians rather than either tail. So my median, depending on how I’m feeling on a particular day, ranges anywhere between 2050 and 2060, in terms of, “I have this model and there are some parameters I’m particularly angsty about”. So that’s between 30 and 40 years from now. And I think that’s a quite extreme and stressful and scary conclusion, because I’m forecasting a date by which the world has been transformed. I’m imagining a lot is happening between now and then, a lot of wild stuff in 10 years, a lot of wild stuff in 20 years, if the median date is 35 years out for fully transformative AI.

Robert Wiblin: And I guess it could also come sooner or it could also come later. So there’s uncertainty, which I guess also might make you nervous.

Ajeya Cotra: Yeah.

Robert Wiblin: Is that sooner or later than what you thought before you set out on this project?

Ajeya Cotra: The probability by 2036, where Holden had originally said at least 10%, I’m bouncing between 12% and 15%. So it’s definitely consistent with the at least 10% claim. With that said, the at least 10% claim was trying to shade conservative, and I think the best guesses of a number of people at Open Phil were higher than the 12–15% that I landed on. And my own best guess was more like, “Oh maybe 20%, maybe 25%”. And so I think for me, it was sort of numerically an update toward longer timelines, but it also made it seem more real to me and made it seem like there was going to be a lot of stuff in between now and transformative AI. So emotionally, I’m maybe a little bit more freaked out.

Robert Wiblin: Interesting. Okay. So your timelines moved out, so you thought maybe it’ll take a bit longer, but then it felt more concrete and less speculative, and like a real thing that could really happen, and that made you say like, “Oh wow, this actually really matters”.

Ajeya Cotra: Yeah. And it feels more like there are quite high probabilities of some amount of powerful AI that could have unpredictable consequences, if not fully transformative or human-level AI. So there’s something where I think I was previously thinking of AI as kind of like biosecurity, which is like, we’re doing preparations now for a sudden event that might or might not happen that could change the world in this very discrete way. And that was one flavor of scary. But now I’m thinking of AI much more viscerally, as this onrushing tide. AI is getting better and better. You can certainly do some stuff with AI, maybe it’ll take us all the way to transformative AI, but it’s more relentless and more changing what the world looks like on the way there. And so that’s a different flavor of scary.

Robert Wiblin: Yeah. So I guess you said that this is in a sense an aggressive forecast, but from memory, the big survey or the forecasting survey that they did of ML experts or AI research scientists a couple of years ago, I mean, it had people all the way from 2025 to 2100, I guess, and a few outliers beyond 2100. And so this is pretty consistent with that. It’s just like, well, sometime in the next 100 years, and maybe 30 or 40 years sounds like about the median estimate. So it’s not out of line, I guess, with what some other people have said, although you thought about it a lot more.

Ajeya Cotra: It’s not wildly out of line with one interpretation of that survey… So that survey asks several different questions. In my mind, the headline result of that survey is that the researchers were quite inconsistent with themselves, in that asking when AI can do all of the tasks a human can do leads to sooner timelines reported than asking when AI will be able to do AI research. There were many particular tasks for which the timelines were substantially longer than the timeline for all tasks. And so it depends on how you think the researchers would net out if forced to reflect. Like, I believe the median for the all tasks thing was like, maybe it was 2060 or something, like it was 50 years out or it was like a little bit longer than what I’m saying. But you’re right that it’s not a lot longer. But then the medians for some of the other ones were more like 80 years for particular tasks that the researchers’ survey implied would take longer than just the all tasks question.

Robert Wiblin: Yeah. Although your threshold here is transformative AI, which I guess is like a substantially bigger shift than just AI being able to do the tasks that humans can do. Or am I misunderstanding, is this a higher or lower threshold?

Ajeya Cotra: There’s a question of what will happen to the world if we have AI that can do most of the tasks humans can do. And we believe it’ll be wild, and it will lead to the world moving much, much faster than it’s currently moving, and will lead to future technologies being discovered at a much more rapid pace — in part because AIs can be made to run faster in subjective time than humans, and in part because they have other advantages, like they can be arranged to have really perfect memory, no sleep, all that stuff. And so one disconnect is that I don’t think that most of the people who answered that survey are imagining the consequences of AI that can solve most human tasks to be as radical as what we’re imagining. So what we define as transformative AI is like… In Holden’s blog post, he defined it as AI that has at least as profound an impact as the Industrial Revolution.

Ajeya Cotra: And then in my report, I have a more quantified operationalisation that’s roughly: AI that is the primary driver of the world economy growing ten times faster than it is in 2020. So in 2020, the world economy is growing at 2–3% per year, and that’s roughly 25–30 years to double the economy. And so if you imagine growing 10 times faster, that’s like two to three years to double the economy. And the reason we’re choosing that threshold (it’s an arbitrary threshold) is that once we’re at that point, there are probably further speed-ups in the future. And basically history is compressed when growth is really fast. And so once the world is doubling every two to three years, human plans and human timescale actions become dramatically less relevant to the world, and that’s the threshold we’re looking to forecast.
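
The doubling times quoted here can be checked with the standard compound-growth formula. This is a rough sketch assuming constant annual compounding, which is a simplification and not a method taken from the report. The function name `doubling_time_years` is hypothetical.

```python
import math

def doubling_time_years(annual_growth_rate):
    """Years for an economy to double at a constant annual growth rate,
    assuming simple annual compounding: solve (1 + r)**t == 2 for t."""
    if annual_growth_rate <= 0:
        raise ValueError("doubling requires a positive growth rate")
    return math.log(2) / math.log(1 + annual_growth_rate)

# 2020-style growth of 2-3% per year, and the same rates multiplied by ten.
for rate in (0.02, 0.03, 0.20, 0.30):
    print(f"{rate:.0%} per year -> doubles in {doubling_time_years(rate):.1f} years")
```

Under that assumption, 2–3% growth corresponds to doubling roughly every 23 to 35 years, and 20–30% growth to doubling roughly every 2.6 to 3.8 years, so the spoken figures of 25–30 years and two to three years are in the right range as round numbers.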

Recent AI developments [01:39:28]

Robert Wiblin: So it feels to me as a casual observer that this field is moving really fast. I guess this year we’ve had GPT-3 and…

Ajeya Cotra: Yeah, AlphaFold.

Robert Wiblin: People are amazed at what they can do, and now we got AlphaFold, yeah. Did it feel a bit like by the time you finished this, that things have moved on a bit, and…

Ajeya Cotra: Yeah.

Robert Wiblin: Two years is so long in the AI world.

Ajeya Cotra: It definitely felt that way, and I definitely questioned whether this general approach of doing these detailed forecasts and careful investigations is sustainable or reasonable. I think I had a bit more of an exaggerated sense of that. As we were writing the report, I had preliminary conclusions in mind in early 2020, and I was working on really getting things right and really nailing things before publishing. And then in the meantime, GPT-3 was published. And I think GPT-3 really caused a major shift in a lot of academics’ views of timelines, or like gut-level views of timelines. Where before, a lot of academics were fairly dismissive of GPT-2, but I saw very few dismissive attitudes towards GPT-3. And I saw a lot of people who just straight up said on Twitter, “I thought AGI was at least 50 years away, but now I think it’s 10 years away”.

Ajeya Cotra: And so a lot of my report is framed as addressing a sceptical academic audience, and convincing them that it’s reasonable to expect substantial probability of crazy AI capabilities soon. And now, at least some chunk of those people — because of GPT-3 and other developments — they feel like it’s belaboring the point, and they want to argue with me from the other side. They now think the timelines are much shorter than I think they are. And I have counter-arguments to their views too, but the whole orientation was addressing an audience that shifted in between when I was writing it and when I got to publish it.

Robert Wiblin: Yeah. I guess it’s slightly satisfying, in that you have some confirmation that maybe you were right, if they’ve already changed their mind.

Ajeya Cotra: I wish I had gotten it out and had a lot of people arguing with me that it can never happen.

Robert Wiblin: Yeah.

Ajeya Cotra: But I think it’s not so much… I feel kind of salty about that or something, but it’s not a big deal. And I ultimately do think it’s not fast enough yet… At least, if you’re trying to be smart about it and trying to be efficient, I think there’s still plenty of room to do analysis like this.

Robert Wiblin: Yeah. How did this shift your attitude towards AI safety research? And I guess in the same vein, do you do anything differently in your life outside of work because of your expectation that AI may advance pretty quickly and really change the world while you’re still alive?

Ajeya Cotra: In terms of how it affects my vision for AI, I think coming into this I was a partisan to the ‘gradual take-off’ view — probably AI is going to be like other technologies in the sense that there’s going to be less powerful AI first, it’s going to proliferate, there’s going to be lots of different things you do with it before you get full-blown human-level AI. And so that was kind of my bias coming in, but I do think that doing this work made me feel more confident about it and feel more like I have some picture of how the gradual take-off would go. And part of it is not so much the particular research that I did in the report, but just spending a year immersed in ML and realising, “Oh yeah, there’s stuff that I think will improve in the next five years based on this technology.”

Ajeya Cotra: And so that was one thing. And then that kind of leads to a proliferation of other expectations. One is like, it makes me think there’ll probably be early failures of AI systems all over the place. You know, systems that are doing pretty important (but not that important) stuff will fail in ways that are kind of analogous to the ways we worry that even more critical, more general systems will fail. I think that we can have an AI safety issue that’s far short of a global catastrophic risk, but gets everyone’s attention on this problem.

Ajeya Cotra: And I think on net that will probably be a really good thing, and will probably relieve a lot of what Open Phil is feeling now, with not having people who share our vision of what the main problems are to fund. But it could also present challenges that I think maybe the EA community hasn’t thought through as much, because we’ve been so used to this world where we’re a very small set of people who care about this.

Ajeya Cotra: I think it might not be possible to play AI safety as an inside baseball game. We might need to think more about how we message to the public. We might need to think more about how maybe it’s actually quite inevitable that this will become a top-tier political issue that has a lot of eyes on it, like climate change.

Robert Wiblin: Yeah. I was going to say it’s going to look a bit more like climate or COVID or something like that.

Robert Wiblin: Okay. Let’s go back to learning about the methodology, and inspecting whether it makes sense. So the title is Forecasting Transformative AI With Biological Anchors. What does it look like to forecast something using biological anchors? What does that mean?

Ajeya Cotra: There’s a history of trying to forecast when we get artificial general intelligence by basically trying to estimate how powerful the human brain is — if you think of it as a computer — and then trying to extrapolate from hardware trends when we’ll get computers that powerful. So this has a long-ish tradition in the scheme of futurism. Ray Kurzweil and Hans Moravec in the ’80s and ’90s were thinking along these lines. And I think the big flaw in the earlier iterations of this thinking is basically that they weren’t thinking about the effort it would take — either software effort or machine learning training effort — to find the program that you should run on the brain-sized computers once you had the brain-sized computers.

Ajeya Cotra: So I think it led them to estimate timelines that were too aggressive because of this. The Moravec timeline that he estimated in the ’90s was that roughly 2020 would be when we would have human-level AI. Because he said, “Well, this is how powerful I think the human brain is as a computer, and this is when I think computers of that size will become widely available. They’ll be like $1,000, so somebody could just buy them”. And he was actually right about the second half of that. He was right that computers roughly as powerful as he estimated the human brain to be would be available around now. But — and people pointed this out at the time — he wasn’t accounting for, well, actually it took evolution many, many lifetimes of animals to find the human brain, the human mind, the arrangement that needs to occur on this hardware.

Ajeya Cotra: And so in 2015/2016, our technical advisors — particularly Paul Christiano and Dario Amodei — were thinking about how to extend this basic framework of like, think about how powerful a computer would need to be to match the human brain in raw power. But add on top of that, well, how is the search for that going to work? Like if we were to do something like evolution, or if we were to do something like ML training, how much money, how much computation would it take to do that? And then when would that be affordable? Which pushes timelines out relative to what Moravec was thinking about — which was just when the one computer to run the one human brain might be affordable.

Four key probability distributions [01:46:54]

Robert Wiblin: Okay. So at the heart of your model, you estimate four different probability distributions for four different components, and then put them together to estimate the likelihood of us being able to run transformative AI at acceptable cost by any given day. Can you maybe walk us through the four key probability distributions that you need to estimate here?

Ajeya Cotra: Yeah. So one is, let’s say we wanted to train a single ML model that would constitute transformative AI, which I just call a ‘transformative model’. Let’s say we wanted to do that given today’s algorithms. So whatever architectures we can come up with given what we know today — and whatever gradient descent or whatever ML training techniques we could come up with today — how much computation would it take to train a transformative model? And so that’s one piece of it. That produces this wide probability distribution which I call the ‘2020 training compute requirements distribution’. That’s a mouthful. And then basically, conceptually, once you have that, that is a snapshot of one year. But you expect that algorithmic progress is going to cause us to get better at doing any given thing over time. So you expect that if the median computation is X in 2020, then maybe in 2030, it’s gone down 10 times. So now it’s like (0.1)(X). So that’s the second thing, the algorithmic progress piece. It’s just trying to translate from 2020 to a future year.

Robert Wiblin: Okay. So first off you’ve got like how much computational power would we need today, given what we have now to train a transformative AI. And then you’ve also got this other thing of like, over time that is actually declining, because we’re getting better at training ML algorithms, they can do more with less compute. And then so you’ve got an estimate of like, how strong is that effect?

Ajeya Cotra: Yeah.

Robert Wiblin: Okay. And then what’s the third and fourth?

Ajeya Cotra: And then the third and fourth are basically together estimating how much computation a lab, or some government project, or some project trying to train transformative AI would have available to it. So one piece of that is how hardware prices are declining over time. So you can buy more computation with $1 in the future than you can now. And then another piece of that is how investment is increasing over time. So as people sort of see more potential in AI and as the world as a whole grows richer, then the frontier project will be willing to put in more money to attempt to train transformative AI.

Robert Wiblin: I see. So these are kind of the economic ones. So the first one is how quickly do the really good computers get cheap, and then you want to divide that, I guess, by how much are we going to be willing to spend to try to train a transformative model. So is it a trillion dollars, a hundred billion dollars, yeah, how many resources will we throw at this.

Ajeya Cotra: Yeah.
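
The way the four pieces fit together can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the model from the report: every number in the example call is a hypothetical placeholder, each trend is treated as a clean exponential with no levelling off, and the function name `first_affordable_year` is invented here. The requirement shrinks with algorithmic progress, while affordable compute is the budget multiplied by the computation each dollar buys.

```python
def first_affordable_year(
    req_2020_flop,         # compute to train a transformative model with 2020 algorithms
    algo_halving_years,    # how often algorithmic progress halves that requirement
    flop_per_dollar_2020,  # hardware price-performance in 2020
    hw_doubling_years,     # how often FLOP per dollar doubles
    spend_2020_dollars,    # largest training budget in 2020
    spend_doubling_years,  # how often that budget doubles
    last_year=2200,
):
    """Return the first year in which affordable compute meets the requirement."""
    for year in range(2020, last_year + 1):
        t = year - 2020
        required = req_2020_flop * 0.5 ** (t / algo_halving_years)
        flop_per_dollar = flop_per_dollar_2020 * 2 ** (t / hw_doubling_years)
        budget = spend_2020_dollars * 2 ** (t / spend_doubling_years)
        if budget * flop_per_dollar >= required:
            return year
    return None  # not reached within the horizon

# Hypothetical placeholder inputs, chosen only to show the mechanics.
print(first_affordable_year(1e35, 2.5, 1e17, 2.5, 1e7, 2.0))
```

Swapping in a larger requirement or slower trends pushes the crossing year later, which is the sense in which the first distribution dominates the result.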

Robert Wiblin: Okay. Let’s go back to the first one, which is how much compute would you need to train a transformative model now. How did you try to estimate that?

Ajeya Cotra: This is where the biological anchor part comes in. So there are a number of different hypotheses about how good our algorithms are today. So one of them is, actually, we will need to sort of replicate the process of natural selection that created humans. And we can think about how expensive it was for evolution to lead to humans, and then think about, are our algorithms better than that or worse than that? And make some adjustments there, and generate that probability distribution. And then on the other end, actually our algorithms are within striking distance of human learning. So like the learning that a baby does as it grows up into a functional adult. And so we can think about how much computation that constitutes, because we have an estimate of how powerful the brain is, and we know how long it takes to grow up to be a functional adult, and then we can think about making adjustments from there. And so those are two anchors, and then there are more complicated anchors that fill out the middle.

Robert Wiblin: Okay. So those are both kind of crazy anchors in a sense. Or they’re both at two quite extreme levels. Because I suppose the evolutionary one assumes that even though we’re designing this process, we can’t do any better than natural selection has done designing the human brain over billions of years, or I guess at least hundreds of millions of years in effect. It was like so many people. Simulating all of their brains for all that time. It seems like surely we can do better than that by quite a decent margin, or you’d hope. And on the other end, we’re just imagining well you could train a model just as quickly as you could teach a baby to do things. Well, that doesn’t seem quite right, because I guess a baby is born with some kind of innate knowledge, or it tends to develop naturally with a whole bunch of…

Ajeya Cotra: From evolution, it has some sort of like… Yeah.

Robert Wiblin: Yeah. It’s got all of that pre-learning that it’s inheriting. And I guess on top of that, we know that babies just learn stuff so much faster than our current ML models. They’re much, much better at generalising from single cases. So you want to be like, this is maybe an upper and a lower bound, and then we want to find something in the middle.

Ajeya Cotra: Yeah. So the lifetime anchor is kind of like, we basically know that exactly the lifetime doesn’t work. We can afford that right now, and like you were saying, we can observe that babies learn much faster than ML models. So the slightly more sophisticated version of the lifetime anchor is that maybe there’s just a constant factor penalty. So maybe if we observe that our best models today, they took 10,000 times as many data points to learn something as a baby does, then maybe that’s just it. Maybe it’s just however long a human takes to learn something times 10,000. Or times 1,000. Or whatever the thing we observe is, which is between 1,000 and 10,000. But I actually believe that that is not the case, and in fact the factor by which ML is worse than a baby grows as the model gets bigger.

Ajeya Cotra: So with our tiny models that are learning really simple things, the factor is smaller, and then for the bigger models learning more complex things, the factor gets bigger. And so the middle hypotheses are based on this empirical observation that training a bigger model — where bigger models can generally do more complicated, interesting things — training a bigger model takes more data. And there’s some scaling relationship between the size of the model and the data it takes. And that’s where most of my mass is. Most of my mass is on hypotheses that try to cash out, “Okay. How big should the model be?” Somewhere in the zone of a human brain, maybe slightly bigger, maybe slightly smaller, and we want to think about where that lands. And then given that, how much data does it take to train?
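
The constant-factor version of the lifetime anchor described above comes down to one multiplication, sketched here in Python. The brain figure of 10^15 floating point operations per second is the central estimate from Joe Carlsmith's report that comes up later in the interview; the 30 years to reach functional adulthood is an assumption made for this example, and `lifetime_anchor_flop` is an illustrative name. Ajeya's own view is that the penalty grows with model size, so the constant penalty here is only the simpler hypothesis she describes and then sets aside.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def lifetime_anchor_flop(brain_flop_per_s, years_to_adulthood, penalty=1.0):
    """Total training compute if ML needs `penalty` times a human lifetime of brain compute."""
    return brain_flop_per_s * years_to_adulthood * SECONDS_PER_YEAR * penalty

# Penalties of 1,000 and 10,000 are the range quoted in the interview.
for penalty in (1, 1_000, 10_000):
    flop = lifetime_anchor_flop(1e15, 30, penalty)
    print(f"penalty {penalty:>6,}: about {flop:.1e} FLOP")
```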

Robert Wiblin: So I’ve been hearing about these scaling laws a lot recently, but I don’t feel like I have a great grasp on what they are. Maybe you could explain for me and the audience what we need to know about scaling laws?

Ajeya Cotra: Yeah. So the fundamental thing is that if you’re trying to solve some task to some level of proficiency, like let’s say I want to get to at least grandmaster level in chess, then you need to pick a model that is big enough that it’s capable of learning to be at grandmaster level — bigger models are more capable of learning more complex strategies and stuff. So there’s some size of model that’s big enough to learn to be a grandmaster at chess. And then you need to train it for long enough that it realises that potential. And so a model, you can think of it as like a giant collection of numbers, and those numbers store knowledge it has about the task at hand.

Ajeya Cotra: At first, these numbers are all randomly initialised. These are called parameters. And then training sets the numbers to be reasonable values, and then they can be interpreted or they represent for the model its knowledge of how to play chess. So bigger models are capable of representing more sophisticated strategies, they’re capable of remembering more openings, etc. But they also need to see more experience to fill up all of those numbers. So roughly speaking, the amount of experience they need is going to be proportional to the number of numbers they need to fill up. Machine learning theory says that you should expect that to be roughly linear. So if you need to fill a knowledge bank of 10 million numbers, then you need to see like 10 million examples or maybe a hundred million examples. But recent empirical work suggests that it’s actually sub-linear. So it’s more like it scales to the 0.75 or something.

Robert Wiblin: Okay. So people are talking about this in part because recent discoveries have suggested that if you want to make a model that’s twice as sophisticated — in terms of its memory, or how many strategies it thinks about — you only need 1…

Ajeya Cotra: 1.7 or 1.5 times as much data.

Robert Wiblin: Okay. So it’s not quite as data-intensive as we previously thought?

Ajeya Cotra: Yeah.
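
The "1.7 or 1.5 times" figure follows from treating the data requirement as a power of model size. A minimal Python sketch, assuming data scales as size raised to a fixed exponent (the function name `data_multiplier` is illustrative): an exponent of 1 is the linear case from theory, and 0.75 is the rough empirical figure mentioned above.

```python
def data_multiplier(size_ratio: float, exponent: float) -> float:
    """Factor by which training data grows when model size grows by `size_ratio`."""
    return size_ratio ** exponent

for exponent in (1.0, 0.75):
    factor = data_multiplier(2, exponent)
    print(f"exponent {exponent}: a model twice as big needs {factor:.2f}x the data")
```

With an exponent of 0.75, doubling the model size calls for about 1.68 times as much data rather than twice as much.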

Robert Wiblin: Okay. Going back to the question of how much compute it would take to train a human-level model with today’s algorithms, is there any intuitive… I guess it’s an answer of a certain number of flops… 10 to the power of something? Is there any intuitive way of communicating what probability distribution you gave it?

Ajeya Cotra: Yeah. I mean, first of all, my colleague Joe Carlsmith recently put out a great detailed report on what evidence we can gather from looking at the brain about how powerful of a computer you’d need to replicate the tasks of the brain, if you somehow found the right software. And so his estimate is that something in the range of 10 to the 13 floating point operations per second to 10 to the 17 floating point operations per second was probably sufficient based on evidence from looking at the brain, with a central estimate of 10 to the 15 flops. So I’m leaning a lot on that. And he goes through several types of lines of evidence you could look at. But my question was kind of like, well, I’m thinking about 2020. This estimate of brain computation would have been the same whether we estimated it in 1960 or like 2050, right? But there’s only one period in time where we’re exactly dead on competing with biological counterparts. Before that, we’ll be worse than the biological counterparts, and after that, we’ll probably be better than the biological counterparts.

Robert Wiblin: And by we, you mean the machines we make?

Ajeya Cotra: Yeah. Our design versus evolution’s design. So we’re not thinking about how hard it was for us or evolution, but just like…once we design a product, how much better or worse is it than the product evolution designed that’s analogous? So there, I was just kind of trying to subjectively gauge this by looking at machine learning models we have today that are attempting to do something analogous to what creatures do. The one that we have the richest amount of data for is vision. And then there’s also motor control and how sophisticated are their movements. And then there’s a whole other line of evidence that is actually easier to think about, which is just, forget about machine learning and think about other technology and how it compares to other natural counterparts.

Ajeya Cotra: So think about cameras and how they compare to eyes, or think about leaves and how they compare to solar panels, stuff like that. So looking at all that stuff, it seems like roughly, in the current moment, the artifacts we design are somewhat worse than the artifacts that evolution designs. And so I estimated that the computation a transformative model would need to run would be about ten times larger than Joe’s estimate for the human brain.

Robert Wiblin: Wouldn’t that stuff just be all over the place, because you’ve got satellites that can see people from space and my eye can’t do that. And I guess if you were like, “How good is Rob’s arm compared to this truck that can pick up an insane amount of weight?” I mean, from one point of view, maybe the truck isn’t as nimble as I am, but from another point of view, it can pick up much more than I can, so it just seems it’s going to vary a lot depending on the measure.

Ajeya Cotra: Yeah, I mean, and on the other side, we have these cells that are like crazy nano-machines and we have nothing like that.

Robert Wiblin: Yeah. Right. I think, what is it… Photovoltaic cells are actually more efficient than leaves, but they don’t reproduce themselves from a seed. And so from one point of view, they’re much more impressive.

Ajeya Cotra: Yeah. Right. So the basic answer is that you… It is definitely really fuzzy, and you’ll have people who will make the argument you made, that human technology is way cooler than nature. And you’ll have people who make the argument I made, that we’re not anywhere in the ballpark. The way I try to think about it is like, you want to look in those places where humans and nature have similar cost functions. So it actually mattered to them to get the benefit, and it actually mattered to them to avoid the cost. So often in this case the cost is energy to run the thing. Because that is something humans care about because it costs money, and it’s also something nature cares about because you need the animal to be better at finding food in order to sustain a more energy-intensive thing.

Ajeya Cotra: So long-distance transportation and super heavy lifting is not something that conferred much of a fitness benefit on anything. And so it’s unfair to grade nature on that basis. I think it’s actually a little more fair to grade us for not getting nanotech, but it’s still potentially kind of like, we weren’t in the game, so to speak. So what I’m looking for is technology that humans are in the game of making, and that nature had a strong incentive to make because there was a strong fitness incentive to be good at it. Which I don’t think is the case for long-distance travel and trucking and stuff.

Robert Wiblin: Okay. So you’re saying us as engineers, we take a big hit because nature has managed to produce self-replicating tiny machines and we haven’t done that yet. It doesn’t even seem like we’re that close. And so at that we’re quite a bit worse.

Ajeya Cotra: Yeah. We are better at these other things, like weaponry, right? I think it would confer a fitness advantage to be as good at killing things as human machines are. But I mostly try to focus on the regime where… Because I feel like the machine learning models are in the regime where they’re trying to do the same things and not totally failing at it, I’m mostly looking at other technology that also seems to be in that regime.

Robert Wiblin: I see.

Ajeya Cotra: So not thinking that much about nanotech.

Robert Wiblin: Yeah, I see. Okay. So it’s cameras and eyes and photovoltaic cells and leaves and things like that.

Ajeya Cotra: Yeah.

Robert Wiblin: Okay, let’s push on to the second component, which is how quickly our algorithms are getting better at learning and running with less compute over time. How do you estimate that, very broadly speaking? What was the conclusion?

Ajeya Cotra: Yeah, so broadly speaking, there are basically two papers I’ve seen on this. Really pretty small literature. One is Katja Grace’s paper from 2013 called Algorithmic Progress in Six Domains, and then the other is a recent blog post from OpenAI called AI and Efficiency. Both of those are basically doing a pretty similar thing for different tasks, which is they’re asking, over successive years, when there’s some benchmark, how much less computation did it take in future years to beat the previous state of the art?

Ajeya Cotra: And so from that, they found these trendlines. Where I believe like the trendline from AI and Efficiency was halving every 16 months — or maybe 13 months — the amount of computation it takes to achieve the performance on ImageNet that was achieved in 2012. So in 2012 there was this big breakthrough year where image recognition models got a lot better, and from there we can ask how much more efficiently we can get that same level of performance.

Ajeya Cotra: And then Katja Grace’s paper looked at a lot more different tasks, but few of them were machine learning tasks. She looked at factoring numbers, and there were a bunch of other things. In that paper, it was kind of between 12 months and two years or something, two and a half years.

Robert Wiblin: …the halving time?

Ajeya Cotra: The halving time, yeah. And so I was just kind of like okay, let’s take that as a starting point. And then I shaded upward a little bit, because for all these tasks, you achieve a benchmark and then you work on making it better. You have really strong feedback loops. But transformative AI is something that is currently at an unachievable level of cost, slowly coming down until we can meet it. The halving time for that would be slightly longer, because you’re working on these proxy tasks, and what you can actually work on and improve are benchmarks that you’ve already reached. But you expect there’s some translation between that and the ability to train a transformative model.
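
To get a feel for what these halving times imply, the following Python sketch converts a halving time into the factor by which the compute needed for a fixed level of performance shrinks. The halving times of 13, 16 and 30 months come from the figures quoted above; the ten-year horizon and the function name `compute_reduction_factor` are assumptions made for illustration.

```python
def compute_reduction_factor(years: float, halving_time_months: float) -> float:
    """How many times less compute a fixed benchmark needs after `years` of progress."""
    return 2 ** (years * 12 / halving_time_months)

for months in (13, 16, 30):
    factor = compute_reduction_factor(10, months)
    print(f"halving every {months} months -> about {factor:,.0f}x less compute after 10 years")
```

Shading the halving time upward, as described for transformative models, lowers this factor for any given horizon.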

Robert Wiblin: Yeah. Okay, so I guess this one shouldn’t be massively uncertain, at least in the past, because it’s kind of measurable in theory. Maybe some of the uncertainty comes in because these trends keep continuing, until they don’t. So, projecting it forward is… At some point presumably we’ll level off, and we’ll have the best algorithm possible, and you can’t get any better than that, so it’d be stuck, but we have no idea where that point is going to be, or how fast we’ll get there.

Ajeya Cotra: And I do assume a leveling off, but it’s pretty arbitrary where I put it.

Robert Wiblin: Yeah. Okay, that makes sense. Let’s move onto the next one, which I guess is probably going to be pretty familiar to everyone, because it’s kind of projecting Moore’s Law, or I guess we used to have Moore’s Law, where it’s chips got … What was it? They halved in price every year or two or something? And recently things haven’t been going as fast as that, but basically you’re just trying to project that issue of the cost of an equivalently good computer chip over time.

Ajeya Cotra: Yeah, and there I think the biggest uncertainty — which I haven’t looked into as much as I’d like — is not so much the speed of the halving, but where it cuts off. I think there’s a lot of work one could do in terms of thinking about the physical limits of different types of computing, and what is actually the best physically realisable energy efficiency of computation. I instead did an outside view thing where I was like, “Well, over the last 100 years… Or over the last 70 years, we had 12 orders of magnitude of progress. Maybe over the next 80 years we’ll have half of that in log space”. I just assumed we would have six orders of magnitude of progress. It’s not very thoroughly done.

Robert Wiblin: Okay. Remind me, we had Moore’s Law-ish, roughly being followed for something like 40 years, and then the last 10 years things have been noticeably slower than that? Is that because we were kind of approaching the limit of what we can do, given the current paradigm, given the current materials that we’re using?

Ajeya Cotra: Yeah, my understanding is that we’re getting to the point where it’s hard to make the transistors in the chips smaller without leading to issues with overheating, and potentially even issues with… They’re only a few atoms big at that point. There might be quantum effects that I don’t understand very well.

Robert Wiblin: I see, okay. What did you project forward? Did you project forward this kind of slow progress that we’ve had now, or do you think at some point it will speed up because we’ll come up with a different method?

Ajeya Cotra: Yeah, so the most recent thing, like the jump from the previous state-of-the-art machine learning chip to the most recent state-of-the-art machine learning chip — the previous was the V100 and the current is the A100 — seemed to be bigger than the slightly less recent 10 year trend you were talking about. And then I also think there’s room to change substrates, like there’s a lot of activity in the startup world, researching optical computing — where you basically use light and a bunch of tiny mirrors instead of silicon chips to transfer information. So, it’s not going over a wire, it’s just like being bounced off a mirror so it can go faster potentially.

Robert Wiblin: Wow, that’s going to be huge for the tiny mirror industry.

Ajeya Cotra: Yeah.

Robert Wiblin: Presently not a very large industry, I think. Sorry, go on.

Ajeya Cotra: I just split the difference and I was like, “Oh well, it’ll be kind of slower. Moore’s Law was one to two years, the recent trend is three to four years”… I just said it would be two and a half years halving time.
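
Two pieces of arithmetic sit behind the hardware numbers in this exchange, and both can be checked with a short Python sketch. It assumes smooth exponential progress, and the function names are illustrative. The first function gives the average halving time implied by 12 orders of magnitude over 70 years; the second gives how long a 2.5-year halving time would take to use up the assumed remaining six orders of magnitude.

```python
import math

HALVINGS_PER_OOM = math.log2(10)  # about 3.32 halvings per factor of ten

def implied_halving_time(orders_of_magnitude: float, years: float) -> float:
    """Average halving time (years) if cost fell by this many orders of magnitude over `years`."""
    return years / (orders_of_magnitude * HALVINGS_PER_OOM)

def years_to_gain(orders_of_magnitude: float, halving_time_years: float) -> float:
    """Years needed to gain this many orders of magnitude at a fixed halving time."""
    return orders_of_magnitude * HALVINGS_PER_OOM * halving_time_years

print(f"12 OOM in 70 years -> halving every {implied_halving_time(12, 70):.2f} years")
print(f"6 OOM at a 2.5-year halving time -> {years_to_gain(6, 2.5):.0f} years")
```

On these assumptions the historical average works out to a halving roughly every 1.8 years, and a 2.5-year halving time would reach the six-order-of-magnitude cutoff in about 50 years.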

Robert Wiblin: Alright. And the last one, which probably I haven’t thought about that much, and most people wouldn’t have thought about that much, is how much is society willing to pay to train this amazing breakthrough AI model? Which I guess… Currently I don’t know, how much is Google or the government or whatever the best research projects are… How much are they paying for those models? And I guess presumably over time it’s been going up, as they seem to have more applications.

Ajeya Cotra: Yeah. So, in terms of publicly calculable information, the most expensive two models, which are in a similar range to each other, are GPT-3 and AlphaStar, which is the StarCraft model. AlphaFold might be more, I haven’t looked at AlphaFold, but both GPT-3 and AlphaStar are in the $1–10 million range for the computation to train. And then there’s more stuff that you need to do, like training runs you do to tinker with stuff before the final training run, and the cost of data and the cost of labor, and stuff I’m not counting here. But just purely from computation, if you sort of math out how much computation it probably took to train them based on how big they are, it’s in the $1–10 million range.

Robert Wiblin: It’s so little!

Ajeya Cotra: Yeah, it’s really small.

Robert Wiblin: Breakthroughs in biomedicine, like you were saying earlier, can cost billions. I guess when you consider personnel and materials and so on. And this is so… I guess maybe most of the cost isn’t in the compute, it’s in the people, or something. But even so, it’s surprisingly small. We could spend much more than that.

Ajeya Cotra: Yeah. Right now it’s not the case that compute is most of the cost of these things. But I think probably eventually it will be the case that compute is the dominant cost, or at least like half. You might as well go to half, because you’re not even increasing the cost of your project that much. So, my basic forecast is that… So, OpenAI has another blog post called AI and Compute, in which they show that the super recent trend of scaling up the computation of state-of-the-art results is doubling every six months or something, or every nine months. Don’t quote me on that. I think it’s six months.

Ajeya Cotra: So I basically assumed that that trend would hold until models were big enough that people would start to demand real economic value from them, before scaling up more. And then from there, I expected a slower trend to hold, that kind of converged to some fraction of gross domestic product. I was kind of like… Big mega projects in the past tend to be projects that last anywhere from three to five years, and then over the course of that project they spend a couple percent of GDP.

Ajeya Cotra: So, I was assuming that a project to train a transformative model would be similar in those economics in the long run, and maybe compute would be half or a third of the total cost of the project. I assumed 1% of GDP eventually. But there’s like, basically I’m imagining a super fast initial ramp-up that lasts just a couple more years, to get up to a billion, 10 billion, and then like much slower ramp-up…or somewhat slower ramp-up, that’s more like doubling every two years, from there, to get up to 1% of GDP, and then following GDP from there.
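
The three-phase spending forecast described here can be written as a small piecewise function. This Python sketch is an illustration under stated assumptions rather than the report's actual calculation: the $3 million starting budget, the $20 trillion economy and the 3% GDP growth are hypothetical placeholders, and `training_budget` is an invented name. The six-month and two-year doubling times, the roughly $10 billion handover point and the 1% of GDP ceiling are the figures from the interview.

```python
import math

def training_budget(
    year,
    start_year=2020,
    start_budget=3e6,        # placeholder within the $1-10 million range quoted above
    fast_doubling_years=0.5, # initial ramp-up: doubling about every six months
    fast_phase_cap=1e10,     # fast ramp-up ends around $10 billion
    slow_doubling_years=2.0, # then doubling every two years
    gdp_start=2e13,          # placeholder economy size in dollars
    gdp_growth=0.03,         # placeholder annual GDP growth
    gdp_share_cap=0.01,      # spending eventually tracks 1% of GDP
):
    """Largest training budget in `year` under a fast ramp, a slow ramp, then a GDP ceiling."""
    t = year - start_year
    fast_phase_years = fast_doubling_years * math.log2(fast_phase_cap / start_budget)
    if t <= fast_phase_years:
        budget = start_budget * 2 ** (t / fast_doubling_years)
    else:
        budget = fast_phase_cap * 2 ** ((t - fast_phase_years) / slow_doubling_years)
    ceiling = gdp_share_cap * gdp_start * (1 + gdp_growth) ** t
    return min(budget, ceiling)

for year in (2020, 2025, 2030, 2040, 2060, 2100):
    print(year, f"${training_budget(year):.2e}")
```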

Robert Wiblin: Okay. So, things will kind of follow what seems to be the current trend, is your prediction, roughly. Until, say, we get to something like… Now we’re talking about the Manhattan Project, or the Apollo program, or the Three Gorges Dam, or something that’s really actually material — at which point it’s got to demonstrate economic value, or there’s just no one willing to fund this thing. And at that point maybe the growth levels off, at least until the thing actually works enough to increase GDP, such that it can pay for itself.

Ajeya Cotra: Right, yeah.

Robert Wiblin: Alright. So we’ve got the four different pieces. Then I guess you’re multiplying and dividing them together to get some overall thing.

Ajeya Cotra: Doing some math.

Robert Wiblin: Doing some math. I guess you do a Monte Carlo thing? To sample from each of these distributions on each one, and then produce a final distribution?

Ajeya Cotra: Yeah, so I actually, I think that would be the right thing to do, but the only thing that I’m actually modeling out a full distribution of uncertainty for is the first one, which is the biggest piece, the 2020 training computation requirements. And then the other ones I have point estimates or like point forecasts, and then I just do an aggressive, and a conservative, and a best guess for those. But the sort of proper thing to do would be to have, to model out the uncertainty there, too.
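
A minimal Python sketch of the setup described in this answer: a full distribution for the 2020 compute requirement, combined with three point scenarios for everything else. All numbers are hypothetical placeholders, and the normal distribution over orders of magnitude is a stand-in assumption, not the shape used in the report. Each scenario collapses algorithmic progress, hardware prices and spending into a single rate at which the gap to the requirement closes.

```python
import random

def prob_affordable_by(year, oom_closed_per_year,
                       gap_mean_oom=11.0, gap_sd_oom=4.0, n=20000, seed=0):
    """Estimate P(requirement is affordable by `year`) by sampling the 2020 gap.

    The gap is the number of orders of magnitude between the 2020 requirement
    and what the largest 2020 project could afford (placeholder distribution).
    """
    rng = random.Random(seed)
    closed = oom_closed_per_year * (year - 2020)
    hits = sum(rng.gauss(gap_mean_oom, gap_sd_oom) <= closed for _ in range(n))
    return hits / n

# Placeholder point forecasts for the other three components combined.
scenarios = {"conservative": 0.25, "best guess": 0.40, "aggressive": 0.55}
for name, rate in scenarios.items():
    probs = [round(prob_affordable_by(y, rate), 2) for y in (2030, 2050, 2070)]
    print(f"{name:>12}: {probs}")
```

Modelling the uncertainty in the other three components as well would mean sampling the rate inside the loop instead of fixing it per scenario.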

Robert Wiblin: Okay. It seems like, given that you’re reasonably unsure about all of these, the conclusion would be massively uncertain, because you’re compounding the uncertainty at each stage. And yet, interestingly, the actual range of time estimates… It’s not from now until 10,000 years. The range isn’t that wide, and I guess that’s because some things are increasing exponentially, so eventually it catches up within a reasonable time period, even in the pessimistic case? Is that what’s going on?

Ajeya Cotra: Yeah. I mean, so the computation requirements range is… It’s very wide in one sense. It spans 20 orders of magnitude. But it’s also not wide in another sense, in the sense that we have spanned more than that range, or at least almost that range, over the history of computing so far. The real work is done by… Most of my probability mass is on something within the biological anchors, as opposed to something astronomically larger than that. That’s where the work is coming from.

Ajeya Cotra: It’s like, 20 orders of magnitude is a huge range, but between exponentially improving algorithms, and exponentially increasing spending, and exponentially decreasing hardware costs, you can shoot through that range over a matter of decades, as opposed to centuries.
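To see why a 20 order-of-magnitude range can be crossed in decades, here is a rough sketch. It assumes each of the three trends has a fixed doubling (or halving) time, and samples the size of the gap from a uniform distribution while using aggressive, best-guess and conservative point values for the trends. All the doubling times and the uniform distribution are hypothetical placeholders, and the constant spending doubling time ignores the eventual GDP cap discussed earlier.

```python
import math
import random
import statistics


def ooms_per_year(doubling_times_years):
    """Orders of magnitude gained per year from several exponential trends."""
    return sum(math.log10(2) / d for d in doubling_times_years)


def years_to_close(gap_ooms, doubling_times_years):
    """Years needed for the combined trends to cover a gap of gap_ooms."""
    return gap_ooms / ooms_per_year(doubling_times_years)


# Hypothetical doubling/halving times in years for
# (algorithmic efficiency, spending, hardware cost per unit of compute).
SCENARIOS = {
    "aggressive": (1.5, 1.5, 2.0),
    "best_guess": (2.5, 2.0, 2.5),
    "conservative": (4.0, 3.0, 4.0),
}


def median_years(scenario, n_samples=10_000, seed=0):
    """Median years to close a gap sampled uniformly from 0 to 20 OOMs."""
    rng = random.Random(seed)
    gaps = [rng.uniform(0.0, 20.0) for _ in range(n_samples)]
    return statistics.median(years_to_close(g, SCENARIOS[scenario]) for g in gaps)


for name, trends in SCENARIOS.items():
    print(name, round(years_to_close(20, trends)), "years to cross 20 OOMs")
```

Only the gap is sampled here; the trends stay as point estimates per scenario, which mirrors the structure described above rather than a full Monte Carlo over every input.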

Robert Wiblin: That makes sense. I guess I saw, I think at one point you added in a kludgy solution where you’re like, “And there’s a 10% chance that I’m completely wrong about this, and in fact we don’t get anywhere near now”.

Ajeya Cotra: Yeah, I do have that. Yeah.

Robert Wiblin: I guess it makes sense. Maybe we’re just totally misunderstanding how the brain works, and we’ve got this all wrong, and…

Ajeya Cotra: Yeah. Maybe it’s like nanotech, right? Where like all these other technologies it seems like you can talk quantitatively about how much worse human technology is than natural technology, but there’s some stuff like nanotech where we’re just like… It feels kind of silly to talk quantitatively about that, it just feels like we’re not there.

Most likely ways to be wrong [02:11:43]

Robert Wiblin: Yeah. So, imagining that you’ll look back in the future and think that this report got things pretty wrong, what’s the most likely way that you would have gotten it pretty wrong in either direction?

Ajeya Cotra: Yeah, so I mean I think there’s more room to get it wrong, in some sense, in the direction that AI is like nowhere near where I think it is. I assigned substantial probability pretty soon, so I could be off by a factor of two in that direction and that would be scary and a big deal. But sort of like, why do I think it’s at all in this range, and could I be wrong there… I think a classic response is just, the move where you ask the question, given 2020 algorithms, how much compute would it take… Even if you’re allowing for it to be way more than we can afford, that’s just not a fruitful intellectual move to make, and the answer is just astronomical, and there’s no reason to think it’s near the biological anchors. There’s no reason to think our algorithms are as efficient as evolution.

Ajeya Cotra: And I don’t think that it makes sense to assign more than 50% probability to that claim, the claim that it’s just nowhere in the range. Mainly because the range is large enough that I don’t really trust most people’s intuitions — even experts’ intuitions — about what would definitely not be possible if we had 15 orders of magnitude more computation. That’s not the kind of question that AI experts are trained to have expertise on. I have an intellectual style where I want to lean much more on an outside view, like the biological anchors view, or some of the other views that Open Phil researchers have been looking into, like just ignorance priors, once we’re talking about that kind of range.

Ajeya Cotra: One thing that I didn’t get into when I was explaining the center of the probability distribution, which is, like I said before, it’s sort of assuming there’s a model of a certain size, it’s somewhat bigger than the brain, that would be enough to be transformative. And you need some amount of data to train it. And then we have these scaling laws that say the number of samples or data points scales almost linearly but sort of sub-linearly with model size.

Ajeya Cotra: From there, there’s a big unfixed variable, which is, what counts as one data point, or one sample? Like, are we talking about… Is one sample just seeing a single word, like GPT-3 is trying to predict one word and it gets feedback about whether it predicted that word correctly? Or like one image… the ImageNet model tries to predict an image, then gets the right answer and updates on that? Or is it more like a game of StarCraft, where the StarCraft model plays this whole game, and then it finds out whether it won, and then that propagates backwards and updates it?

Ajeya Cotra: You could imagine games that you’re training the model on that take much longer than StarCraft to resolve whether you’ve won or lost. That general concept is what I’m calling the ‘effective horizon length’, where currently you have models, I mentioned earlier, that AlphaStar and GPT-3 both cost about the same amount to train. But AlphaStar is much, much smaller than GPT-3. AlphaStar is 3,000 times smaller than GPT-3. That was because, in part it was because each data point for AlphaStar is a game, rather than a word.

Ajeya Cotra: So there’s a big question mark about how dense can we get the feedback to train a transformative model? Will we be able to get away with giving it feedback once every minute? And that’s actually rich, useful feedback… Or are there some things that just need to play out over a longer period of time before we can tell whether a certain direction of change is good or not?
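One way to picture the role of horizon length is a toy formula in which training compute is proportional to model size, times the number of samples, times the length of each sample. The sketch below assumes the number of samples grows as model size to the power 0.8, which is a hypothetical stand-in for "almost linearly but sort of sub-linearly". It is not a claim about how AlphaStar or GPT-3 were actually trained.

```python
def relative_compute(params, horizon, data_exponent=0.8):
    """Training compute in arbitrary units.

    params: model size (relative units)
    horizon: steps the model runs per sample before it gets feedback
    data_exponent: hypothetical scaling of sample count with model size
    """
    samples = params ** data_exponent
    return params * samples * horizon


def horizon_for_equal_compute(size_ratio, data_exponent=0.8):
    """Horizon a model needs to match the compute of a model that is
    size_ratio times larger and has a horizon of 1, under the same toy law."""
    return size_ratio ** (1 + data_exponent)


# For a fixed model size, compute scales linearly with horizon length.
print(relative_compute(1000, 60) / relative_compute(1000, 1))
```

Under this toy law, moving from feedback on every word to feedback once per game multiplies the cost of training a model of a given size by the length of the game, which is why the horizon length matters so much for the estimate.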

Robert Wiblin: Unfortunately we’ve got to move on, but I guess this report is online for people who are interested to learn more. We’ll stick up a link to it, and maybe also a presentation that explains what we just went through in a bit more detail, and has nice graphs. I guess the bottom line for me is that having looked into all of this, you’re pretty unsure when transformative AI might come, but you think it could be soon, could be a medium amount of time away, and I guess you haven’t found any reason to think that there won’t be a transformative AI within the time period over which we could plausibly plan for thinking about how that might affect us and how it might go better or worse.

Ajeya Cotra: Yeah. I mean, I think I would frame it as like 12–15% by 2036, which is kind of the original question, a median of 2055, and then 70–80% chance this century. That’s how I would put the bottom line.

Biggest challenges with writing big reports [02:17:09]

Robert Wiblin: Alright. So working on big open-ended reports like this can be a bit of a mess, and I think difficult for people intellectually — and I guess also psychologically.

Ajeya Cotra: Yeah.

Robert Wiblin: What are the biggest challenges with this work? How do you think that you almost got tripped up, or that other people tend to get tripped up?

Ajeya Cotra: One thing that’s really tough is that academic fields that have been around for a while have an intuition or an aesthetic that they pass on to new members about, what’s a unit of publishable work? It’s sometimes called a ‘publon’. What kind of result is big enough? What kind of argument is compelling enough and complete enough that you can package it into a paper and publish it? And I think with the work that we’re trying to do — partly because it’s new, and partly because of the nature of the work itself — it’s much less clear what a publishable unit is, or when you’re done. And you almost always find yourself in a situation where there’s a lot more research you could do than you assumed naively, going in. And it’s not always a bad thing.

Ajeya Cotra: It’s not always that you’re being inefficient or going down rabbit holes if you choose to do that research and end up doing a much bigger project than you thought you were going to do. I think this was the case with all of the timelines work that we did at Open Phil: my report, and then other reports. It was always the case that we came in thinking one thing… I thought I would do a simpler evaluation of arguments made by our technical advisors, but then complications came up, and it became a much longer project. And I don’t regret most of that. So it’s not as simple as saying, just really force yourself to guess at the outset how much time you want to spend on it and just spend that time. But at the same time, there definitely are rabbit holes, and there definitely are things you can do that eat up a bunch of time without giving you much epistemic value. So setting standards for that seems like a big, difficult issue with this work.

Robert Wiblin: Okay. So yes. So this question of what’s the publishable unit and what rabbit holes should you go down? Are there any other ways things can go wrong that stand out, or mistakes that you potentially made at some point?

Ajeya Cotra: Yeah. Looking back, I think I did a lot of what I think of as defensive writing, where basically there were a bunch of things I knew about the subject that were definitely true, and I could explain them nicely, and they lean on math and stuff, but those things were only peripherally relevant to the central point I wanted to make. And then there were a bunch of other things that were hard and messy, and mostly intuitions I had, and I didn’t know how to formalise them, but they were doing most of the real work. One big example is that of the four things we talked about, the most important one by far is the 2020 computation requirements. How much computation would it take to train a transformative model if we had to do it today. But it was also the most nebulous and least defensible.

Ajeya Cotra: So I found myself wanting to spend more time on hardware forecasting, where I could say stuff that didn’t sound stupid. And so as I sat down to write the big report, after I had an internal draft… I had an internal draft all the way back in November 2019. And then I sat down to write the publishable draft and I was like, okay, I’ll clean up this internal draft. But I just found myself being pulled to writing certain things, knowing that fancy ML people would read this. I found myself being pulled to just demonstrating that I knew stuff. And so I would just be like… I’d write ten pages on machine learning theory that were perfectly reasonable intros to machine learning theory, but actually this horizon length question was the real crux, and it was messy and not found in any textbook. And so I had to do a lot to curb my instinct to defensive writing, and my instinct to put stuff in there just because I wanted to dilute the crazy speculative stuff with a lot of facts, and show people that I knew what I was talking about.

Robert Wiblin: Yeah. That’s understandable. How did the work affect you personally, from a happiness or job satisfaction or mental health point of view? Because I think sometimes people throw themselves against the problems like this and I think it causes them to feel very anxious, because they don’t know whether they’re doing a good job or a bad job, or they don’t feel they are making progress, or they feel depressed because they worry that they haven’t figured it out yet and they feel bad about that.

Ajeya Cotra: Yeah. I had a lot of those emotions. I think the most fun part of the project was the beginning parts, where my audience was mostly myself and Holden. And I was reading these arguments that our technical advisors made and basically just finding issues with them, and explaining what I learned. And that’s just a very fun way to be… You have something you can bite onto, and react to, and then you’re pulling stuff out of it and restating it and finding issues with it. It’s much more rewarding for me than looking at a blank page and no longer writing something in response to somebody else. You have to just lay it all out for somebody who has no idea what you’re talking about. And so I was starting writing this final draft — the draft that eventually became the thing posted on LessWrong — in January of 2020.

Ajeya Cotra: And I gave myself a deadline of March 9th to write it all. And in fact, I spent most of January and half of February really stressed out about how I would even frame the model. And a lot of the stuff we were talking about, about these four parts, and then the first part is if we had to do it today, how much computation would it take to train… All of that came out of this angsty phase, where before I was just like, how much computation does it take to train TAI, and when will we get that? But that had this important conceptual flaw that I ended up spending a lot of time on, which is like, no, that number is different in different years, because of algorithmic progress.

Ajeya Cotra: And so I was trying to force myself to just write down what I thought I knew, but I had a long period of being like this is bad. People will look at this, and if they’re exacting, rigorous people, they’ll be like this doesn’t make sense, there’s no such thing as the amount of computation to train a transformative model. And I was very hung up on that stuff. And I think sometimes it’s great to be hung up on that stuff, and in particular, I think my report is stronger because I was hung up on that particular thing. But sometimes you’re killing yourself over something where you should just say, “This is a vague, fuzzy notion, but you know what I mean”. And it’s just so hard to figure out when to do one versus the other.

Robert Wiblin: Yeah. I think knowing this problem — where often the most important things can’t be rigorously justified, and you just have to state your honest opinion, all things considered, given everything you know about the world and your general intuitions, that’s the best you can do. And trying to do something else is just a fake science thing where you’re going through the motions of defending yourself against critics.

Ajeya Cotra: Yeah. Like physics envy.

Robert Wiblin: Yeah. Right. I think…

Ajeya Cotra: I had a lot of physics envy.

Robert Wiblin: Yeah. I’m just more indignant about that now. I’m just like, look, I think this, you don’t necessarily have to agree with me, but I’m just going to give you my number, and I’m not going to feel bad about it at all. And I won’t feel bad if you don’t agree, because this unfortunately is the state-of-the-art process that we have for estimating, is just to say what we think. Sometimes you can do better, but sometimes you really are pretty stuck.

Ajeya Cotra: Yeah. And I think just learning the difference is really hard. Because I do think this report has made some progress toward justifying things that were previously just intuitions we stated. But then there were many things where I hoped to do that, but I had to give up. I think also, doing a report that is trying to get to a number on an important decision-relevant question is a ton of pressure, because you can be really good at laying out the arguments and finding all the considerations and stuff, but your brain might not be weighing them right. And how you weigh them, the alchemy going on in your head when you assign weights to lifetime versus evolution versus things in between, makes a huge difference to the final number.

Ajeya Cotra: And if you feel like your job is to get the right number, that can be really, really scary and stressful. So I’ve tried to reframe it as my job is to lay out the arguments and make a model that makes sense. How the inputs get turned into outputs makes sense and is clear to people. And so the next person who wants to come up with their views on timelines doesn’t have to do all the work I did, but they still need to put in their numbers. My job is not to get the ultimate right numbers. I think reframing it that way was really important for my mental health.

Robert Wiblin: Yeah. Because that’s something you actually have a decent shot at having control over, whether you succeed at that. Whereas being able to produce the right number is to a much greater degree out of your hands.

Last dollar project [02:25:28]

Robert Wiblin: Alright. I guess this is a big mood shift here, but another part of the project of trying to figure out how Open Phil should disburse its money is trying to think: should it be giving away more now, or should it be holding onto its money to give away at some future time, when perhaps it’ll have opportunities that could be better or worse? I guess you guys call this the ‘last dollar’ project. Can you tell us a bit about that?

Ajeya Cotra: Yeah. Basically the idea is that if there’s diminishing returns to the money we’re giving away, then the last dollar that we give away should be, in expectation, the least valuable dollar. And furthermore, if we give away one dollar today, the thing it’s trading off against, the opportunity cost, is whatever we would have spent the last dollar on. The theoretically clean answer is that if we know the value of our last dollar, then we should be giving to everything we find that’s more cost effective than that.

Ajeya Cotra: Like, every year we should look around and fund everything that seems better than our last dollar, and hold onto all the rest. That’s the conceptual answer to both giving now versus giving later, and what should we spend our money on. So, the goal… There’s two sides to this. One is trying to think about what might we spend our last dollar on. How good is that? Then the other is trying to think about what should we expect about how many opportunities there are that are better than the last dollar in each future year?
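The "fund everything better than the last dollar" rule can be written down as a very small sketch. The opportunities, their costs, and their value-per-dollar numbers below are made up for illustration, and the `Opportunity` type and `fund_above_bar` function are hypothetical names, not anything Open Phil uses.

```python
from dataclasses import dataclass


@dataclass
class Opportunity:
    name: str
    cost: float              # dollars needed this year
    value_per_dollar: float  # estimated good done per dollar


def fund_above_bar(opportunities, last_dollar_value, budget):
    """Fund, best first, everything that beats the last-dollar bar.

    Returns the funded opportunities and the money held for later years.
    """
    funded, remaining = [], budget
    ranked = sorted(opportunities, key=lambda o: o.value_per_dollar, reverse=True)
    for opp in ranked:
        if opp.value_per_dollar <= last_dollar_value:
            break  # everything after this point is below the bar
        if opp.cost <= remaining:
            funded.append(opp)
            remaining -= opp.cost
    return funded, remaining


# Hypothetical example: the bar is 1.0 units of value per dollar.
options = [
    Opportunity("A", cost=10.0, value_per_dollar=5.0),
    Opportunity("B", cost=20.0, value_per_dollar=1.5),
    Opportunity("C", cost=5.0, value_per_dollar=0.8),
]
funded, held = fund_above_bar(options, last_dollar_value=1.0, budget=100.0)
print([o.name for o in funded], held)
```

The hard part in practice is the input the sketch takes for granted: the value of the last dollar itself.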

Ajeya Cotra: We’re trying to quantify it. Instead of asking “Is giving now better or worse than giving later”, we’re trying to get a rough sense of the allocation across time that we should expect will be reasonable. So, on the last dollar question, I’ve done some work on that on the longtermist side, in terms of what do we spend the last dollar on. For the near-termist side, my colleague Peter is working on putting together a model of allocation over time for the near-termist side. They have a bit better sense of what their last dollar is than we do, because they have… GiveWell has been working for more than a decade on mapping out these global health interventions that have massive room for more funding, and really high cost effectiveness. The near-termist side is mostly trying to beat the benchmark set by GiveWell, and assume that the things like GiveWell top charities will be able to absorb marginal money. Peter has been working on this model that is basically a more complex variant of Phil Trammell’s model. Phil Trammell recently put out this paper called Patient Philanthropy or something I think.

Ajeya Cotra: Peter’s working on a more complex variant of this model that’s going to give a rough guide to how the near-termist side should spend down its money. Basically the way the model works is on the one side, your money is growing because you’re putting it in the market and it’s getting some percent return per year. And then on the other hand, opportunities to do good we assume are declining over time, because other funders are coming in and funding those things, or the world is generally getting better and problems are getting solved on their own. The baseline rate of disease is going down, stuff like that.

Ajeya Cotra: These are the two main forces and then there are a bunch of other forces you could also model. Like, maybe giving now helps you learn how to give better, and each year has diminishing returns because you can’t give an organisation $100 million in one year just because you could have given it $10 million over each of ten years. Stuff like that. So, this model is trying to net all of this out and then come up with an expected allocation across time. And basically because all of the individual parameters are constant growth or decay — so there’s a percent that your money is growing every year, and then there’s like a percent that you’re assuming opportunities are declining every year — this spits out basically that you should give a constant fraction of your money every year, you should expect to give out a constant fraction. And that constant fraction could be zero, because you should save indefinitely if the parameters shake out that way, or it could be very high, such that you’re essentially trying to spend down as fast as possible. Or it could be somewhere in between, where you’re not drawing down your principal, but you’re giving away some of the interest. Those are the three regimes.
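A toy version of this kind of model shows why constant growth and decay rates lead to giving a constant fraction each year. The sketch assumes money grows at rate `r`, the value of opportunities declines at rate `d`, and the good done by giving `g` dollars in one year is proportional to `g ** alpha` with `alpha` between 0 and 1 (the within-year diminishing returns). That functional form and all the numbers are assumptions for illustration. This is not Phil Trammell’s model or Peter’s variant of it.

```python
def constant_fraction(r, d, alpha):
    """Best fixed fraction of wealth to give each year in the toy model.

    r: annual investment return, d: annual decline in opportunity value,
    alpha: diminishing-returns exponent (0 < alpha < 1).
    """
    k = (1 - d) * (1 + r) ** alpha
    if k >= 1:
        return 0.0  # returns outpace the decline: keep saving
    return 1 - k ** (1 / (1 - alpha))


def total_value(fraction, r, d, alpha, wealth=1.0, years=2000):
    """Total good done by giving a fixed fraction of wealth every year."""
    total, weight = 0.0, 1.0
    for _ in range(years):
        gift = fraction * wealth
        total += weight * gift ** alpha
        wealth = (wealth - gift) * (1 + r)
        weight *= 1 - d
    return total


f = constant_fraction(r=0.05, d=0.05, alpha=0.5)
print(round(f, 4))
print(total_value(f, 0.05, 0.05, 0.5) > total_value(f + 0.01, 0.05, 0.05, 0.5))
```

In this toy version the fraction is zero when returns outpace the decline in opportunities, close to one when opportunities are vanishing quickly, and somewhere in between otherwise. Adding a deadline or learning effects, as described above, would change the answer.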

Robert Wiblin: Okay, so just to check that I’ve got that straight: Because GiveWell — or those who are focused on human charity today — have a better sense of what their future giving opportunities will be and how effective they’ll be, you’re using that as a baseline to compare to for the other programs, saying, “Will we have better opportunities within this program than the near-term human thing?” I guess, what was the time frame you were thinking about?

Ajeya Cotra: Our main funders, Cari and Dustin, want to spend down most of their fortune within their lifetime. So we’re usually thinking of a several-decade time span, like 50 years or 100 years, experimenting with different… Like what happens when you set the deadline in different places.

Robert Wiblin: I see. Then you add in a bunch of other things, like discount rate, and return on investment, and I suppose learning effects and things like that, that happen every year. And then build this into a model that says how much you should give away. How much of the decision comes down to just this really difficult, empirical question, where like the factory farming program and the biosecurity folks have to say, “Well, how good will the opportunities in this area be in 40 years time?” Which seems really hard to answer.

Ajeya Cotra: Yeah. I don’t know as much about the animal-inclusive near-termist worldview, but roughly speaking I would say the human-centric near-termist worldview gets more juice out of doing a model like this, that recommends some percent you should expect to give away each year. And they still have to do a bunch of empirical work to find the opportunities that meet that bar, but they’re able to look ahead and be like, “Okay, we need to prepare to give X% because our model says that should be roughly optimal based on what we believe.”

Ajeya Cotra: We still need to empirically find those opportunities, and we might end up finding more or less than that, but it’s kind of a guideline. On the longtermist side, it’s a lot trickier, and we don’t expect it to look like the optimum is some constant fraction given away every year. We kind of expect it to more track the shape of existential risk. So, for example, on AI risk, we expect there to be more opportunities roughly like five or 10 years before transformative AI than either before or after. Before, you less know the shape of how things are going to go, and there are fewer people who want to work on this that you could fund, and impacts could be washed out by things that happened in the future. And then after that point, it might be that there’s a ton of money in the space, and so your dollars are less leveraged. So we’re imagining there’s some window before transformative AI where we’ll be spending a lot more than we’d be spending before or after that window.

Robert Wiblin: Yeah. It seems like you build this big model, but I suppose Cari and Dustin, they want to spend all the money before they die, and I guess assuming they live a normal human lifespan, we’ve got maybe 50 years to play with or something like that. In a sense there’s not that much flexibility, and you also have to think about how quickly could you plausibly scale your ability to find really good grant opportunities? That’s also slightly a bottleneck at the moment. In fact, how much does this influence your decisions? Are you in practice bound by other constraints that are doing most of the work?

Ajeya Cotra: Yeah, so I think again, it’s a little bit different on the near-termist versus the longtermist side. On the longtermist side, we’re not building out much of an allocation-over-time model. We’re mostly just focused on the question of how good is the last dollar. We’re in a regime where we are finding substantially fewer giving opportunities than we would like to be, because these fields are small. So, we want to be funding basically everything that’s better than that last dollar.

Ajeya Cotra: And then on the near-termist side, I think like I said before, there’s more guidance. The model that Peter is working on, unlike Phil Trammell’s model, allows for you to set deadlines, say if you have a separate constraint where you really want to have given away the money by date X. That changes recommendations somewhat, like you would imagine. But there’s still, depending on how the other parameters are set, besides the deadline, there’s a range you could end up with in terms of how much you should be roughly aiming to give away per year.

Robert Wiblin: Yeah. I guess this is so important that it’s something Open Phil’s going to keep tinkering with probably as long as it exists. But is there any bottom line of what fraction of the total principal that you have now you would want to be giving away each year, all things considered?

Ajeya Cotra: I don’t know because I’m not working on the near-termist side of the project. I’d rather not speak for them, but hopefully you can grab someone from that team on your podcast pretty soon. I can talk about the last dollar thinking on the longtermist side, which is just like how good is this last dollar, not so much allocation over time, exactly.

Ajeya Cotra: There are basically two projects: one that I worked on a while back, and one that I’m just starting to work on now. In the one that I worked on a while back, we basically wanted to find an intervention where we could robustly spend a lot of money. On the near-termist side, the analogy here is GiveDirectly. GiveDirectly is hugely scalable, almost unboundedly scalable in the regime of the money Open Phil has access to. We expect there to be roughly linear returns, because as you’re giving cash transfers to extremely poor people, there’s such a large number of extremely poor people at roughly the same level of income that you’re not really hitting diminishing returns until you’re giving away substantially more than Open Phil could afford to give away anyway.

Ajeya Cotra: That’s the lower bound of the near-termist last dollar. Because it’s one thing that we could put all of the money into, and we would expect all of it to get roughly 100x return, because the individuals we’re giving to are roughly 100 times poorer than the average American. So, we were seeking something that we thought that would be like the GiveDirectly of longtermism. One big intervention that we can spend a lot of money on, that has roughly linear returns in that regime.

Ajeya Cotra: The goal here was just to demonstrate that if we really wanted to, the longtermist worldview could find things to spend money on — because that is a big question, in terms of the longtermist worldview — and that it would be a reasonable return on investment. And that we think we could do better than this, but this ‘GiveDirectly of longtermism’ is at least reasonable. That was the goal. We turned to biosecurity for this, because basically bioscience is expensive and biotech is expensive, and there’s a pretty big field of people we could potentially co-opt to do things that we think are valuable on longtermist grounds.

Ajeya Cotra: There’s not so much in AI, and other causes are less expensive to just do things in because they’re more thinking about stuff. And so what we ended up landing on was this notion of funding, like meta R&D, to help make responses to new pathogens like COVID faster. Currently it takes several months to a year — once we learn of a new pathogen — to develop either a vaccine or an antiviral and then distribute it. There are a bunch of things we could potentially do to reduce that.

Robert Wiblin: One other suggestion I’ve heard as a baseline longtermist intervention is reducing carbon emissions through carbon offsetting, or subsidising the scale of solar energy, or something like that. Did you consider that?

Ajeya Cotra: We looked into climate change from a longtermist perspective a while back. We didn’t do a super deep investigation, but like Toby Ord says in his recent book The Precipice, we felt that it was substantially less good from a longtermist perspective than the big two that we focus on, which are AI and biosecurity. We wanted a lower bound here but we also wanted to… We basically wanted to find the biggest thing that was still low uncertainty in terms of our ability to spend the money. Because the big questions on AI are like, do we even find anything to give the money to, given that the field is so small?

Robert Wiblin: Yeah, okay. Carry on.

Ajeya Cotra: So, we were looking at meta R&D to make responses to new pathogens faster. So currently it takes several months. We thought spending a lot of money on a lot of different fronts could bring it down to a month or a few weeks, in terms of the calendar time from ‘new virus comes on the scene’ to ‘we have a vaccine or an antiviral, and we’re starting to roll it out’. So there are a bunch of things we could potentially do. One thing is just like funding people to develop and stockpile broad-spectrum vaccines, which are vaccines that are trying to target a biological mechanism that’s common to a big family of viruses.

Ajeya Cotra: Potentially if we found a broad-spectrum flu vaccine, say, that vaccine could protect people against a much more dangerous engineered version of the flu, too. Because it’s targeting mechanisms that are fundamental to all the different flus. And you could imagine funding tools for rapid onsite detection, like as soon as one person gets the virus, you can just go onsite, you can sequence it really quickly, you can map out its protein structure, maybe with something like AlphaFold, and then you come up with guesses about what molecules bind well to the molecules on the pathogen, so you cut out some of the trial and error in drug development.

Ajeya Cotra: So it was stuff like this that we were imagining funding, and there’s a lot we could potentially fund, especially with the manufacture and stockpile of vaccines. This is why we were hoping it could be kind of like a ‘GiveDirectly of longtermism’ type thing. With that, the basic structure of the cost-effectiveness estimate was like, we expect that eventually — as civilisation becomes more technologically mature — we will have this ability to rapidly detect and prevent diseases. But we’re moving that forward in time by funding the field and beefing it up earlier. We can move forward in time by several months or a year the point at which we have the mature technology, as opposed to the current technology.

Ajeya Cotra: And then we would basically cut into some chunk of the x-risk in that window that we moved it forward. If in 2041 we would have this ability, and we caused it so that it happened in 2040 instead, then there’s some fraction of like one year’s worth of bio x-risk that’s reduced.

Robert Wiblin: Okay. That makes sense. So what kind of conclusions has this led to, if anything, about whether you should spend down the resources faster or slower? Making this comparison I guess to funding this meta science decades in the future.

Ajeya Cotra: Yeah, so this estimate is roughly $200 trillion per world saved, in expectation. So, it’s actually like billions of dollars for some small fraction of the world saved, and dividing that out gets you to $200 trillion per world saved. This is quite good in the scheme of things, because it’s like less than two years’ worth of gross world product. It’s like everyone in the world working together on this one problem for like 18 months, to save the world. That’s quite good in some sort of cosmic sense, right? Because it would be worth decades of gross world product to save the world, potentially.
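The arithmetic behind a figure like this can be sketched as cost divided by the fraction of existential risk averted. The inputs below (a $20 billion programme, 0.01% annual bio x-risk, technology arriving one year earlier, and a gross world product of $130 trillion) are hypothetical values chosen only so that the output matches the $200 trillion and roughly 18 months mentioned above. They are not the inputs of the actual estimate.

```python
def cost_per_world_saved(total_cost, annual_xrisk, years_advanced,
                         share_of_risk_removed=1.0):
    """Dollars spent per unit of existential risk averted (illustrative)."""
    risk_averted = annual_xrisk * years_advanced * share_of_risk_removed
    return total_cost / risk_averted


# Hypothetical inputs, picked to reproduce the headline figure.
cost = cost_per_world_saved(total_cost=20e9, annual_xrisk=1e-4, years_advanced=1.0)
gross_world_product = 130e12  # hypothetical round number, dollars per year
years_of_gwp = cost / gross_world_product

print(f"${cost:.3g} per world saved, about {years_of_gwp:.1f} years of GWP")
```

The structure, rather than the particular numbers, is the point: moving mature technology forward by some amount of time removes that much time’s worth of risk, and the cost is divided by that.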

Ajeya Cotra: But we were aiming for this to be conservative, because it’s likely we would spread across multiple longtermist focus areas, instead of just biosecurity, and AI risk is something that we think has a currently higher cost effectiveness. So, it didn’t necessarily cause us to change how we’re expecting to spend down in the immediate term, just because we’re still in the regime where we’re trying to find grantees that are on target with what we want to fund, and are focusing on existential risk as opposed to other problems. That’s a huge bottleneck to getting money out the door. It wasn’t like we were in a position where we were spending a lot and we realised, “Oh, actually, the last dollar’s good, so we should cut back or save”. We weren’t in that regime, and we knew we weren’t.

Ajeya Cotra: But the goal of this project was just to reduce uncertainty on whether we could. Like, say the longtermist bucket had all of the money, could it actually spend that? We felt much more confident that if we gave all the money to the near-termist side, they could spend it on stuff that broadly seemed quite good, and not like a Pascal’s mugging. We wanted to see what would happen if all the money had gone to the longtermist side. It’s like you were saying earlier, if a worldview is just getting zero marginal return, on its own perspective, from getting twice as much money, then it just seems intuitively a lot less appealing to give that worldview more money. That was the main goal of this project.

Robert Wiblin: I guess you guys are going to keep researching this, and I suppose eventually at some point it will be like a publication that will lay this out?

Ajeya Cotra: Yeah. I mean, it might not be this particular thing. The last dollar question — and to a lesser extent, the allocation over time question — is just one that’s always on our minds. So, the more recent work that I’ve been doing ties into the AI timelines work that I just recently completed. It’s trying to do a last dollar cost-effectiveness estimate, but it’s less trying to look for the ‘GiveDirectly of longtermism’ — like one big expensive intervention — and more trying to think about 10 interventions that could each take a tenth of the money, and trying to be more like a best guess for what we actually spend the bulk of it on. And focused on AI as opposed to biosecurity, in this case.

Robert Wiblin: Does Open Phil keep track of the most important disagreements that different staff members have with one another? I’m just imagining presumably people have views that are all over the shop on this issue, and potentially on other ones as well. I guess I could imagine you guys being the sort of people who would have a huge spreadsheet…track all of these things and then take the median, or the harmonic mean, I don’t know.

Ajeya Cotra: Yeah. The harmonic mean.

Robert Wiblin: I don’t even know what that is. Sorry, go on.

Ajeya Cotra: I wish we had more capacity to do this. I think GiveWell does this a lot. GiveWell, every year they have their charity recommendations, and there are these thorny questions of values and how to interpret ambiguous research. There are 10 researchers in a room arguing about them, and then they put out a spreadsheet that has columns for all 10 individuals and their disagreements. GiveWell usually reports the median.
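For readers who, like Rob, are unsure how a harmonic mean differs from a median, here is a minimal sketch of aggregating one column of individual estimates. The ten numbers are made up for illustration and are not GiveWell data; the functions are from Python's standard `statistics` module.

```python
from statistics import harmonic_mean, mean, median

# Hypothetical estimates from ten researchers for one uncertain input
estimates = [1, 2, 2, 3, 3, 4, 5, 8, 20, 50]

med = median(estimates)        # middle value, insensitive to extreme estimates
avg = mean(estimates)          # arithmetic mean, pulled up by the large values
hm = harmonic_mean(estimates)  # reciprocal of the mean reciprocal, pulled toward small values

print(f"median={med}, mean={avg}, harmonic mean={hm:.2f}")
```

The median ignores how extreme the outlying estimates are, which is one reason it is a natural summary when a handful of people disagree widely.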

Ajeya Cotra: We don’t really have that kind of system for most things, just because Open Phil is significantly more siloed, and we’re spread out over more quite diverse topics. There are only a couple of people who have their head in each of these topics at a time. With AI timelines, there are three or four people that have had their head at least somewhat in that. And only two of those people, say, are like really deeply in it.

Ajeya Cotra: And similarly with biosecurity, there are only two or three people who think about it, and only one person who’s really deeply in it. There’s a lot more deference going on across major areas than would be ideal if we had more staff and more ability to give the GiveWell treatment to each thing. Within a particular area, when it’s important and there’s a largish number of people with some amount of expertise, we try to get polls and get estimates from a lot of different people. One area where we’re able to do this more — because we have more people who think about it — is with respect to the EA community.

Ajeya Cotra: We are experimenting with having more voices in grant-making decisions within the EA community. But most areas don’t really have that, and we’re not sure that experiment actually leads to better, more efficient decisions. It’s still up in the air.

Robert Wiblin: Yep. I might wrap this up because I guess listeners who are interested in this timing-of-giving and ‘patient philanthropy’ stuff can go and listen to the interview with Phil Trammell from earlier in the year, where we go through a lot of these considerations very forensically; consider them very patiently. I guess there’ll probably be some blog posts on this topic from Open Phil in coming years because it seems like it’s going to be an important topic for you guys to figure out over the very long term.

What it’s like working at Open Phil [02:45:18]

Robert Wiblin: Before we finish, I would like to get in some discussion of what it’s like working at Open Phil, and I guess what the opportunities are at the moment. I think last time we talked about this on the show with somebody who works at Open Phil was two years ago. And since then I know the organisation has grown pretty substantially, so maybe that has shifted the culture and what it’s like to be there. So yeah. How have things changed over the last couple of years? I guess you’ve been there for four years now?

Ajeya Cotra: We started off, I would say, on a trajectory of being much more collaborative, and then COVID happened. The recent wave of hiring was a lot of generalist hires, and I think that now there’s more of a critical mass of generalists at Open Phil than there was before. Before, I think there were only a few; now there are more like 10-ish people. And it’s nice because there’s a lot more fluidity in what those people work on. And so there are a lot more opportunities for casual one-off collaboration than there are among the program staff or between the generalists and the program staff.

Ajeya Cotra: So a lot of the feeling of collaboration and teamyness and collegiality is partly driven by like, does each part of this super siloed organisation have its own critical mass. And I feel like the answer is no for most parts of the organisation, but recently the generalist group of people — both on the longtermist and near-termist side together — have more people, more opportunities for ideas to bounce, and collaborations that make sense, than there were before. And I’m hoping as we get bigger and as each part gets bigger, that’ll be more and more true.

Robert Wiblin: I guess, as organisations become bigger, things tend to become a bit more organised and standardised and bureaucratised, which has its good sides and also has its bad sides. Has that been the case with Open Phil as well? Or are there a sufficient number of small cells so that it actually still feels like a small organisation?

Ajeya Cotra: Yeah. So I think a lot of my day-to-day feels like a pretty small organisation still, but even in a pretty siloed organisation, there are some things that it’s important to hammer out as we get to the scale we’re at, which is 45-ish people and beyond. So we’re working actively on making Open Phil more professionalised, in the sense of like, especially clearer standards for performance and promotion, and fairer compensation across the different areas. So like, what does it mean to be a program associate in farm animal welfare versus effective altruism versus science or criminal justice reform. These focus areas have different needs, and different ways they operate within their fields, but we still want it to be fair that if you are a senior program associate, and you look around and you’re wondering, why is this other person a program officer instead of a senior program associate? Or why is this other person a program associate instead of senior program associate… You don’t want it to be the case that people can look to the left and look to the right and see people doing what they feel like is their same job, but are compensated differently for that. So thinking carefully about that is one of the things we’re aiming to do over the next year or two.

Robert Wiblin: What do you like and dislike most about your job?

Ajeya Cotra: Likes…obviously the mission, and I think my colleagues are just incredibly thoughtful and kind people that I feel super value-aligned with. And that’s awesome. And then dislikes, it comes back to the thing I was saying about how it’s a pretty siloed organisation. So each particular team is quite small, and then within each team, people are spread thin. So there’s one person thinking about timelines and there’s one person thinking about biosecurity, and it means the collaboration you can get from your colleagues — and even the feeling of team and the encouragement you can get from your colleagues — is more limited. Because they don’t have their head in what you’re up to. And it’s very hard for them to get their head in what you’re up to. And so people often find that people don’t read their reports that they worked really hard on as much as they would like, except for their manager or a small set of decision makers who are looking to read that thing.

Ajeya Cotra: And so I think that can be disheartening. And then in terms of my particular job, all this stuff I was saying… It’s very stressful putting together this report, in a lot of the ways that we were talking about earlier. And just feeling responsible for coming to a bottom-line number without a lot of feedback or a lot of diffusion of responsibility that comes from a bunch of people putting in the numbers. And like…

Robert Wiblin: That seems particularly hard.

Ajeya Cotra: That’s quite stressful. And it’s quite stressful to basically be doing work where you are just inevitably going to miss your deadlines a bunch. You’re inevitably going to think, I know what I’m talking about, I’m going to write it down, but actually you didn’t and you aren’t, and you’re going to have to push and you’re going to have to push many times over. That can be disheartening, but I think just being aware of the dynamic has been helpful for me.

Robert Wiblin: Yeah. From memory, when Open Phil was hiring a couple of years ago, I think a thousand people applied for a bunch of jobs and then 10 people got trials, and something like five people actually got hired? So those are harsh odds. Is there anything you can say to people who I guess either don’t think it’s possible they’ll get hired by Open Phil and maybe were a bit disappointed by that, or have applied and maybe didn’t manage to get a trial?

Ajeya Cotra: Yeah. I guess my first thought is that Open Phil is not people’s only opportunity to do good. Even doing generalist research of the kind that I think Open Phil does a lot of, especially for that kind of research, I think it’s a blessing and a curse, but you just need a desk and a computer to do it. I would love to see people giving it a shot more, and I think it’s a great way to get noticed. So when we write reports, all the reports we put out recently have long lists of open questions that I think people could work on. And I know of people doing work on them and that’s really exciting to me. So that’s one way to just get your foot in the door, both in terms of potentially being noticed at a place like Open Phil or a place like FHI or GPI, and also just get a sense of what does it feel like to do this? And do you like it? Or are the cons outweighing the pros for you?

Ajeya Cotra: In terms of generalist roles, that’s one thought I have. And then on a more procedural note, Open Phil is trying to be more forward-looking and long-term and patient with our recruiting pipeline. So we have a general application up, where even if you’re happy with where you’re at now and you want to stay at your current job for a couple of years, if you’re interested in eventually making a transition into this type of work, feel free to drop your name on the general application and then say what types of roles you might be interested in. And that’s a good way to just stay in touch and stay on our radar.

Robert Wiblin: Yeah. One thing I very often say to people who are disappointed when they’ve applied for a job and they haven’t gotten it is just that it’s very natural to take that as a personal insult, that you weren’t good enough, but very often the most important thing is the fit between the person, the role, the organisation, the people they’ll be working with, and what they know. And that stuff could just be extremely specific. There are brilliant, incredibly smart people out there who just aren’t a good fit for working at Open Phil. And that’s not a dump on them, it’s just that they should be doing something else where they’re just more likely to flourish, because Open Phil has this very particular culture, which I guess — as we’ve just heard — is challenging in some ways. It’s like, it’s not all a bed of roses. There’s also ways in which it’s challenging work intellectually and emotionally.

Ajeya Cotra: Yeah. I think the emotional element is really big there. I think it’s a certain disposition of…the cocktail of being arrogant enough and weird enough to think that you could answer these big questions, but also being finicky enough and particular enough about dotting Is and crossing Ts that you can make those weird areas just one notch more rigorous than they were before. But not 10 notches, because otherwise you’re going to be working on it for 20 years. It’s like some kind of epistemic culture that’s very contingent that we think some people fit into that helps thread that needle. But there are other places that want to be on a more bold, innovative, weird ‘arrogant’ side of the spectrum, and places that want to be on the more careful, rigorous, complete side of the spectrum too. And that just changes what you can work on and what frontier you’re operating at, basically.

Robert Wiblin: Yeah. So I think you were just about to say this earlier, but I guess for listeners who do think that this sounds like something that they’d be interested in, where they do have the right level of attention to detail and persistence — but not too much persistence — how can they get on…

Ajeya Cotra: Not too much persistence!

Robert Wiblin: Yeah, laziness can be a virtue. In fact, very often it is a virtue. How can they get on Open Phil’s radar or stay abreast of opportunities to potentially meet people and escalate their involvement?

Ajeya Cotra: Yeah. So please do drop your name on our general application, and we can put up a link to that on the podcast page. And research is definitely not all Open Phil does, there’s grantmaking stuff too, but in terms of the things I’m working on and know best, I do think it’s possible to try it out with these open questions that we list on our other reports. And also just reading stuff written by FHI or at GPI, and thinking about, is there a piece of this I can break off? Something that seems intrinsically interesting to me, where I could make a unit of progress and put it up on the EA Forum, put it up on LessWrong, put it up on the Alignment Forum? I think that’s a great way to just straight up add value and also get noticed by this ecosystem of organisations that are doing this work.

Robert Wiblin: Yeah. We’ve recently been trialing someone, and I think maybe the reason that they stood out was just the incredible compilation of work that they had on their personal website across a whole bunch of different… Like writing that they’d done, artistic stuff that they’d done, audio stuff that they’d done as well. They’d just shown a persistent interest and ability to produce interesting stuff. So that’s definitely one way to stand out from the crowd, because for some reason, most people don’t have that.

Ajeya Cotra: A great thing to do actually is just explaining stuff that other people have said and didn’t explain very well. That can be great for learning and for teaching and for demonstrating the thinking that would allow you to do more original research down the line. And this can be in any format, like writing up explainers. There was a great explainer on LessWrong about the scaling laws you were talking about actually, that was really helpful. And you could make YouTube videos explaining things, like the Robert Miles AI videos are great. So that stuff, it doesn’t have to be pushing forward the frontier. You can still both add value and really make yourself stand out with explainers.

Robert Wiblin: Yeah. One further question before you go, I’ve got the weekend coming up and not a whole lot planned. Are there any good movies or TV shows you’ve seen recently that you can recommend to me? The weather’s looking pretty grim, so I guess I’m going to be indoors.

Ajeya Cotra: So I have started… This is very redundant with everything everyone else has told you I’m sure, but I’ve started The Queen’s Gambit. I’m two episodes in, and I quite like it so far. It’s basically like a sports movie wrapped in a prestige TV wrapper about this girl that’s on the rise to superstardom in chess. And I also like… This is much less prestige, but I also like another Netflix show called Girlfriends’ Guide to Divorce, which I’m finding very entertaining. It’s about snobby L.A. housewives going through divorce.

Robert Wiblin: Nice. Yeah. Netflix has been pushing Queen’s Gambit so hard. Every time I open it, it’s like they’re insisting that I watch this thing. I feel like they’re going to cancel my subscription if I opt out of it.

Ajeya Cotra: Also every single person in my life too. My friends and my partner’s aunt’s boyfriend, they’re all in it. So I caved. It’s good.

Robert Wiblin: It’s good. Okay. Listeners, if you’re in the same situation as me this weekend… I watched Knives Out this week, which is a murder mystery with various different twists…

Ajeya Cotra: Did you know they’re making a sequel? It’s going to be a whole Benoit Blanc series. It’s going to be like Hercule Poirot. It’s going to be so big.

Robert Wiblin: Oh, wow. Yeah, no, it’s a really good character. Daniel Craig doing a Southern American accent rubbed me the wrong way for the first 10 minutes, but then I just rolled with it.

Ajeya Cotra: It’s got a great all-star cast too. Yeah. That movie really launched me on the search for the perfect murder mystery movies. I think it’s really slim pickings in terms of good murder mysteries that are just about the mystery, instead of a character study or something else.

Robert Wiblin: Yeah. The story is impressive. I can see why so many top actors signed on, because they would have read that script and been like wow, this is really cool.

Ajeya Cotra: Yeah, totally.

Robert Wiblin: Alright. Well with that out of the way, we’ve covered some fun stuff here and some pretty dense stuff, but I think I understand all of these topics a bunch better now. My guest today has been Ajeya Cotra. Thanks so much for coming on the 80,000 Hours Podcast, Ajeya.

Ajeya Cotra: Yeah. Thanks so much for having me.

Rob’s outro [02:57:57]

As I mentioned in the middle of the interview, if you’re someone who could see themselves going into a career path like the one that Ajeya has had, or potentially like some of the careers that other guests have had over the years, and you’d like a bit of help figuring out exactly where you should maybe aspire to end up and how you might get there, you should check out our 1-1 advising service.

There’s more information about that at 80000hours.org/advising. There are some free slots available at the moment, which is why we’re putting out this advertisement. And at that address, you can find out about the kinds of problems and people we can do the most to help, and the kinds of questions and people we sometimes struggle to help and so are less likely to be able to advise.

But if you’ve found this interview engaging and are interested to hear more things like this, then that’s a good sign that you might be a good fit for our 1-1 advising service. So go take a look at 80000hours.org/advising and don’t be shy about applying if you think it would be helpful to you.

The 80,000 Hours Podcast is produced by Keiran Harris.

Audio mastering by Ben Cordell.

Full transcripts are available on our site and made by Sofia Davis-Fogel.

Thanks for joining, talk to you again soon.

Learn more

Foundation grantmaker

Longtermism: a call to protect future generations

The case for reducing existential risks

Global priorities research

Related episodes

February 27, 2018

#21 – Holden Karnofsky on times philanthropy transformed the world & Open Phil's plan to do the same

October 11, 2017

#10 – Nick Beckstead on how to spend billions of dollars preventing human extinction

March 17, 2020

#73 – Phil Trammell on patient philanthropy and waiting to do good

October 17, 2018

#45 – Tyler Cowen's stubborn attachments to maximising economic growth, making civilization more stable & respecting human rights

August 5, 2019

#62 – Paul Christiano on messaging the future, increasing compute, & how CO2 impacts your brain

August 28, 2018

#41 – David Roodman on incarceration, geomagnetic storms, & becoming a world-class researcher

About the show

The 80,000 Hours Podcast features unusually in-depth conversations about the world's most pressing problems and how you can use your career to solve them. We invite guests pursuing a wide range of career paths — from academics and activists to entrepreneurs and policymakers — to analyse the case for and against working on different issues and which approaches are best for solving them.

The 80,000 Hours Podcast is produced and edited by Keiran Harris. Get in touch with feedback or guest suggestions by emailing [email protected].

What should I listen to first?

We've carefully selected 10 episodes we think it could make sense to listen to first, on a separate podcast feed:

Check out 'Effective Altruism: An Introduction'

If you're new, see the podcast homepage for ideas on where to start, or browse our full episode archive.
