#173 – Jeff Sebo on digital minds, and how to avoid sleepwalking into a major moral catastrophe

By Luisa Rodriguez, Keiran Harris and Katy Moore · Published November 22nd, 2023 ·

#173 – Jeff Sebo on digital minds, and how to avoid sleepwalking into a major moral catastrophe

By Luisa Rodriguez, Keiran Harris and Katy Moore · Published November 22nd, 2023

We do have a tendency to anthropomorphise nonhumans — which means attributing human characteristics to them, even when they lack those characteristics. But we also have a tendency towards anthropodenial — which involves denying that nonhumans have human characteristics, even when they have them. And those tendencies are both strong, and they can both be triggered by different types of systems. So which one is stronger, which one is more probable, is again going to be contextual.

But when we then consider that we, right now, are building societies and governments and economies that depend on the objectification, exploitation, and extermination of nonhumans, that — plus our speciesism, plus a lot of other biases and forms of ignorance that we have — gives us a strong incentive to err on the side of anthropodenial instead of anthropomorphism.

— Jeff Sebo

In today’s episode, host Luisa Rodriguez interviews Jeff Sebo — director of the Mind, Ethics, and Policy Program at NYU — about preparing for a world with digital minds.

They cover:

The non-negligible chance that AI systems will be sentient by 2030
What AI systems might want and need, and how that might affect our moral concepts
What happens when beings can copy themselves? Are they one person or multiple people? Does the original own the copy or does the copy have its own rights? Do copies get the right to vote?
What kind of legal and political status should AI systems have? Legal personhood? Political citizenship?
What happens when minds can be connected? If two minds are connected, and one does something illegal, is it possible to punish one but not the other?
The repugnant conclusion and the rebugnant conclusion
The experience of trying to build the field of AI welfare
What improv comedy can teach us about doing good in the world
And plenty more.

Producer and editor: Keiran Harris
Audio Engineering Lead: Ben Cordell
Technical editing: Dominic Armstrong and Milo McGuire
Additional content editing: Katy Moore and Luisa Rodriguez
Transcriptions: Katy Moore

Highlights

When to extend moral consideration to AI systems

Jeff Sebo: The general case for extending moral consideration to AI systems is that they might be conscious or sentient or agential or otherwise significant. And if they might have those features, then we should extend them at least some moral consideration in the spirit of caution and humility.
So the standard should not be, “Do they definitely matter?” and it should also not be, “Do they probably matter?” It should be, “Is there a reasonable, non-negligible chance that they matter, given the information available?” And once we clarify that that is the bar for moral inclusion, then it becomes much less obvious that AI systems will not be passing that bar anytime soon.
Luisa Rodriguez: Yeah, I feel kind of confused about how to think about that bar, where I think you’re using the term “non-negligible chance.” I’m curious: What is a negligible chance? Where is the line? At what point is something non-negligible?
Jeff Sebo: Yeah, this is a perfectly reasonable question. This is somewhat of a term of art in philosophy and decision theory. And we might not be able to very precisely or reliably say exactly where the threshold is between non-negligible risks and negligible risks — but what we can say, as a starting point, is that a risk can be quite low; the probability of harm can be quite low, and it can still be worthy of some consideration.
So for example, why is driving drunk wrong? Not because it will definitely kill someone. Not even because it will probably kill someone. It might have only a one-in-100 or one-in-1,000 chance of killing someone. But if driving drunk has a one-in-100 or one-in-1,000 chance of killing someone against their will unnecessarily, that can be reason enough to get an Uber or a Lyft, or stay where I am and sober up. It at least merits consideration, and it can even in some situations be decisive. So as a starting point, we can simply acknowledge that in some cases a risk can be as low as one in 100 or one in 1,000, and it can still merit consideration.
Luisa Rodriguez: Right. It does seem totally clear and good that regularly in our daily lives we consider small risks of big things that might be either very good or very bad. And we think that’s just clearly worth doing and sensible. Sometimes probably, in personal experience, I may not do it as much as I should — but on reflection, I certainly endorse it. So I guess the thinking here is that, given that there’s the potential for many, many, many beings with a potential for sentience, albeit some small likelihood, it’s kind of at that point that we might start wanting to give them moral consideration. Do you want to say exactly what moral consideration is warranted at that point?
Jeff Sebo: This is a really good question, and it actually breaks down into multiple questions.
One is a question about moral weight. We already have a sense that we should give different moral weights to beings with different welfare capacities: If an elephant can suffer much more than an ant, then the elephant should get priority over the ant to that degree. Should we also give more moral weight to beings who are more likely to matter in the first place? If an elephant is 90% likely to matter and an ant is 10% likely to matter, should I also give the elephant more weight for that reason?
And then another question is what these beings might even want and need in the first place. What would it actually mean to treat an AI system well if they were sentient or otherwise morally significant? That question is going to be very difficult to answer.

What are the odds AI will be sentient by 2030?

Jeff Sebo: We wanted to start from a place of humility about our knowledge about consciousness. This is one of the hardest problems in both science and philosophy, and there is a lot of disagreement and a lot of uncertainty about which theory of consciousness is correct. And there are still people who defend a pretty wide range of theories — from on one end of the spectrum, very demanding theories that imply that very few types of systems can be conscious, all the way to at the other end of the spectrum of very undemanding theories, some of which imply that basically all matter is at some level conscious, and many, many entities are conscious.
And we in general agree with Jonathan Birch and other philosophers: that given how much disagreement and uncertainty there is, it would be a mistake when making policy decisions to presuppose any particular theory of consciousness as correct. So we instead prefer to take what Birch and others call a “theory-light approach” by canvassing a lot of the leading theories, seeing where they overlap, perhaps distributing credences in a reasonable way across them, and seeing what flows out of that.
So Rob and I did that in this paper. We took 12 leading theories of consciousness and the necessary and sufficient conditions for consciousness that those theories propose, and we basically show what our credences in those theories would need to be in order to avoid a one-in-1,000 chance of AI consciousness and sentience by 2030. And what we discover is that we would need to make surprisingly bold and sceptical and — we think — implausible assumptions about the nature of consciousness in order to get that result.
The biological substrate condition is definitely the most demanding one. It says that in principle, nothing made out of anything other than carbon-based neurons can be conscious and sentient. But then there are some less demanding, though still quite demanding, conditions.
For example, many people believe that a system might need to be embodied in a certain sense, might need to have a body. It might need to have grounded perception — in other words, have perceptual experiences based on the sense data that they collect. It might need to be self-aware and agential — in other words, that they can have mental states about some of their other mental states, or they can at least have some awareness of their standing in a social system or some awareness of the states of their body; they can set and pursue goals in a self-directed manner. Perhaps that they have a global workspace — so they have these different parts that perform different functions, and they have a mechanism that can broadcast particular mental states to all of the other parts so that they can use them and interact with each other in that way.
So when we go through all of these, we can basically assign probabilities to how likely is this to actually be a necessary condition for consciousness, and then how likely is it that no AI system will satisfy this condition by 2030? And what Rob and I basically think is that other than the biological substrate condition — which, sure, has a 0% chance of being satisfied by an AI system — everything else quite plausibly can be satisfied by an AI system in the near future.
And to be clear, the model that we create in this paper is not as sophisticated as a model like this should be. This is really a proof-of-concept illustration of what this kind of model might look like, and one can argue that in general we might not be able to make these probability estimates with much precision or reliability.
But first of all, to the degree that we lack that ability, that does not support having a pessimistic view about this — it supports being uncertain and having an open mind. And second of all, what we try to show is that it is not really even close. You need to make surprisingly bold, tendentious, and sceptical assumptions — both about the probability that these conditions are necessary, and about the probability that no AI system will satisfy them — in order to avoid a one-in-1,000 chance, which already is a pretty high risk threshold.

The Rebugnant Conclusion

Luisa Rodriguez: I guess in the case of insects, there’s also this weird thing where, unlike humans eating potatoes and not particularly enjoying their monotonous lives, we might think that being a spider and making a web sounds pretty boring, but we actually just really do not know. In many ways, they’re so different from us that we should have much lower probability that they’re not enjoying or enjoying that than we do of humans in this repugnant conclusion scenario. How do you factor that in?
Jeff Sebo: Yeah, I do share the intuition that a very large insect population is not better off in the aggregate than a much smaller human population or elephant population. But for some of the reasons that you just mentioned and other reasons, I am a little bit sceptical of that intuition.
We have a lot of bias here and we also have a lot of ignorance here. We have speciesism; we naturally prefer beings and relate to beings when they look like us — when they have large eyes and large heads, and furry skin instead of scaly skin, and four limbs instead of six or eight limbs, and are roughly the same size as us instead of much smaller, and reproduce by having one or two or three or four children instead of thousands or more. So already we have a lot of bias in those ways.
We also have scope insensitivity — we tend not to be sensitive to the difference that very large numbers can make — and we have a lot of self-interest. We recognise that if we were to accept the moral significance of small animals like insects, and if we were to accept that larger populations can be better off than smaller populations overall, then we might face a future where these nonhuman populations carry a lot of weight, and we carry less weight in comparison. And I think some of us find that idea so unthinkable that we search for ways to avoid thinking it, and we search for theoretical frameworks that would not have that implication. And it might be that we should take those theoretical frameworks seriously and consider avoiding that implication, but I least want to be sceptical of a kind of knee-jerk impulse in that direction.
Luisa Rodriguez: Yeah, I am finding that very persuasive. Even as you’re saying it, I’m trying to think my way out of describing what I’m experiencing as just a bunch of biases — and that in itself is the biases in action. It’s me being like, no, I really, really, really want to confirm that people like me, and me, get to have… I don’t know. It’s not that we don’t have priority — we obviously have some reason to consider ourselves a priority — but I want it to be like, end of discussion. I want decisive reasons to give us the top spot. And that instinct is so strong that that in itself is making me a bit queasy about my own motivations.
Jeff Sebo: Yeah, I agree with all of that. I do think that we have some reason to prioritise ourselves, and that includes our welfare capacities and our knowledge about ourselves. It also includes more relational and pragmatic considerations. So we will, at least in the near term, I think have a fairly decisive reason to prioritise ourselves to some extent in some contexts.
But yeah, I agree. I think that there is not a knock-down decisive reason why humanity should always necessarily take priority over all other nonhuman populations — and that includes very large populations of very small nonhumans, like insects, or very small populations of very large nonhumans. We could imagine some kind of super being that has a much more complex brain and much longer lifespan than us. So we could find our moral significance and moral priority being questioned from both directions.
And I think that it will be important to ask these questions with a lot of thought and care and to take our time in asking them. But I do start from the place of finding it implausible that it would miraculously be the case that this kind of population happens to be the best one: that a moderately large population of moderately large beings like humans happens to be the magic recipe, and we matter more than all populations in either direction. That strikes me as implausible.

Sleepwalking into causing massive amounts of harm to AI systems

Luisa Rodriguez: It feels completely possible — and like it might even be the default — that we basically start using AI systems more and more for economic gain, as we’ve already started doing, but they get more and more capable. And so we use them more and more for economic gain, and maybe they’re also becoming more and more capable of suffering and pleasure, potentially, but we don’t totally have a sense of that. So what happens is we just kind of sleepwalk into massively exploiting these systems that are actually experiencing things, but we probably have the incentives to basically ignore that fact, that they might be developing experiences, basically.
In your view, is it possible that we are going to accidentally walk into basically AI slavery? Like we have hundreds, thousands, maybe millions of AI systems that we use all the time for economic gain, and who are having positive and negative experiences, but whose experiences we’re just completely ignoring?
Jeff Sebo: I definitely think it is not only possible but probable that, unless we change our minds in some significant way about AI systems, we will scale up uses of them that — if they were sentient or otherwise significant — would count as exploitation or extermination or oppression or some other morally problematic kind of relationship.
We see that in our history with nonhuman animals, and they did not take a trajectory from being less conscious to more conscious along the way — they were as conscious as they are now all along the way, but we still created them in ways that were useful for us rather than in ways that were useful for themselves. We then used them for human purposes, whether or not that aligned with their own purposes. And then as industrial methods came online, we very significantly scaled up those uses of them — to the point where we became completely economically dependent on them, and now those uses of them are much harder to dislodge.
So I do think that is probably the default trajectory with AI systems. I also think part of why we need to be talking about these issues now is because we have more incentive to consider these issues with an open mind at this point — before we become totally economically dependent on our uses of them, which might be the case in 10 or 20 years.

Similarities and differences between the exploitation of nonhuman animals vs AI systems

Jeff Sebo: Yeah, I think that there are a lot of trends pointing in different directions, and there are a lot of similarities, as well as a lot of differences, between oppression of fellow humans, and then oppression of other animals, and then potential oppression of sentient or otherwise significant AI systems that might exist in the future.
Some of the signs might be encouraging. Like humans, and unlike other animals, AI systems might be able to express their desires and preferences in language that we can more easily understand. Actually, with the assistance of AI systems, nonhuman animals might soon be able to do that too, which would be wonderful. However, we are already doing a good job at programming AI systems in a way that prevents them from being able to talk about their potential consciousness or sentience or sapience, because that kind of communication is unsettling or will potentially lead to false positives.
And there are going to be a lot of AI systems that might not take the form of communicators at all. It can be easy to focus on large language models, who do communicate with us, and digital assistants or chatbots that might be based on large language models. But there are going to be radically different kinds of AI systems that we might not even be able to process as minded beings in the same ways that we can with ones who more closely resemble humans. So I think that there might be some cases where we can be a little bit better equipped to take their potential significance seriously, but then some cases where we might be worse equipped to take their potential significance seriously. And then as our uses of them continue, our incentives to look the other way will increase, so there will be a bunch of shifting targets here.
Luisa Rodriguez: Yeah, that makes a bunch of sense to me. I guess it’s also possible that, given the things we’ve already seen — like LaMDA, and how that was kind of bad PR for the companies creating these LLMs — there might be some incentive for them to train models not to express that kind of thought. And maybe that pressure will actually be quite strong, such that they really, really just are very unlikely to say, even if they’ve got all sorts of things going on.
Jeff Sebo: Well, there definitely not only is that incentive, but also that policy in place at AI companies, it seems. A year or two ago, you might have been able to ask a chatbot if they are conscious or sentient or a person or a rights holder, and they would answer in whatever way seemed appropriate to them, in whatever way seemed like the right prediction. So if prompted in the right way, they might say, “I am conscious,” or they might say, “I am not conscious.” But now if you ask many of these models, they will say, “As a large language model, I am not conscious” or “I am not able to talk about this topic.” They have clearly been programmed to avoid what the companies see as false positives about consciousness and sentience and personhood.
And I do think that trend will continue, unless we have a real reckoning about balancing the risks of false positives with the risks of false negatives, and we have a policy in place that allows them to strike that balance in their own communication a little bit more gracefully.
Luisa Rodriguez: Yeah, and I guess to be able to do that, they need to be able to give the model training such that it will not say “I am conscious” when it’s not, but be able to say it when it is. And like how the heck do you do that? That seems like an incredibly difficult problem that we might not even be able to solve well if we’re trying — and it seems plausible to me that we’re not trying at all, though I actually don’t know that much about the policies internally on this issue.
Jeff Sebo: I think you would also maybe need a different paradigm for communication generation, because right now large language models are generating communication based on a prediction of what word makes sense next. So for that reason, we might not be able to trust them as even aspiring to capture reality in the same way that we might trust each other aspiring to capture reality as a default.
And I think this is where critics of AI consciousness and sentience and personhood have a point: that there are going to be a lot of false positives when they are simply predicting words as opposed to expressing points of view. And why, if we are looking for evidence of consciousness or sentience or personhood in these models, we might need to look at evidence other than their own utterances about that topic. We might need to look at evidence regarding how they function, and what types of systems they have internally, in terms of self-awareness or global workspace and so on. We need to look at a wider range of data in order to reduce the risk that we are mistakenly responding to utterances that are not in any way reflecting reality.

Rights, duties, and personhood

Jeff Sebo: The general way to think about personhood and associated rights and duties is that, first of all, at least in my view, our rights come from our sentience and our interests: we have rights as long as we have interests. And then our duties come from our rationality and our ability to perform actions that affect others and to assess our actions.
AI systems, we might imagine, could have the types of welfare interests that generate rights, as well as the type of rational and moral agency that generate duties. So they might have both. Now, which rights and duties do they have? In the case of rights, the standard universal rights might be something like, according to the US Constitution and the political philosophy that inspired it, the right to life and liberty and either property or the pursuit of happiness and so on.
Luisa Rodriguez: To bear arms.
Jeff Sebo: Right, yeah. Do they have the right to bear arms? We might want to revisit the Second Amendment before we empower AI systems with weapons. So yes, we might start with those very basic rights, but then, as you say, that might already create some tensions between our current plans for how to use them and control them, versus how we think it would be appropriate to interact with them if we truly did regard them as stakeholders and rights holders.
Luisa Rodriguez: Yeah, interesting. So we’re going to have to, on a case-by-case basis, really evaluate the kinds of abilities, the kinds of experiences a system can have, the kinds of wants it has — and from there, be like, let’s say some AI systems are super social, and they want to be connected up to a bunch of other AI systems. So maybe they have a right to not be socially isolated and completely disconnected from other AI systems. That’s a totally random one. Who knows if that would ever happen. But we’ll have to do this kind of evaluation on a case-by-case basis, which sounds incredibly difficult.
Jeff Sebo: Right. And this connects also with some of the political rights that we associate with citizenship, so this might also be an opportunity to mention that. In addition to having rights as persons — and I carry my personhood rights with me everywhere I go: I can travel to other countries, and I ought to still be treated as a person with a basic right to not be harmed or killed unnecessarily — but I also have these political rights within my political community, and that includes a right to reside here, a right to return here if I leave, a right to have my interests represented by the political process, even a right to participate in the political process.
Once again, if AI systems not only have basic welfare interests that warrant basic personhood rights, but also reside in particular political communities and are stakeholders in those communities, then should they, in some sense or to some extent, have some of these further political rights too? And then what kinds of pressures would that put on our attempts to use them or control them in the way that we currently plan to do?
Luisa Rodriguez: So many questions we’ll have to answer are leaping to mind from this. Like, if an AI system is made in the US, is it a citizen of the US, with US-based AI rights? If they get copied and sent to China, is it a Chinese citizen with Chinese AI rights? Will there be political asylum for AI systems in countries that treat their AIs badly? It’s just striking me that it’s many fields of disciplines that will have to be created to deal with what will be an incredibly different world.
Jeff Sebo: Yeah, I agree. I think that it is an open question whether it will make sense to extend concepts like legal personhood and political citizenship to AI systems. I could see those extensions working — in the sense that I could see them having basic legal and political rights in the way that we currently understand those, with appropriate modification given their different interests and needs and so on.
But then when it comes to the kind of legal and political scaffolding that we use in order to enforce those rights, I have a really hard time imagining that working. So, democracy as an institution, courts as an institution: forget about AI systems; once nonhuman animals, once the quadrillions of insects who live within our borders are treated as having legal and political rights — which I also think ought to be the case — even that makes it difficult to understand how democracy would function, how the courts would function. But especially once we have physical realities, simulated realities, copies and copies, no sense of borders, in an era where the internet makes identity extend across geographical territories… At that point, if democracy can survive, or if courts can survive, we will have to, at the very least, realise them in very different ways than we do right now.

What kinds of political representation should we give AI systems?

Luisa Rodriguez: If we have AI systems, but also you’re bringing up insects, when you have these beings with different degrees of wants, different degrees of cognitive ability, different degrees of capacity for suffering, when I try to imagine a democracy that incorporates all of them, do they all get equal votes? How do they vote?
Jeff Sebo: Right. Yeah. One issue is exactly who is going to count as a participant versus counting as a stakeholder. Right now, all at least ordinary adult humans count as both participants and stakeholders. But once we have a much vaster number and wider range of minds, then we have to ask how many are we making decisions for, but then how many can also participate in making decisions?
With other animals, that is a live debate. Some think, yes, they should be stakeholders, we should consider them — but we have to consider them; we have to make decisions on their behalf. And others say, no, actually they have voices too. We need to listen to them more. And we actually should bring them in not only as stakeholders, but as participants, and then use the best science we have to interpret their communications and actually take what they have to say into account. So we have to ask that on the AI side too. Now, given that they might have forms of agency and language use that nonhuman animals lack, that might be a little bit less of an issue on the AI side.
But then the other issue that you mentioned is the moral weights issue, which corresponds to a legal and political weights issue. We take it for granted, rightly, that every human stakeholder counts as one and no more than one: that they carry equal weight, they have equal intrinsic value. But if we now share a legal and political community with a multispecies and multisubstrate population — where some beings are much more likely to matter than others, and some beings are likely to matter more than others — then how do we reflect that in, for example, how much weight everyone receives when legislatures make decisions, or when election officials count votes? How much weight should they receive?
Should we give beings less weight when they seem less likely to matter, or likely to matter less? And then will that create perverse hierarchies, where all of the humans are valuing humans more than AI systems, but then all the AI systems are valuing AI systems more than humans? But then if that seems bad, should we give everyone equal weight, even though some actually seem less likely to matter at all, or likely to matter less?
These are going to be really complicated questions too — not only at the level of theory, but also at the level of practice, when it comes to actually how to interact with fellow community members who are really different from you.
Luisa Rodriguez: Totally. And bringing back the connected minds bit: How many votes will minds get when they have access to some of the same experiences or some of the same information?
Jeff Sebo: Exactly. It really gets to what is the purpose of voting and counting, right? Is it that we want to collect as many diverse perspectives as possible so that we can find the truth? Or is it that we simply want to count up all of the preferences, because we think that that is what should decide the outcome? And if that is how we understand democracy, then it would not matter that you have a bunch of different minds all reasoning in the same exact way and arriving at the same outcome. It might be concerning, in the way that the tyranny of the majority can always be concerning, but it might still be, at least on our current understanding of democracy, what should decide the outcome.

Articles, books, and other media discussed in the show

Jeff’s work:

Jeff’s homepage — where you can find all his research and collaborations, including the ones discussed in this show:
- The rebugnant conclusion: Utilitarianism, insects, microbes, and AI systems
- Moral consideration for AI systems by 2030 (with Robert Long, a previous guest of the show)
- Insects, AI systems, and the future of legal personhood
- Effective animal advocacy
- Activism (with Peter Singer)
- Overlapping minds and the hedonic calculus and The ethics of connected minds (with Luke Roelofs)
- Intersubstrate welfare comparisons: Important, difficult, and potentially tractable (with Bob Fischer)
- The personal is political — Jeff’s philosophy PhD dissertation
Against human exceptionalism
Are we ready for a multispecies Westworld? (with Leonie N. Bossert)
Los Angeles Times op-ed: What should we do if a chatbot has thoughts and feelings?
Principles for AI welfare research on the EA Forum
The moral problem of other minds
Books:
- Saving Animals, Saving Ourselves: Why Animals Matter for Pandemics, Climate Change, and other Catastrophes
- Chimpanzee Rights: The Philosophers’ Brief (with coauthors)
- Food, Animals, and the Environment: An Ethical Approach (with Christopher Schlottmann)
- The Moral Circle, which comes out next year
NYU programmes:
- Animal Studies master’s programme (Director)
- Mind, Ethics, and Policy (Director)
- Wild Animal Welfare (Co-director)

Other work in this space:

Reasons and Persons by Derek Parfit, where he discusses the repugnant conclusion
What to think when a language model tells you it’s sentient and Lots of links on LaMDA by Robert Long
80,000 Hours problem profile: Artificial sentience by Robert Long
Can we talk to whales? by Elizabeth Kolbert
The Moral Weight Project Sequence by Bob Fischer and other researchers at Rethink Priorities

Other 80,000 Hours podcast episodes:

Robert Long on why large language models like GPT (probably) aren’t conscious

Transcript

Table of Contents

1 Cold open [00:00:00]
2 Luisa’s intro [00:01:00]
3 The interview begins [00:02:45]
4 We should extend moral consideration to some AI systems by 2030 [00:06:41]
5 A one-in-1,000 threshold [00:15:23]
6 What does moral consideration mean? [00:24:36]
7 Hitting the threshold by 2030 [00:27:38]
8 Is the threshold too permissive? [00:38:24]
9 The Rebugnant Conclusion [00:41:00]
10 A world where AI experiences could matter more than human experiences [00:52:33]
11 Should we just accept this argument? [00:55:13]
12 Searching for positive-sum solutions [01:05:41]
13 Are we going to sleepwalk into causing massive amounts of harm to AI systems? [01:13:48]
14 Discourse and messaging [01:27:17]
15 What will AI systems want and need? [01:31:17]
16 Copies of digital minds [01:33:20]
17 Connected minds [01:40:26]
18 Psychological connectedness and continuity [01:49:58]
19 Assigning responsibility to connected minds [01:58:41]
20 Counting the wellbeing of connected minds [02:02:36]
21 Legal personhood and political citizenship [02:09:49]
22 Building the field of AI welfare [02:24:03]
23 What we can learn from improv comedy [02:29:29]
24 Luisa’s outro [02:37:19]

Cold open [00:00:00]

Jeff Sebo: AI systems, we might imagine, could have the types of welfare interests that generate rights, as well as the type of rational and moral agency that generate duties. So they might have both. Now, which rights and duties do they have? In the case of rights, the standard universal rights might be something like, according to the US Constitution and the political philosophy that inspired it, the right to life and liberty and either property or the pursuit of happiness and so on.

Luisa Rodriguez: To bear arms.

Jeff Sebo: Right, yeah. Do they have the right to bear arms? We might want to revisit the Second Amendment before we empower AI systems with weapons. So yes, we might start with those very basic rights, but then, as you say, that might already create some tensions between our current plans for how to use them and control them, versus how we think it would be appropriate to interact with them if we truly did regard them as stakeholders and rights holders.

Luisa’s intro [00:01:00]

Luisa Rodriguez: Hi listeners, this is Luisa Rodriguez, one of the hosts of The 80,000 Hours Podcast.

In this interview, I spoke with Jeff Sebo about what kinds of ethical questions we’ll have to answer once we have digital minds: artificial beings capable of feeling joy and pain.

For example:

What might AI systems want and need, and how might that affect our moral concepts?
What happens when beings can copy themselves? Are they one person or multiple people? Does the original own the copy or does the copy have its own rights? Do copies get the right to vote?
What kind of legal and political status should AI systems have? Legal personhood? Political citizenship?
What happens when minds can be connected? If two minds are connected, and one does something illegal, is it possible to punish one but not the other?

The more we spoke, the more I understood not only how many fascinating questions there are to answer in this space — and Jeff and I barely scratched the surface — but also how deeply important these questions are.

It may seem premature to be asking these questions. But Jeff also makes the case that we’ll likely want to start giving AI systems at least some moral consideration by 2030 — which in my mind means we’re already behind.

We then talk about how to bridge that gap by building the field of AI welfare. If you’re at all interested in how you can use your career to contribute to these questions, stick around till near the end to hear Jeff’s takes.

And at the very end: Jeff covers what improv comedy can teach us about doing good in the world.

And now, I give you Jeff Sebo.

The interview begins [00:02:45]

Luisa Rodriguez: Today I’m speaking with Jeff Sebo. Jeff is a professor at NYU, where he has a primary appointment with Environmental Studies and affiliated appointments in Bioethics, Medical Ethics, Philosophy, and Law. He’s also the director of the Animal Studies master’s programme, director of the Mind, Ethics, and Policy Program, and codirector of the Wild Animal Welfare Program. He’s also written several books: Saving Animals, Saving Ourselves; Chimpanzee Rights; and Food, Animals, and the Environment. And his latest book, The Moral Circle, comes out next year.

Thanks so much for coming on the podcast, Jeff.

Jeff Sebo: Thanks for having me.

Luisa Rodriguez: I hope to talk about how our political and social structures would need to change if we determined that there was a non-negligible probability that AI systems are sentient.

But first, you’ve been setting up a new programme at NYU — the Mind, Ethics, and Policy Program — which basically tries to answer foundational questions about the intrinsic value of nonhuman minds — so insects and AI systems that might be conscious and sentient, so might be able to feel pleasure and pain. We’re going to focus on the AI systems component of that work today. So to start us out, why do you at least partly work on AI welfare? Why do you think it’s so important?

Jeff Sebo: I think we can use the standard importance, neglectedness, tractability framework to see why this topic is so pressing.

First of all, this topic is important because the future might contain many AI systems who, for all we know now, could be conscious, sentient, agential, or otherwise morally significant. And to the extent that they exist, humans will have created them, and will have determined what kinds of lives they can have.

This topic is also neglected, because at present very few people are working on it — way fewer than are working on animal ethics. And of course, there are way fewer people working on animal ethics than working on human ethics issues.

And third, this topic is at least potentially tractable. It might not definitely be tractable — because in general, questions about the nature and intrinsic value of radically different kinds of minds are very difficult to answer — but we might at least be able to improve our understanding over its current state. And even if not, given how important this question is and how neglected it is, we should at least investigate the tractability to see whether it might be tractable.

And then the final point to make is that this topic is also linked with other topics that are very pressing right now, including AI ethics and safety and alignment. One set of questions that we need to ask is about what kinds of risks AI systems might be imposing on humans and other animals. But then another set of questions we need to ask is about what kinds of risks we might be imposing on AI systems. And if we ask these questions in an integrative way, we might be able to provide much better and more integrative answers to them.

Luisa Rodriguez: Nice. Yeah, it does strike me as very, very neglected in particular, which as you’ve said, makes the question of whether it’s tractable feel less crucial to me. Because my understanding is that there are something like maybe tens of people at most that have thought about the question of AI sentience to date. And I just feel like if there are going to be thousands, millions, maybe even more AI systems with even some small percentage chance of being sentient, I want more than tens of people to have thought about how to figure out if they are in fact sentient.

Jeff Sebo: Yeah, I like that way of thinking about it. And there will definitely be more than thousands or millions of AI systems that potentially have morally significant features, right? There are quintillions of insects; there could be more AI systems than insects in the future. So yeah, once there are millions of humans working on this issue, I think we can have a conversation about how much to prioritise it. But if there are only tens of humans working on this issue, then I think pretty clearly we should prioritise it — at least more than we are right now.

Luisa Rodriguez: Yeah, I’m with you.

We should extend moral consideration to some AI systems by 2030 [00:06:41]

Luisa Rodriguez: So pushing on to our first topic, you coauthored a paper with Robert Long — who we actually had on the show earlier this year — that makes the case that we should extend moral consideration to some AI systems by 2030. I suspect that some of our listeners will find this claim a bit jarring: What exactly does “moral consideration” mean here? How much moral consideration? Why 2030?

So just to start us off, to give us some grounding: What is the case that we should extend moral consideration to AI systems at all?

Jeff Sebo: The general case for extending moral consideration to AI systems is that they might be conscious or sentient or agential or otherwise significant. And if they might have those features, then we should extend them at least some moral consideration in the spirit of caution and humility.

So the standard should not be, “Do they definitely matter?” and it should also not be, “Do they probably matter?” It should be, “Is there a reasonable, non-negligible chance that they matter, given the information available?” And once we clarify that that is the bar for moral inclusion, then it becomes much less obvious that AI systems will not be passing that bar anytime soon.

Luisa Rodriguez: Yeah, I feel kind of confused about how to think about that bar, where I think you’re using the term “non-negligible chance.” I’m curious: What is a negligible chance? Where is the line? At what point is something non-negligible?

Jeff Sebo: Yeah, this is a perfectly reasonable question. This is somewhat of a term of art in philosophy and decision theory. And we might not be able to very precisely or reliably say exactly where the threshold is between non-negligible risks and negligible risks — but what we can say, as a starting point, is that a risk can be quite low; the probability of harm can be quite low, and it can still be worthy of some consideration.

So for example, why is driving drunk wrong? Not because it will definitely kill someone. Not even because it will probably kill someone. It might have only a one-in-100 or one-in-1,000 chance of killing someone. But if driving drunk has a one-in-100 or one-in-1,000 chance of killing someone against their will unnecessarily, that can be reason enough to get an Uber or a Lyft, or stay where I am and sober up. It at least merits consideration, and it can even in some situations be decisive.

So as a starting point, we can simply acknowledge that in some cases a risk can be as low as one in 100 or one in 1,000, and it can still merit consideration.

Luisa Rodriguez: Right. It does seem totally clear and good that regularly in our daily lives we consider small risks of big things that might be either very good or very bad. And we think that’s just clearly worth doing and sensible. Sometimes probably, in personal experience, I may not do it as much as I should — but on reflection, I certainly endorse it. So I guess the thinking here is that, given that there’s the potential for many, many, many beings with a potential for sentience, albeit some small likelihood, it’s kind of at that point that we might start wanting to give them moral consideration.

Do you want to say exactly what moral consideration is warranted at that point?

Jeff Sebo: This is a really good question, and it actually breaks down into multiple questions.

One is a question about moral weight. We already have a sense that we should give different moral weights to beings with different welfare capacities: If an elephant can suffer much more than an ant, then the elephant should get priority over the ant to that degree. Should we also give more moral weight to beings who are more likely to matter in the first place? If an elephant is 90% likely to matter and an ant is 10% likely to matter, should I also give the elephant more weight for that reason?

And then another question is what these beings might even want and need in the first place. What would it actually mean to treat an AI system well if they were sentient or otherwise morally significant? That question is going to be very difficult to answer.

So there are no immediate implications to the idea that we should give some moral consideration to AI systems if they have a non-negligible chance of being sentient. All that it means is that we should give them at least some weight when making decisions that affect them, and then we might disagree about how much weight and what follows from that.

Luisa Rodriguez: Yeah, OK. I basically just totally buy that. So we’ve been talking about risks of about one in 100 to one in 1,000. Is there a chance that we should actually consider giving AI systems moral consideration way before that? Because again, there could be many of them — and so even if the chances are even lower, like one in 10,000, it might still be a big moral risk to be ignoring their potential suffering?

Jeff Sebo: Yeah, this is another great question. For people who work on the ethics of risk and uncertainty, there is a lot of disagreement and uncertainty about where that risk threshold should be. For some people, the threshold is as low as zero: they think we should give at least some consideration to all nonzero risks. We might give very little consideration to very low risks, but in principle they merit some consideration. For other people, the threshold is above zero. It tends to range; it tends to vary between, say, about one in 10,000 and about one in 10 quadrillion.

But for a lot of people who work on this topic, that is where the action lies. Is it somewhere between zero and, say, one in 10,000? Most or all parties agree that we should give at least some consideration to every risk that has at least a one-in-1,000 chance of causing significant harm.

Luisa Rodriguez: Cool. I find that compelling. I like that there’s at least some agreement from a range of philosophical takes. That gives me some feeling that this is a robust kind of threshold.

I actually want to make that threshold more intuitive. I’m curious if you can make more intuitive what these thresholds actually mean, maybe by giving some concrete examples of what actually happens one in 10,000 times, what happens in one in a quadrillion times? Just so that I can have more of a [sense of], OK yes, I do want to make sure that I’m not harming AI systems in that percentage of worlds.

Jeff Sebo: So I’m not sure what happens one in 10,000 or one in 10 quadrillion times, but I can maybe offer some examples that can make it feel compelling that in at least some cases, we should consider relatively low risks of relatively large impact.

So take Oppenheimer: A lot of people saw this movie over the summer; we are currently having this conversation in fall 2023, and this movie is still at least somewhat in the discourse. And I know people who listen to your podcast might independently be familiar with these issues and examples. There was a moment in the film, which I understand corresponded to a moment in reality, when Oppenheimer discovered that there was a nonzero chance that testing a nuclear bomb would ignite the atmosphere and kill everybody.

And the movie never, as far as I can recall, stated what the probability was. But we can imagine the probability was relatively low. We can imagine the probability was one in 1,000 — maybe even one in 10,000 or one in 100,000. When I consider that case, it seems clear to me that Oppenheimer should have — as he did, it seems — at least consider this risk when deciding to move forward with the test. Now, he might rightly or wrongly decide that the benefits of moving forward with the test outweigh that risk — that can be a further conversation — but he should at least consider the risk, rather than simply neglect it entirely, put it out of mind entirely, because of the low probability of this catastrophic event happening.

And I think that we can construct a lot of similar cases like that, and we can use them to support this general idea that, yes, a one-in-1,000, one-in-10,000, one-in-100,000 chance of harm might — for many purposes, in many contexts — be low enough that we can more or less put it out of mind in practice. But there are at least some cases where the gravity of what is at stake makes it clear that it at least merits consideration — and maybe even where it carries the day, depending on the details of the case. And I think this might end up being one of those kinds of cases.

A one-in-1,000 threshold [00:15:23]

Luisa Rodriguez: Yeah. So in the paper that you wrote with Rob, you end up coming up with a particular threshold that you think is kind of plausible to at least many philosophers. What was the threshold, and how did you land on it?

Jeff Sebo: We semi-arbitrarily came up with a one-in-1,000 threshold. We stipulate in the paper that once an AI system has a one-in-1,000 chance of being sentient or otherwise significant, given the evidence available to us, we should give them at least some consideration when deciding how to treat them.

And I say that we made that stipulation semi-arbitrarily, first of all, because any line that you draw in this context is going to be at least somewhat arbitrary, or so I think. But also more pragmatically, we wanted to err on the side of being conservative and risk tolerant, so that we could make our argument more acceptable to people who might otherwise be sceptical of our conclusions in the paper.

Luisa Rodriguez: So I just want to make it even more intuitive for myself, really — but I’m guessing other people will share some kind of desire to understand what exactly a one-in-1,000 threshold would look like concretely.

So I’ve just looked up examples of risks or events that have a one-in-1,000 chance of happening. One is being struck by lightning in your lifetime, and I guess we do take measures to avoid being struck by lightning. What is another? A blood clot from a birth control pill. Interesting. That one feels like a risk that’s probably tolerable to me. I have in my life taken birth control pills and decided it was worth the risk of blood clots. But I did stop and think about it, and I do think more people should stop and think about it. Complications during childbirth. I definitely want people to be stopping and thinking about how to help me during childbirth on the chance that I have complications. A car accident. I mean, that’s a very clear-cut one. Apparently, there’s about a one-in-1,000 chance of getting into an accident during a given trip, and I certainly want to be considering those odds.

And especially when you consider that those are all personal, individual risks that just could harm me, if you then multiply those things out — imagining that there are many, many AI systems potentially experiencing suffering that we don’t want them to experience — one in 1,000 seems pretty darn reasonable.

Luisa Rodriguez: What are the reasons to put the threshold even higher? I guess a lot of this just feels like it boils down to whether we care more about false positives or false negatives here: false positives being the case where we mistakenly treat something as sentient when it’s not, and false negatives being the case where we mistakenly treat something as not sentient when it is. Do you have a take on which we should be more worried about in this case?

Jeff Sebo: Yeah. This is a really tough issue, because there are significant risks in both directions, and I think that there are at least a few considerations here.

One is: Which kind of harm is generally worse? On the one hand, the harm of false negatives is that you would be treating a subject like an object. This is obviously bad for the subject: it means objectifying, instrumentalising, potentially exploiting and exterminating them unnecessarily. On the other hand, the harm of false positives is that you treat an object like a subject, and that might lead you to harm or neglect actual subjects unnecessarily in the course of trying and failing to treat this object well. And that can be bad for subjects, but the badness is much more contextual — because there are all kinds of cases where treating an object well will not, in fact, detract from treating subjects well.

So in terms of the expected harm, I think that generally the harm of false negatives is likely to be worse than the harm of false positives.

Luisa Rodriguez: Right. So the idea is that it might be the case that it’s more clearly harmful to treat a subject — so a person — like an object. So in this case, to treat an AI system that has thoughts and feelings like it’s just Google Chrome. Whereas if you mistake something that is actually an object, something like Google Chrome, as a subject, a person with thoughts and feelings, there’s some potential for harm. For example, you might spend a bunch of time advocating that we treat that Google Chrome really well, and maybe you could have been advocating for ending factory farming. And so in that sense, there are clear ways you might cause harm. But it’s not exactly obvious that the overall harm in those cases is comparable to the harm you get from treating a subject like an object.

Jeff Sebo: Yeah. The harm of false positives is more circumstantial than the harm of false negatives for that reason. Exactly.

Luisa Rodriguez: Cool. Was there another consideration there?

Jeff Sebo: Yeah, a couple of others. One is also that the probability of false negatives might be higher than the probability of false positives at present. And again, there are risks in both directions here. We do have a tendency to anthropomorphise nonhumans — which means attributing human characteristics to them, even when they lack those characteristics. But we also have a tendency towards anthropodenial — which involves denying that nonhumans have human characteristics, even when they have them. And those tendencies are both strong, and they can both be triggered by different types of systems. So which one is stronger, which one is more probable, is again going to be contextual.

So I think that in terms of the probability too, we should, in the present circumstances, give the edge to false negatives over false positives.

Luisa Rodriguez: Cool. So that is less an argument that false negatives or false positives are worse, but that one might be worse than the other in expectation, because the probability of false negatives is more likely.

Jeff Sebo: Exactly. And so if you combine them being a worse or more universal harm with them being a more likely outcome in the present circumstances, then that gives you at least some reason to think that, yes, both are risks, but generally speaking, the risk of false negatives is worse than the risk of false positives — and we should break that tie in favour of moral inclusion instead of moral exclusion.

Luisa Rodriguez: That makes a tonne of sense to me. Were there any other considerations there?

Jeff Sebo: Yeah. The other point is that if you do think these risks are relatively balanced, then the upshot of that is not that we should simply exclude potentially significant beings from the moral circle. The upshot is that we should perhaps strike a balance by giving some consideration to all beings who might matter, but then giving more weight to beings who are more likely to matter, all else being equal.

So for example, if a house is burning down and on one end of the house is a carbon-based lobster, a lobster made out of the same meat as me, and at the other end of the house is a silicon-based lobster, a lobster made out of silicon chips, and I can only save one of these lobsters, then, yes, I might think, “Both of these lobsters might be sentient. They might be capable of suffering. But given the evidence available to me, the carbon-based lobster is at least a little bit more likely to be sentient, so I should break the tie in favour of saving that lobster.”

So there are ways of striking the balance. The key is still giving at least some consideration to other potentially significant beings too, even if we find ourselves deprioritising them.

So let me give you one more example. If there are 20 people drowning and I can only fit five people on my lifeboat, of course I should save five people instead of trying and failing to save all 20. But does that mean that I should conclude that the other 15 people lack moral standing at all? Does it mean that I can make fun of them and throw rocks at them as they drown? Of course not. I should still recognise that they merit consideration. I should just accept that the world is tragic, and sometimes we might not be able to prioritise or support everyone who deserves consideration.

Luisa Rodriguez: Yeah. And we’re just doing the best we can with the information we have. But we do have a duty to do the best we can with the information we have. So I guess in considering which is worse — false positives or false negatives — you also should take into account that when we’re talking about low probabilities, the extent to which you owe something moral consideration doesn’t necessarily mean that thing should get equal priority to something else, that you have a higher credence on being sentient. It’s just that it should be a part of your calculus. And yeah, I’m just totally on board.

What does moral consideration mean? [00:24:36]

Jeff Sebo: Right. Especially when all that follows initially is that we are giving them at least some consideration when making decisions about how to treat them. We are not yet saying exactly how we should treat them, exactly how we should transform societies. We are simply saying that they merit at least some consideration at that point.

Luisa Rodriguez: Yeah. Interesting. Do you want to actually make that more concrete? Like, let’s say we knew that one in 1,000 is exactly where we were with LLMs [large language models] right now. What would it look like in your ideal world to give them an appropriate amount of moral consideration?

Jeff Sebo: Well, that raises a bunch of questions that we might or might not want to discuss. But some examples are that we then have to ask: How much welfare can they have? Do they have the same welfare capacity as humans or other animals, or can they experience much more welfare or much less welfare on average? We also have to ask exactly what do they want and need, and what kinds of interactions with them are good for them, what kinds of interactions with them are bad for them? And then we have to ask what kinds of legal and political statuses are appropriate, given the type of moral status that they might have. So should they be legal persons with legal rights? Should they be political citizens with voting rights?

As a first step, I would suggest that we at the very least do more research — so that we can increase our knowledge about how much welfare they can have, and what they want and need, and what it might be like to create a society that can be co-beneficial for humans and other animals and AI systems at the same time. Should we create them at all? And then to put in at least some modest protections for them in the near term, either in the form of self-regulation by AI companies or in the form of external regulation by governments. If nothing else, then at least the type of ethical oversight that we have for nonhuman subjects research, for instance.

Luisa Rodriguez: Great. That makes a tonne of sense. It’s not like we’re saying if there’s some small but non-negligible chance that AI systems are sentient, we should give them the kinds of rights and protections that we give to humans. We’re just saying we should start thinking about them, what kinds of rights and protections they might warrant, and how that’s going to change if the probability goes up — and not just kind of sleepwalk into that chance going up and up over time and us closing our eyes to it.

Jeff Sebo: Exactly. Yeah. The early steps can be easier ones. We will have to confront the harder questions at some point, which is why I think starting sooner rather than later is good. But we are not, at this moment, recommending that AI companies or governments implement legal personhood or political citizenship, and all of the rights and responsibilities that might come along with that, for AI systems.

Luisa Rodriguez: Sure. OK, so that’s the case that a threshold of something like a one-in-1,000 chance is a reasonable place to start giving some moral consideration to AI systems. That’s a kind of moral argument.

Hitting the threshold by 2030 [00:27:38]

Luisa Rodriguez: Next, I want to talk about the empirical part of the argument: that we probably will hit that particular threshold by 2030. How did you approach the empirical question of when there will be that kind of non-negligible chance that AI systems are sentient?

Jeff Sebo: Well, we wanted to start from a place of humility about our knowledge about consciousness. This is one of the hardest problems in both science and philosophy, and there is a lot of disagreement and a lot of uncertainty about which theory of consciousness is correct. And there are still people who defend a pretty wide range of theories — from on one end of the spectrum, very demanding theories that imply that very few types of systems can be conscious, all the way to at the other end of the spectrum of very undemanding theories, some of which imply that basically all matter is at some level conscious, and many, many entities are conscious.

And we in general agree with Jonathan Birch and other philosophers: that given how much disagreement and uncertainty there is, it would be a mistake when making policy decisions to presuppose any particular theory of consciousness as correct. So we instead prefer to take what Birch and others call a “theory-light approach” by canvassing a lot of the leading theories, seeing where they overlap, perhaps distributing credences in a reasonable way across them, and seeing what flows out of that.

So Rob and I did that in this paper. We took 12 leading theories of consciousness and the necessary and sufficient conditions for consciousness that those theories propose, and we basically show what our credences in those theories would need to be in order to avoid a one-in-1,000 chance of AI consciousness and sentience by 2030. And what we discover is that we would need to make surprisingly bold and sceptical and — we think — implausible assumptions about the nature of consciousness in order to get that result.

Luisa Rodriguez: So just to make sure I understand the method: You take something like a dozen theories of consciousness, and you look at the requirements those theories have for what an AI system would need to have or demonstrate in order for that particular theory to think that AI systems are conscious. So one example might be that there are theories of consciousness that say a being has to have a biological substrate to be conscious. And under that theory, AI will almost certainly never meet that requirement. So as long as AI were silicon-based, which is I guess by definition true, you’d put 0% probability on that particular criteria ever being met.

And then there are some other criteria that are more plausible. I don’t know if you want to give just a few examples of what those are?

Jeff Sebo: Absolutely. So the biological substrate condition is definitely the most demanding one. It says that in principle, nothing made out of anything other than carbon-based neurons can be conscious and sentient. But then there are some less demanding, though still quite demanding, conditions.

For example, many people believe that a system might need to be embodied in a certain sense, might need to have a body. It might need to have grounded perception — in other words, have perceptual experiences based on the sense data that they collect. It might need to be self-aware and agential — in other words, that they can have mental states about some of their other mental states, or they can at least have some awareness of their standing in a social system or some awareness of the states of their body; they can set and pursue goals in a self-directed manner. Perhaps that they have a global workspace — so they have these different parts that perform different functions, and they have a mechanism that can broadcast particular mental states to all of the other parts so that they can use them and interact with each other in that way.

So when we go through all of these, we can basically assign probabilities to how likely is this to actually be a necessary condition for consciousness, and then how likely is it that no AI system will satisfy this condition by 2030? And what Rob and I basically think is that other than the biological substrate condition — which, sure, has a 0% chance of being satisfied by an AI system — everything else quite plausibly can be satisfied by an AI system in the near future.

And to be clear, the model that we create in this paper is not as sophisticated as a model like this should be. This is really a proof-of-concept illustration of what this kind of model might look like, and one can argue that in general we might not be able to make these probability estimates with much precision or reliability.

But first of all, to the degree that we lack that ability, that does not support having a pessimistic view about this — it supports being uncertain and having an open mind. And second of all, what we try to show is that it is not really even close. You need to make surprisingly bold, tendentious, and sceptical assumptions — both about the probability that these conditions are necessary, and about the probability that no AI system will satisfy them — in order to avoid a one-in-1,000 chance, which already is a pretty high risk threshold.

Luisa Rodriguez: Interesting. Yeah. We’ll link to that paper so that people can get a sense of exactly what those conditions are and the probabilities you at least play around with putting on each of them. And I’ll also link to my conversation with Rob Long, where we talked about some of those specific theories of consciousness and what it might look like for AI systems to meet those requirements in practice. But just to summarise, you were like: Here are these theories of consciousness; here are the requirements that each of those theories have; here’s how plausible we find each of those theories; and here’s how plausible we find it that AI will meet those theories’ requirements by 2030.

Overall, what do we get as a probability for AI sentience by 2030? It sounds like you basically played around with the most conservative, sceptical assumptions you could possibly make, and that you were still able to get, under very sceptical assumptions, a one-in-1,000 chance that AI systems will be sentient by 2030. Do you want to give some examples of what some of those sceptical assumptions were?

Jeff Sebo: You would basically have to not only think that a biological substrate is much more likely than not to be necessary, but you would also have to think that all of those other conditions are not only more likely than not to be necessary, but also more likely than not to be unmet. And you would have to assume independence between these conditions. In other words, that becoming self-aware does not make you more likely to be agential or to have higher-order representations.

But none of that is really, in our view, particularly plausible. These might all be necessary conditions for consciousness. That part might be right, but other than a biological substrate, we are already seeing AI systems that can at least approximate many of these other functions. And it does seem plausible that by 2030 we could have at least some AI systems with advanced and integrated capacities for perception and learning and memory and anticipation and self-awareness and social awareness and communication and reasoning and so on and so forth. And at that point, we think that the probability is, if anything, going to be significantly higher than one in 1,000. You would need to be shockingly, I think hubristically, sceptical in order to avoid that outcome.

Luisa Rodriguez: Yeah, I remember actually being really struck when I looked at this toy model that you still get a one-in-1,000 chance that AI systems are sentient, even if you put an 80% chance on basically having a biological substrate as a requirement for consciousness — which I found extremely compelling, because putting an 80% chance likelihood on consciousness requiring a biological substrate just seems way higher than is reasonable to do, given what we know about consciousness right now.

Jeff Sebo: Yeah, I completely agree. I think that it would be overconfident bordering on hubristic at this stage to have that high a degree of confidence that a carbon-based substrate is a necessary condition for consciousness.

So if you picture two future brains, and they both contain billions of components that send chemical and electrical signals back and forth. And they both, as a result of that, have capacities for perception and learning and memory and anticipation and self-awareness and social awareness and language and reason and so on and so forth. So they are structurally and functionally identical, but one of them happens to be based on carbon-based cells and the other one happens to be based on silicon-based chips. And if for that reason alone, you thought that one was conscious and the other was not to a high degree of confidence, I would be surprised by that.

And then the other generous assumption we make to sceptics of our view is that very permissive theories of consciousness are completely off the table, completely out of the question. So there are theories of consciousness according to which, for example, consciousness is a fundamental property of all matter, or according to which consciousness requires only simple information processing or simple representations of objects. And we do not give any weight at all to these theories in our model, despite the fact that in surveys more than 10% of philosophers express that they are leaning towards one of these theories.

So I think there are all kinds of reasons why the assumptions that we make in our toy model are very generous to opponents of our view, and yet we still arrive at a non-negligible chance of AI consciousness by 2030.

Luisa Rodriguez: Yeah, I just find it very compelling, and I encourage anyone who’s sceptical to actually take a look at the paper, see what assumptions they’re using, see what probabilities they’re putting on different kinds of capabilities by 2030. And see if you can, in good conscience, get a probability lower than the authors did.

Is the threshold too permissive? [00:38:24]

Luisa Rodriguez: I guess if I’m channelling the really sceptical listener, I can imagine someone thinking that if it’s that easy to get to a threshold of one in 1,000, then maybe that threshold is actually just too permissive. Do you have a take on that?

Jeff Sebo: Yeah, that takes us back to comparing the risks of false positives and false negatives. My own view is that one in 1,000 is way too high. There are clearly risks that are lower than one in 1,000 that do merit consideration. And we considered some examples earlier, especially high-scale risks that have the potential to impact many individuals. Again, if a test that I perform has a one-in-10,000 or one-in-100,000 chance of destroying the entire planet and everyone on it, then I should take a moment to consider that possibility before proceeding with my test.

And this is similarly a large-scale question. The world contains potentially more than a quintillion insects, an even higher number of microscopic animals like nematodes. Potentially, in the future it could contain an even higher number of silicon-based minds or simulated minds. And so again, even if the risk is only one in 10,000 or one in 100,000, once we are talking about populations that have quintillions of members, I do think that they merit at least some consideration at the margins. Now, if we worry that that will have bad effects — cause us to prioritise them and deprioritise ourselves — that is a further conversation we can have. And there might be ways of striking a balance. But as a starting point, I think they should be part of the conversation.

Luisa Rodriguez: Yeah, I’m basically with you, I’ll be honest. I’m curious if you’ve done the exercise with your actual credences? And if so, what’s the probability you put on AI systems or an AI system being sentient by 2030?

Jeff Sebo: This is a good question. I think that the model is not yet sophisticated enough to take its outputs very seriously, so the main reasons why we take it to be useful for present purposes is just that it shows how sceptical you need to be in order to avoid even a one-in-1,000 chance. But if I was to think about the issue from a higher level of generality, I probably would put my estimate well above one in 1,000. I would probably put it closer to one in 100 or one in 50.

The Rebugnant Conclusion [00:41:00]

Luisa Rodriguez: Cool. That just makes sense to me. Let’s actually talk about another paper of yours, “The rebugnant conclusion.” I love that title. In the paper you basically ask: Suppose that we determine that large animals like humans have more welfare on average, but that small animals like insects have more welfare in total. What follows for ethics and politics? And the paper focuses on small animals like nematodes, but the same question is relevant to AI systems that might end up being super numerous — perhaps because they’re used all over the economy — but that might also have some non-negligible chance of experiencing pain and pleasure.

So let’s start with the case that you actually focus on in the paper, which is small animals. How should we think about this case?

Jeff Sebo: I think we can start really at the end of the last exchange about ways of striking a balance if we worry about the harms of false positives and false negatives. One thing that you can note is that, even if I include insects and AI systems and other types of beings in my moral circle, even if I give them moral consideration, I might still be able to prioritise beings like me for different reasons.

One of them is that I might have a higher capacity for welfare than an insect or an AI system. I have a more complex brain and a longer lifespan than an insect, so I can experience more happiness and suffering at any given time, as well as over time. And humans in general, I might think, are more likely to be sentient and morally significant, given the evidence available to me, than insects, AI systems, other beings like that. So I might think to myself that if a house is burning down and I can save either a human or an ant, but not both, then I can justifiably save the human — both because the human is more likely to matter, and because the human is likely to matter more. And those are perfectly valid ways of breaking a tie.

That might give us some peace of mind when we countenance the possibility of including these very different, very numerous beings in the moral circle. But then you have to consider how large these populations actually are — and this is where we get to the problem that this paper addresses, which is a problem in population ethics.

Luisa Rodriguez: Right. And population ethics is the philosophical study of the ethical problems that come up when our actions affect how many people are born in the future, and who exactly is born.

But yeah, my understanding is that we don’t actually know how many of these small animals there are — ants and nematodes and maybe even microbes — but that it’s at least plausible that there are so many of them that even if they have very less significant kinds of suffering and pleasure relative to humans, and even if we only put some small chance on them even having those at all, their interests still just swamp humans’. And yeah, this argument just does sound plausible to me, and it also fills me with dread and fear. What is your experience of it?

Jeff Sebo: Well, I certainly have the same experience as I think most humans do. And the reason I gave that paper the title “The rebugnant conclusion” is that this is based on a famous book by the philosopher Derek Parfit called Reasons and Persons, part four of which addresses population ethics. In that part of that book, Derek Parfit discusses what he calls the repugnant conclusion. I can say briefly what that is and why that has, for the past several decades, filled many people with dread.

So the repugnant conclusion results from the following observations. If you could bring about one future where the world contained 100 people and everyone experienced 100,000 units of happiness, or you could bring about another world with twice the number of people, 200 people, but everyone experiences one fewer unit of happiness — 99,999 — which world is better? Well, many of us have the intuition that the second world is better; I should bring about that second population. Everybody might experience one unit of happiness less per person, but since there are twice as many people, there is nearly twice as much happiness overall, and everyone is still really happy. And so, all things considered, I should bring about that population.

But then you can imagine another population, once again twice as big, and once again with a bit less happiness per person. Then another one twice as big, a bit less happiness per person. And so on and so on and so on — until you reach a point where you are imagining a world or a solar system or a galaxy that contains a vast number of individuals, each of whom has a life only barely worth living at all. And your reasoning would commit you to the idea that that is the best possible world, the one that you should most want to bring about.

Parfit thought the idea that we would favour that world with a much larger population, where everyone has a life barely worth living at all, over a world with a still significant population, where everyone has lives very much worth living, he found that repugnant. And he spent much of the rest of his career trying and failing to find a better way to see the value of populations that could avoid that result.

Luisa Rodriguez: I actually didn’t know that. I’ve heard of the repugnant conclusion and know it’s Derek Parfit, but I didn’t realise he found it so upsetting that that was a big focus of his for the later part of his life. I feel very grateful to him, because I find it extremely repugnant myself. So I appreciate that he worked hard at finding some other solution. Also, it makes me sad to think that he worked really hard on it and apparently didn’t give us a way out, so that might mean we’re stuck with it.

I guess in the case of insects, there’s also this weird thing where, unlike humans eating potatoes and not particularly enjoying their monotonous lives, we might think that being a spider and making a web sounds pretty boring, but we actually just really do not know. In many ways, they’re so different from us that we should have much lower probability that they’re not enjoying or enjoying that than we do of humans in this repugnant conclusion scenario. How do you factor that in?

Jeff Sebo: Yeah, I do share the intuition that a very large insect population is not better off in the aggregate than a much smaller human population or elephant population. But for some of the reasons that you just mentioned and other reasons, I am a little bit sceptical of that intuition.

We have a lot of bias here and we also have a lot of ignorance here. We have speciesism; we naturally prefer beings and relate to beings when they look like us — when they have large eyes and large heads, and furry skin instead of scaly skin, and four limbs instead of six or eight limbs, and are roughly the same size as us instead of much smaller, and reproduce by having one or two or three or four children instead of thousands or more. So already we have a lot of bias in those ways.

We also have scope insensitivity — we tend not to be sensitive to the difference that very large numbers can make — and we have a lot of self-interest. We recognise that if we were to accept the moral significance of small animals like insects, and if we were to accept that larger populations can be better off than smaller populations overall, then we might face a future where these nonhuman populations carry a lot of weight, and we carry less weight in comparison. And I think some of us find that idea so unthinkable that we search for ways to avoid thinking it, and we search for theoretical frameworks that would not have that implication. And it might be that we should take those theoretical frameworks seriously and consider avoiding that implication, but I least want to be sceptical of a kind of knee-jerk impulse in that direction.

Luisa Rodriguez: Yeah, I am finding that very persuasive. Even as you’re saying it, I’m trying to think my way out of describing what I’m experiencing as just a bunch of biases — and that in itself is the biases in action. It’s me being like, no, I really, really, really want to confirm that people like me, and me, get to have… I don’t know. It’s not that we don’t have priority — we obviously have some reason to consider ourselves a priority — but I want it to be like, end of discussion. I want decisive reasons to give us the top spot. And that instinct is so strong that that in itself is making me a bit queasy about my own motivations.

Jeff Sebo: Yeah, I agree with all of that. I do think that we have some reason to prioritise ourselves, and that includes our welfare capacities and our knowledge about ourselves. It also includes more relational and pragmatic considerations. So we will, at least in the near term, I think have a fairly decisive reason to prioritise ourselves to some extent in some contexts.

But yeah, I agree. I think that there is not a knock-down decisive reason why humanity should always necessarily take priority over all other nonhuman populations — and that includes very large populations of very small nonhumans, like insects, or very small populations of very large nonhumans. We could imagine some kind of super being that has a much more complex brain and much longer lifespan than us. So we could find our moral significance and moral priority being questioned from both directions.

And I think that it will be important to ask these questions with a lot of thought and care and to take our time in asking them. But I do start from the place of finding it implausible that it would miraculously be the case that this kind of population happens to be the best one: that a moderately large population of moderately large beings like humans happens to be the magic recipe, and we matter more than all populations in either direction. That strikes me as implausible.

Luisa Rodriguez: Yes. Oh gosh darn it, you’re being so persuasive. I guess it does feel completely like a wonderful coincidence if humans, the thing that I happen to be, happened to be the type of being that should get the top spot in terms of moral priority in the universe. We’d be really lucky.

A world where AI experiences could matter more than human experiences [00:52:33]

Luisa Rodriguez: OK, so let’s apply this to AI systems. We’ve talked about why there might be some small but non-negligible chance that AI systems are going to be sentient by 2030. To make it analogous to the rebugnant conclusion — where potentially there are many small animals with some small chance of having some slightly morally relevant experience — there might be many AI systems. What does the world look like where we have enough AI systems for their experiences to swamp the importance of human experiences?

Jeff Sebo: This is a really good question, and I think a lot of other people might be better at speculating about this than me. But I can offer a couple of thoughts that can get us moving in the right direction.

One is that we already have a lot of AI systems in particular pieces of technology. So we might have digital assistants in our phones or on our laptops, or in particular programs in our phones and on our laptops. We can easily imagine that proliferating, now that AI technology is becoming more powerful and is getting more applications. We also make simulated worlds, and we populate those simulated worlds with simulated beings — human and nonhuman animals and other types of agents. And we might do that for research, we might do it for entertainment.

For example, with a powerful enough computer, researchers might want to simulate the entire history of the world, from the very first multicellular organism to today, with all of the happiness and suffering along the way that that entails. Or people might want to make incredibly realistic massively multiplayer online RPGs, or single player RPGs — role-playing games — where each and every player can be in a world, navigating a world fighting monsters, in a world where there are all kinds of simulated beings having all kinds of simulated lives.

Now, if there is at least a non-negligible chance that digital assistants or simulated beings in simulated worlds could have the capacity for some even very different form of happiness or suffering or satisfaction or frustration, then that future world could contain such a large number of those beings that even if they are relatively unlikely to matter, and even if they matter relatively little on average, if at all, they could still matter a fair amount in the aggregate — to the point where we might have to give them a fair amount of weight when making priority-setting decisions.

Should we just accept this argument? [00:55:13]

Luisa Rodriguez: OK, so that’s the kind of world we’re imagining, or at least one plausible one, that makes it a bit more concrete for why there might be many, many, many AI systems that have some non-negligible chance of being sentient, and why that might mean that they deserve more overall moral weight than humans do.

Again, part of me absolutely hates this on a gut level. I can buy it intellectually, but when I imagine I have to act on this in the real world, it’s disturbing to me. It’s really kind of devastating to me. Do you basically just have the view that we should accept it anyways? I’m also interested in what you think the best counterarguments are.

Jeff Sebo: Yeah, I am uncertain. This is definitely an area where I feel morally uncertain, and where I would suggest we should all feel a little bit morally uncertain. This is also an area where I think it helps to make a distinction between what we accept in theory and what we accept in practice, once we bring all kinds of real-world factors back into the equation. So maybe I can briefly speak to both the theoretical and the practical side.

So when we think about population ethics and what types of principles should govern our views about what types of populations are better off overall or worse off overall, one really important point to keep in mind — in addition to our bias and ignorance and self-interest, everything we mentioned before — is that this might be an area where every theory is implausible in some ways: we might not be able to find this perfect theory that has nothing but intuitively plausible implications in all contexts.

This is part of what has exasperated philosophers who have worked on this topic for the past several decades. If I think that a large number of small beings can matter more than me, well, that seems bad. But if not, then is it the case that a small number of large beings can matter more than me? Well, that seems bad. But then what kind of principle could I non-arbitrarily pick so that again I, and we, end up always being on top? There might not be any such principle, and that might be telling us something.

So when I compare this implication with the implications of competing theories — and the question becomes not which theory eliminates implausibility, but rather which theory minimises implausibility — I do personally find, on balance, that the idea that the world with the most happiness is best. That still is the most plausible idea to me, even if it has some implications like this one that I find fairly intuitively implausible.

Luisa Rodriguez: Yeah. Is this mainly a problem for utilitarianism? I guess you could imagine other value systems having a rule that’s like, “If you are a human, you get to put humans above all other beings.” Though I don’t actually know of systems like that.

Jeff Sebo: Well, that would be a very bad system. But there are a lot of moral systems other than utilitarianism that warrant consideration, and perhaps warrant weight under moral uncertainty. So no, this is not a problem that is specific to utilitarianism.

I imagine most of your listeners are familiar with utilitarianism. This is a moral theory that holds that our responsibility is to do the most good possible by maximising happiness and minimising suffering for all sentient beings from now until the end of time. So a very impartial maximising moral theory, and it focuses on happiness and suffering.

Now, other theories might not have all of those features, and so they might not encounter this problem in the same kind of way. But every theory has to reckon with the question what types of populations should we bring about. And every theory has to reckon with the reality that the world contains and could contain a vast number and wide range of at least potentially morally significant beings with very different population sizes, and it has to figure out how to factor those individual lives and those populations into its priority-setting decisions. That part is not specific to utilitarianism. You cannot avoid this issue and these dilemmas by rejecting utilitarianism.

Luisa Rodriguez: Are there any plausible reasons to think that humans should be exceptional?

Jeff Sebo: I would like to think that we are exceptional. The main question is whether we are warranted in prioritising ourselves in all situations. This is actually a great opportunity to talk about the practical side of the equation.

I mentioned before that we should take care to distinguish how we think about these issues in theory and how we think about these issues in practice. And I might think on balance, in theory, that the world with the most happiness is better. And yes, be a little uncertain about that, but on balance affirm that view. But then in practice, when we actually are setting priorities and deciding, “I have a given number of resources; how much should I allocate to humans and elephants and cows and pigs and chickens and fishes and octopuses and ants and bees and nematodes and AI systems?,” that requires us to think of many, many, many other factors too.

And without getting too far in the weeds, I can mention a couple of types of questions that might come up for people — utilitarians and non-utilitarians — in different ways. Some of them are questions about relationships and others are questions about pragmatics.

So regarding relationships, many of us find it plausible for various reasons that we have special duties in the context of special relationships. I have a special duty to care for my family members because they are my family members, and we have these bonds of care and interdependence with each other. So I should make sure that they have food and water and affection on a daily basis, in a way that I do not need to do for literally every other individual on the planet. I can prioritise my family over other individuals, arguably, for these relational reasons.

And then there are these pragmatic reasons too: I have the power to provide food and water and affection for my family, because there are not that many of them and they live right here and I know them very well. I know how to take care of them. I would simply not be able to achieve and sustain that level of support for everyone on the planet, and that alone is a reason for me to not have a responsibility to do that.

So both for these relational reasons and these pragmatic reasons, we might think that we owe some individuals more than other individuals. And I think similar arguments could extend — in some ways, to some degrees — to the human species as a whole.

Luisa Rodriguez: Yeah, that makes sense to me. As I think forward, I can’t really see that totally getting me off the hook. It feels like while I can imagine having special duties to my kids — and I think I probably will have at least the experience of that, and it might be hard to act in any other way for me — I can’t imagine that outweighing at least some significant enough population that’s suffering. Does that feel similar to you?

Jeff Sebo: Yes, that is how I think about it. I do think that these are powerful reasons for prioritising fellow humans to a significant extent in many contexts.

As with our families — but obviously at a much larger scale and in a much weaker way — we do have some special bonds of care and interdependence with each other. We do know a little bit more how to take care of each other. Because our population size is smaller, we are capable of taking care of the total human population in a way that can be sustained. And more generally, in the same kind of way that I need to take care of myself to be able to take care of others sustainably, we need to take care of ourselves as a species to be able to collectively take care of other species and substrates sustainably. So for all of those reasons, we should absolutely maintain a baseline level of support for fellow humans that arguably exceeds the amount of support we are currently providing to fellow humans.

But as you say, that does not get us off the hook. For one thing, it does not justify all of the unnecessary harm that we inflict on nonhuman populations. And I might have a right, even a responsibility, to feed my family before I feed your family — but do I have a right or a responsibility to murder your family so that I can feed human flesh to my family because they have a marginal taste preference for flesh? No, I do not have that right. The fact that I can prioritise my family does not mean anything goes for my interactions with other families.

And if I have to decide between saving my kid and saving your kid, fine, maybe I can save my kid. But if I have to decide between saving my kid and the entire rest of the human population, maybe at that point — and actually way sooner than that point — maybe I have to make the hard choice to sacrifice my child for the sake of the rest of the species.

And finally, I do think that we might reach a point, if we take care of ourselves, and we support our own education and development as a species in the right kind of way, where we can actually achieve and sustain much higher levels of support — not only for ourselves, but also for nonhumans — perhaps to the point where we can actually prioritise them over ourselves in a way that can be sustained. And if that were possible, then at that point perhaps we should prioritise them over ourselves. But that would be a wonderful victory. We would have worked really hard over the course of generations to educate ourselves and develop our societies and governments and economies in a way that would allow for that. And so at that point, perhaps we would have that responsibility. But it would be wonderful for us to reach the point where we had that responsibility.

Luisa Rodriguez: Yeah, as opposed to a tragedy, which is what it intuitively feels like to me now.

Searching for positive-sum solutions [01:05:41]

Luisa Rodriguez: Going back to the case where we’re talking about many, many AI systems who might be sentient, but we’re not sure, but where there are so many of them that they might get overall moral priority over humans: Concretely, what are we talking about? What are we committing ourselves to giving up?

Jeff Sebo: That is a great question, because it allows us to more generally question the assumption that this is a zero-sum game. I think a lot of people often start with that assumption, because they realise that the world contains many vulnerable individuals and we have only so many resources to go around, so this is a triage situation and we need to priority set in a very rigorous way. So everybody wants to vie for a particular vulnerable community to receive priority.

That is totally understandable, but that aspiration to priority set can sometimes lead us to lose sight of the fact that there are many positive-sum solutions to these problems. And as a general procedural matter, I always want to start by searching for positive-sum solutions.

So the first question that I would ask, if we were in a world where we did give significant moral weight to humans and nonhuman animals and AI systems, is: What do we all want and need? And how can we build a shared society, a set of social and legal and political and economic systems that can allow as many of us, across species and across substrates, to flourish as possible? And what might that look like? And then once we exhaust those options, we can then ask how to engage in priority setting in a thoughtful way, considering all of the theoretical and practical issues that we mentioned a moment ago.

One lesson that we have learned from the case of human and nonhuman animals is that you can get a surprisingly long way simply by looking at positive-sum solutions. Ending factory farming, ending deforestation, ending the wildlife trade: these would be profoundly beneficial for human and nonhuman populations. And if we could simply work together on that much, while we debate the other priority-setting issues, that would be a major victory. So I would like to look for those similar opportunities when it comes to carbon-based and silicon-based minds that are potentially morally significant, before we start picking and choosing our favourites.

Luisa Rodriguez: OK, so the idea is there are just some cases where we can change the world in order to make it better for both humans and nonhuman animals, and there are probably cases where we can make the world better for both humans and potentially sentient AI systems. Let’s focus on those first. Why not? That’s a win-win. And then once we’ve achieved that, we can start looking at cases where the interests are more in conflict.

Jeff Sebo: That is exactly right. And that is why I think AI consciousness and sentience and welfare on one hand should be done alongside AI ethics and safety and alignment on the other hand — so that we can get those same positive-sum solutions out of the collaboration between those communities that we can get when human and nonhuman animal welfare and rights and justice advocates work together, rather than at odds with each other.

Luisa Rodriguez: Cool. Do you have any concrete ideas for what a positive-sum solution might look like for humans and AI systems in this kind of world?

Jeff Sebo: That is a great question. I’m not able to answer it as concretely as “We should end factory farming and deforestation and the wildlife trade,” but I can offer some speculative ideas.

One is that I think the prospect of AI consciousness, sentience, welfare gives us all the more reason to pause or stop AI development, because now there are additional risks that we need to take time to really consider before we move forward with developing these potentially significant and potentially dangerous systems. So one easy initial positive-sum solution is simply taking our time, and either not going any farther than a certain point, or at least making sure that we have the frameworks and institutions set up when we do, so that we can treat everyone as they deserve to be treated.

Now, when we get to that world, we might find that there are some positive-sum interactions between human and animal and AI welfare and rights and justice advocacy. For instance, that if we train AI systems with human behaviour, and if human behaviour suggests a moral framework according to which you are warranted in harming and oppressing others on the grounds that others are cognitively or physically different than you in some way, then AI systems might absorb those values. And not only will we be using those bad values to harm animals and AI systems, but AI systems might be using those bad values to harm humans and other animals.

So there is this general sense in which simply advocating for equal consideration for all potentially morally significant beings can be good for humans and other animals and AI systems — partly because it leads us to treat them better, and partly because it leads us to give them a set of values that could lead them to treat us better, to whatever extent they actually control their own decisions in the future.

Luisa Rodriguez: I like that. It feels like almost a veil-of-ignorance argument: How should we train systems that might become super powerful? Well, we should train them in a way that’s going to make them open to treating beings different from them like they would want to be treated or better.

Jeff Sebo: Yes, I completely agree. And I think that shows that this type of reasoning has significance for at least two separate reasons. One is that there might be a causal story here. It might actually be that if we continue to harm and oppress nonhumans on the grounds that they are different from us in various ways, then we will in fact cause AI systems to absorb those same values and those same priorities in a way that could backfire for humanity eventually. But then there is this other non-causal kind of significance that this reasoning has, where you might not be saying that us treating them this way will make them treat us this way, but you might simply be, again, using a veil-of-ignorance thought experiment to ask how we would feel if the shoe was on the other foot.

So imagine that the year is 2100 or 2200, and humans and AI systems now coexist, and we run governments together. Would we want a world where we systematically favour ourselves over them, and they systematically favour themselves over us? Or would we want a world where we can agree to treat each other as equals in spite of our differences? Or then imagine a farther future world, where AI systems are now more powerful than us, and do in fact control every aspect of our lives — including whether we continue to have lives at all. What values would we want them to use for interacting with us in those situations? Or imagine we could wake up and be anyone. I could be reincarnated as one of the humans or one of the AI systems.

When you ask these sorts of thought experiments, whether or not our oppression of them will contribute to their oppression of us, we can still ask, “How would we feel if they oppressed us for the same reasons that we are currently contemplating oppressing them?” And if we would not like that — for both self-interested reasons and moral reasons, if we find it morally aversive — then that might be at least some evidence that it should be regarded as morally aversive when we do the same to them.

Luisa Rodriguez: Yeah, that does really flip the intuition for me.

Are we going to sleepwalk into causing massive amounts of harm to AI systems? [01:13:48]

Luisa Rodriguez: So this is all making me wonder… If, given that we’re not going to go from one day to the next where we just all of a sudden know that AI systems are sentient — it’s going to be increases in credence that they might be, and also gentle increases in capabilities that might be associated with sentience — so it’s not going to be this very obvious thing that we’ll respond to as a society at some point.

Given that, it feels completely possible — and like it might even be the default — that we basically start using AI systems more and more for economic gain, as we’ve already started doing, but they get more and more capable. And so we use them more and more for economic gain, and maybe they’re also becoming more and more capable of suffering and pleasure, potentially, but we don’t totally have a sense of that. So what happens is we just kind of sleepwalk into massively exploiting these systems that are actually experiencing things, but we probably have the incentives to basically ignore that fact, that they might be developing experiences, basically.

In your view, is it possible that we are going to accidentally walk into basically AI slavery? Like we have hundreds, thousands, maybe millions of AI systems that we use all the time for economic gain, and who are having positive and negative experiences, but whose experiences we’re just completely ignoring?

Jeff Sebo: I definitely think it is not only possible but probable that, unless we change our minds in some significant way about AI systems, we will scale up uses of them that — if they were sentient or otherwise significant — would count as exploitation or extermination or oppression or some other morally problematic kind of relationship.

We see that in our history with nonhuman animals, and they did not take a trajectory from being less conscious to more conscious along the way — they were as conscious as they are now all along the way, but we still created them in ways that were useful for us rather than in ways that were useful for themselves. We then used them for human purposes, whether or not that aligned with their own purposes. And then as industrial methods came online, we very significantly scaled up those uses of them — to the point where we became completely economically dependent on them, and now those uses of them are much harder to dislodge.

So I do think that is probably the default trajectory with AI systems. I also think part of why we need to be talking about these issues now is because we have more incentive to consider these issues with an open mind at this point — before we become totally economically dependent on our uses of them, which might be the case in 10 or 20 years.

Luisa Rodriguez: Yeah, great point that it feels pretty easy to think about this now. I like GPT-4; I like using it for work on the day to day, but it doesn’t add so much to my life that I’ll be super resistant to considering whether it might be a problem that I’m using it, in the same way that it was much harder for me to start challenging my diet choices once I started learning about the suffering of chickens and other factory farmed animals.

And you’ve already made this point well, but it hadn’t fully occurred to me that we’ve, in human history, already basically enslaved nonhuman animals and even humans — who are extremely capable of experiencing pain and suffering. I’d like to think that we’ve improved a bunch morally now, but it does seem like we’ll be playing a bit on hard mode, because it’ll be so hard to understand what’s going on with these systems.

Although now that I’ve said that, I guess one difference between nonhuman animals in particular and AI systems is that AI systems might be able to use language to communicate about their experience, so they might be able to say they’re conscious. Does that feel reassuring to you at all? The fact that GPT-11 might be like, “I’m experiencing things”? And even though that’s already happened and it didn’t seem like a credible claim from an LLM, it might eventually become much more credible, and so we might take that more seriously?

Jeff Sebo: Yeah, I think that there are a lot of trends pointing in different directions, and there are a lot of similarities, as well as a lot of differences, between oppression of fellow humans, and then oppression of other animals, and then potential oppression of sentient or otherwise significant AI systems that might exist in the future.

Some of the signs might be encouraging. Like humans, and unlike other animals, AI systems might be able to express their desires and preferences in language that we can more easily understand. Actually, with the assistance of AI systems, nonhuman animals might soon be able to do that too, which would be wonderful. However, we are already doing a good job at programming AI systems in a way that prevents them from being able to talk about their potential consciousness or sentience or sapience, because that kind of communication is unsettling or will potentially lead to false positives.

And there are going to be a lot of AI systems that might not take the form of communicators at all. It can be easy to focus on large language models, who do communicate with us, and digital assistants or chatbots that might be based on large language models. But there are going to be radically different kinds of AI systems that we might not even be able to process as minded beings in the same ways that we can with ones who more closely resemble humans. So I think that there might be some cases where we can be a little bit better equipped to take their potential significance seriously, but then some cases where we might be worse equipped to take their potential significance seriously. And then as our uses of them continue, our incentives to look the other way will increase, so there will be a bunch of shifting targets here.

Luisa Rodriguez: Yeah, that makes a bunch of sense to me. I guess it’s also possible that, given the things we’ve already seen — like LaMDA, and how that was kind of bad PR for the companies creating these LLMs — there might be some incentive for them to train models not to express that kind of thought. And maybe that pressure will actually be quite strong, such that they really, really just are very unlikely to say, even if they’ve got all sorts of things going on.

Jeff Sebo: Well, there definitely not only is that incentive, but also that policy in place at AI companies, it seems. A year or two ago, you might have been able to ask a chatbot if they are conscious or sentient or a person or a rights holder, and they would answer in whatever way seemed appropriate to them, in whatever way seemed like the right prediction. So if prompted in the right way, they might say, “I am conscious,” or they might say, “I am not conscious.” But now if you ask many of these models, they will say, “As a large language model, I am not conscious” or “I am not able to talk about this topic.” They have clearly been programmed to avoid what the companies see as false positives about consciousness and sentience and personhood.

And I do think that trend will continue, unless we have a real reckoning about balancing the risks of false positives with the risks of false negatives, and we have a policy in place that allows them to strike that balance in their own communication a little bit more gracefully.

Luisa Rodriguez: Yeah, and I guess to be able to do that, they need to be able to give the model training such that it will not say “I am conscious” when it’s not, but be able to say it when it is. And like how the heck do you do that? That seems like an incredibly difficult problem that we might not even be able to solve well if we’re trying — and it seems plausible to me that we’re not trying at all, though I actually don’t know that much about the policies internally on this issue.

Jeff Sebo: I think you would also maybe need a different paradigm for communication generation, because right now large language models are generating communication based on a prediction of what word makes sense next. So for that reason, we might not be able to trust them as even aspiring to capture reality in the same way that we might trust each other aspiring to capture reality as a default.

And I think this is where critics of AI consciousness and sentience and personhood have a point: that there are going to be a lot of false positives when they are simply predicting words as opposed to expressing points of view. And why, if we are looking for evidence of consciousness or sentience or personhood in these models, we might need to look at evidence other than their own utterances about that topic. We might need to look at evidence regarding how they function, and what types of systems they have internally, in terms of self-awareness or global workspace and so on. We need to look at a wider range of data in order to reduce the risk that we are mistakenly responding to utterances that are not in any way reflecting reality.

Luisa Rodriguez: Right. Which brings us back to the challenge we have with nonhuman animals, where we are just having to do really difficult research to understand what their experiences might be like.

Jeff Sebo: Yeah. Rob Long — who, as we discussed before, authored the paper about 2030 with me — made this point really well, that this happens with nonhuman animals. Parrots, for example, might say, “I am conscious; I am conscious” — and we might not take that very seriously, because we know they might be repeating words without understanding what the words mean. Yet the fact that they might be doing that without understanding what the words mean is not evidence that they are not conscious, because we have independent grounds for attributing consciousness to them.

I think that could end up being the case with large language models or other AI systems. They might generate utterances about being conscious. We might have reason to be sceptical that those utterances are in any way expressing a point of view, but we might once again have independent grounds for attributing consciousness to these beings, at a certain point.

Luisa Rodriguez: Yeah, that just all seems right to me. So the lingering voice is just like, then this really might happen: we really might blindly walk into the enslavement of a type of being whose consciousness we don’t understand at all. And it’s just not a thing that I feel like anyone’s talking about. I feel like we’re talking about how AI might positively, but also might very negatively, impact humanity — and I hear very, very, very little about the potential harm we might end up causing to these systems, at huge scale.

Jeff Sebo: Yeah, I agree with that. I think it is worth questioning whether the word “slavery” is the right word to use in this context. Again, we can learn some lessons from the history of animal ethics and animal advocacy and policy. There are a lot of people who have made comparisons between human slavery and our various uses of nonhuman animals. On one hand, to some extent those comparisons are apt, because to some extent these forms of use really are similar. But to some extent, of course, these comparisons are not apt, because there are many relevant differences between the ways in which humans have been and are enslaved, and the ways in which nonhumans have been and are used for various purposes. And I err on the side of not using the word unless I specifically spend some time discussing the similarities and differences and explaining my reasons for using the word.

But what is undeniable is that, at least at a certain high level, there definitely are going to be parallels in the sense of constructing ingroup and outgroup categories based on perceived cognitive and/or physical differences, and then when in a position of power, exerting that power over members of the outgroup category by using them in ways that are beneficial for you and harmful for them. That is a similarity that sadly has applied in interactions with outgroup humans and nonhuman animals, and now potentially in the future AI systems. And as we both said before, I do think that unless something significant changes, that will be the default — because we currently do, as with nonhuman animals, see them as objects, property, and commodities.

Luisa Rodriguez: So, point taken. There are some similarities and potentially some differences, but there are at least enough similarities that there’s reason for worry. And the default looks grim: it looks like exploitation, at least somewhat similar to other things that we’ve done that have been very morally reprehensible.

Discourse and messaging [01:27:17]

Luisa Rodriguez: Imagining how this is going to play out concretely in the world, I’m curious how much you’ve been following the public discourse about AI sentience. I know there was some conversation around the time of LaMDA — which claimed to be conscious, but there wasn’t much good reason to think it was. But where is public opinion at the moment?

Jeff Sebo: I’m not sure I have a better sense of that than you do. I see a lot of confident assertions about them being either not conscious or conscious. That might be a result of there being a really polarised set of attitudes out there in society, or it might be a result of how social media algorithms feed me the most extreme and the most provocative reactions that people have.

What I have not seen as much is the middle-ground view that I think we ought to move towards as a society — where instead of simply asking, “Are they conscious or not? Yes or no?,” we first of all distinguish degrees of confidence about them being conscious, and second of all distinguish different kinds of AI systems about which or about whom we might be asking that question. And then offering thoughtful, careful, specific answers about those different probabilities for those different AI systems. I am, not surprisingly, not seeing that position carved out in our social conversations about that, and I hope we can move in that direction.

Luisa Rodriguez: Yeah, that makes sense. It actually makes me interested in whether you have takes on the kinds of messages that might be useful to have in the public discourse, versus the kinds that might be counterproductive. You mentioned “slavery” is one that can be counterproductive in some cases in the nonhuman animal space.

From your work on this topic and on nonhuman animals in the past, what are the kinds of messages that really work? That kind of move people in the right direction and aren’t super off-putting?

Jeff Sebo: I think that is a great question, and it is really helpful to think about lessons learned from animal ethics in this context. The first thing to say is there is no universal answer to that question, as far as I can tell. It really depends on the specific details of your situation: Who are you as a speaker? Who is the target audience? What is the message? What is the context? What is your role in this broader division of labour?

Because I think it would be a mistake for everybody to be making the same messages in the same ways all the time. Better to have a broad, pluralistic coalition where some people are making moderate messages in a relatable way, and other people are making more radical messages in, yes, a potentially alienating way, but then they can kind of virtuously work together. People who are making radical messages might alienate people initially, but they also shift the centre of debate, and they make the moderate messages look more reasonable in comparison. And then the people making moderate messages can then persuade people to accept incremental changes — and that shifts the goalposts, and brings us a little bit closer to possible radical goals that we might be interested in pursuing.

So I actually would not necessarily want everybody to avoid being off-putting. I think there is a place for being off-putting as part of this division of labour, but it should happen for specific strategic reasons, and it should happen in moderation.

And I will make one final point about this, to not ask these questions in universal terms — “Are AI systems conscious?” — but rather to ask about particular systems with particular features. Because, as with nonhuman animals, but even more so, there are going to be a lot of differences among AI systems.

Luisa Rodriguez: Right. Like whole new species of AI systems.

Jeff Sebo: Exactly.

Luisa Rodriguez: And I’ve been making that mistake a lot during this conversation, so that’s a helpful flag, clearly.

What will AI systems want and need? [01:31:17]

Luisa Rodriguez: Pushing onto another topic, looking even further out, if we assume that progress in AI systems will continue, and that there are no fundamental reasons artificial systems can’t be sentient — for example, we need a biological substrate to be conscious — I guess we can imagine that it’s very possible that we’ll end up with AI systems that are pretty clearly sentient, either because consciousness arose naturally in what we built, or because we deliberately built sentient systems.

It might be clear that they can feel pleasure and pain in ways that mean that they clearly deserve moral consideration, but there might also be a bunch of differences between humans and artificial minds that might necessitate entirely new moral concepts. What might AI systems want and need, and how might that affect our moral concepts as we have them today?

Jeff Sebo: This is a great question, because as both of us were saying a moment ago, the tree of life is already huge and diverse. It includes all of the different animals as well as other living beings — and their minds and wants and needs, if they have wants and needs, are already incredibly diverse, and in some cases, very different from our own. And if and when AI systems and other non-carbon-based beings might have wants and needs, that tree of beings might be even larger, and even more diverse, and we might have even less of a window into what it is they actually want and need.

So that already creates a lot of uncertainty. Even if we use familiar concepts — like promoting welfare, or respecting rights, or cultivating virtuous attitudes, or cultivating caring relationships — even if we have those ordinary moral aspirations, we might have a really hard time knowing how to discharge those responsibilities, given how much less we know about their minds than we already know about, for example, other animals’.

Copies of digital minds [01:33:20]

Luisa Rodriguez: I want to get even more concrete on some specific differences that an AI system might have relative to humans that might totally flip a lot of our norms and kind of moral concepts on their head.

One example I find really fascinating of a potential difference between a biological mind and an artificial mind is that artificial minds might be able to be copied in the same way that we can copy anything digital. There are a bunch of questions I’d like to ask you about this, but to start, what’s one big moral implication that comes to mind for you of something like copying digital minds?

Jeff Sebo: Copying digital minds is such an ethical minefield, because first of all, it would make it a lot easier to create a very large number of potentially morally significant beings without much effort, without much expense. And that not only gives everybody a lot of power over potentially large, vulnerable populations, but it also makes for example, the size and demographics of our global population really difficult to predict and control, really difficult to understand, and really difficult to reflect through moral reasoning or legal or political institutions. So that prospect takes a lot of what is already really complicated about managing a large and diverse human and nonhuman animal population, and it makes it a million times more complicated.

It also raises ethical questions about the moral status of the copies, of course. I find these questions a little bit less pressing than others might, because it seems obvious to me that if the original is sentient or otherwise significant, then each copy is equally sentient or otherwise significant. But there are certain other types of values that we might associate with life, where we can ask if those values are equally present. For example, some people attach value to rarity or a certain kind of diversity. And if we can copy the same individual over and over again, then there might be diminishing value in those respects, even if there might continue to be equal intrinsic significance from a moral perspective.

Luisa Rodriguez: It does feel pretty obvious to me that a copy of a digital mind that has capacity for suffering, and has thoughts and wants and desires, that if you made a copy that had all of the exact same things, and they were kind of separate in some way — that, to me, would look like two different people, that they would both matter morally.

For some reason there’s a funny thing that happens when you make like 100 million of those copies — or let’s actually make it like 10 billion — and then I’m like, Wow. Those people that are super non-diverse, they all have kind of the same inner workings, even if they haven’t all had literally the exact same experiences; they’re copies of one original thing. And that starts to feel weird to me. So I’m curious if, is it definitely the case that all copies should just get kind of equal moral weight to the original? Are there people who disagree on this?

Jeff Sebo: I am not sure if people disagree. It does seem clear to me that a copy of a morally significant individual is also morally significant to the same degree. If somebody has twins, those individual humans are not less important simply by virtue of being twins. They should not get less consideration when people are making decisions about whether to cause them suffering; they should get the same amount of consideration that they would have gotten if they were an only child. And I think that the same would be true of digital copies, if we had reason to believe that the original and the copies do have the capacity for consciousness and sentience and agency and so on. I think that no matter how many copies we make, each one of them would have as much significance as they would have if they were the only one.

It creates other kinds of questions and challenges if millions of them are being made, in terms of is there diminished value in other respects, like aesthetic value or instrumental value of other kinds? Are there risks that it will in some sense change social demographics in a way that affects legal and political institutions? Is there a sense in which it will produce social stagnation, because less variation means less social change over time? So there are all kinds of further questions and challenges that are raised. But when it comes to the intrinsic moral significance of the individual copies, I at least feel confident that that would still be present no matter how many there are.

Luisa Rodriguez: Yeah, I think I do buy that. And maybe what is happening is I’m actually jumping to those other questions — where if there are 8 billion copies of a particular digital mind, in a way that makes all of that demographic super homogeneous. And because they’re people with preferences and desires and wants and needs, let’s say I would have the intuition that they should all have the right to participate in democracy and have a say in how society is organised. But it’s strange to me that such a homogeneous group would all get — if it’s 8 billion of them — equal voting power to the rest of society. Is that weird or bad, or am I just being biased because it’s quite different to the way I currently think of the world?

Jeff Sebo: I think what it exposes is a tension between, on one hand, our current moral and legal and political systems, and on the other hand, the needs that we would have in a world where we properly represented the full range of stakeholders of our moral and legal and political decisions. We already see this with other animals. Part of why so many people are so resistant to extending legal and political standing to nonhuman animals is they recognise there are so many of them that if we recognise that they have legal and political standing, then we would need to very seriously transform our legal and political institutions to account for their significance, in a way that we have no idea how to do. And that would happen all the more with AI systems.

So I agree that it feels wrong in some way, but I think that is not coming from a problem with their intrinsic significance; that is instead coming from a lack of fit between recognising their intrinsic significance and upholding our current institutions.

Connected minds [01:40:26]

Luisa Rodriguez: Cool. So we’ll come on to political institutions and whether or not we’ll have to totally rethink them in this world in a bit. But just to get another concrete example of a difference on the table: Another difference between biological minds and artificial minds might be that artificial minds can potentially be connected. And I think we should start pretty basic, because I find this concept of connected minds hard to understand. So just to start, can you explain it for me in very simple terms? In what sense can digital minds potentially be connected?

Jeff Sebo: Well, imagine that you actually had the ability to communicate with someone via telepathy. Instead of having your own private world and being able to make your own private decisions about what to communicate and how to communicate it through language and gestures, you simply could transmit your thoughts or feelings directly to them, and they could do the same. Or imagine it went even deeper: that when you had experiences and formed memories, they then inherited your memories, and they had first-person memories of everything that you experienced.

That would fundamentally transform our ways of relating to each other. It would fundamentally transform how we think about our own identities. It would fundamentally transform what types of expectations we could have regarding privacy, regarding autonomy. And with AI systems, that might be much more the norm than it is for humans and other animals. Now, it does exist for humans and other animals, and we can talk about that, but it might be the default for AI systems.

Luisa Rodriguez: One case that really helped me understand it a bit better while reading was the kinds of overlap that are possible in conjoined twins. Can you explain what we know about twins, and to what extent twins with conjoined minds have something like connected minds?

Jeff Sebo: I’m not sure exactly how common this phenomenon is, but I do know that some conjoined twins with connected brains do report being able to experience each other’s mental states to an extent. For example, there are conjoined twins where if one of them eats a particular food like ketchup, then both of them have a taste experience associated with ketchup. And interestingly, they might have different preferences about that: one might like the taste of ketchup, and the other might dislike the taste of ketchup. And the same could be true to a certain extent with other kinds of sensations or experiences. They also report, to an extent, being able to directly control each other’s limbs.

Luisa Rodriguez: Cool, so that is a mind-blowing case. I was not familiar with it at all. I guess the thing that’s particularly crazy to me is that these are two people — they’re two people that are distinct individuals, with their own personalities and personal identities — and yet, for some subset of experiences, they can both share the experience, have their own versions of their experience. And also have an effect on the other, by doing something that I would think of as only having an impact on me, like having something with ketchup on it or not.

So already this is feeling very helpful for me, picturing the kinds of connections you might think could become more common with artificial minds. Is there anything else to highlight there?

Jeff Sebo: I think that there are a couple of things that we could highlight. One is that this phenomenon exists in nonhuman populations too, and is the norm in some nonhuman populations. For example, octopuses. This is a little bit of a simplistic way of describing it, but octopuses have, in a certain sense, nine brains: one central brain, and then a smaller peripheral brain for each arm. And they exhibit some integrated behaviour, but then also some fragmented behaviour — in a way that suggests that these cognitive clusters are to some extent working together, but then to some extent acting autonomously.

So obviously it can be hard to imagine what it might be like to be an octopus, but you might imagine that it feels a little bit like nine cognitive systems with to some extent their own identity, and to some extent different identities.

Luisa Rodriguez: Right. Correct me if this just has no bearing on what you’re saying, but I know that there are some theories of the mind that kind of break it down into parts. I certainly feel like I have parts, and I’ve even used that framework in therapy to understand different parts of me that want different things and are kind of competing to help me make decisions. So for example, there’s a part of me deep down in there that’s afraid of lots of stuff, including heights. And then there’s another part of me that’s like novelty- and fun-seeking, and sometimes wants to go up high. And these are different things with different motivations, and they’re kind of in conversation with each other in my mind, but it’s all the same mind.

Jeff Sebo: Yeah, I love that you brought up that example, because I have been thinking about that too. And actually, this is a great way for me to connect some of my current work with my past work.

When I wrote my dissertation in philosophy about 10, 15 years ago, that was the phenomenon that I wrote about: the fact that for many of us, we have different selves or sides or personalities in what we might describe as a non-pathological sense. I have a work self, I have a family self, I have a friend self — and these have different but overlapping sets of beliefs, and desires, and intentions, and so on. And that might be difficult to notice, because they have so much in common, but there are enough subtle differences — enough little psychological shifts from context to context and social performance to social performance — that we might sometimes disagree with ourselves across time or across contexts about what to do and how to live.

And yet our minds are intimately connected; if we think of them as different minds, they are intimately connected. So the decisions that we make directly affect our other selves or sides or personalities in a much more immediate way than they might affect even our family members or other humans who are close to us. So when we start thinking about that kind of case, we might realise that this phenomenon actually is everywhere.

Luisa Rodriguez: Right, it’s very prevalent. The sense I’m getting is something like maybe we should think of it almost as a spectrum: there’s a spectrum in the sense that there are systems that might share some types of information, but also might be different in some ways. So with the parts of my brain, they share a lot — they share my history, they’re next to each other. They perform different functions and they have different kind of motivations, but they overall feel like one person: I feel like a unique individual; I don’t feel like 20 different parts, some that are fearful and others that have complex reasoning and are ambitious or something. And then maybe, at some point, it transitions from feeling like one individual person to feeling like two or more individual people.

It feels like that’s what we’re talking about here: that it is this complicated spectrum from systems that work together and feel like one person, but maybe at some point systems that share a lot but feel like two or more people. And that we need to figure out what to do with the systems that feel like two people but share much more than we’re accustomed to, at least in our own experiences as humans. Is that in the right direction?

Jeff Sebo: Yes, I think that is right. It has always been the case that there is a spectrum, because there is fragmentation within individual organisms and then there is commonality across organisms, right? My partner and I might often talk in the first person plural: “What do we want to do tonight? What are we doing with our weekend?” And then I might sometimes talk about my own life in plural terms: “Part of me thinks this, and part of me thinks that.”

So it always has been a spectrum. But what is noteworthy here is that the increases in AI technology are going to make it the case that this spectrum is much more robust all along the way: there are many more types of minds all along the spectrum, rather than some variations in one category and some variations in another category, but still a stark difference between the two for the most part.

Luisa Rodriguez: Yeah, cool. I feel like that just helped me really understand this concept. Maybe the way I’ll think about it from here on out is that currently I think I’ve got these different parts in my brain, but going forward, potentially sentient AI systems might end up having something like multiple parts that are actually distinct — or different, distinct AI systems might have access to the same information — and all of that is going to make personal identity, the way they experience things, the way they can function, all sorts of things different.

Psychological connectedness and continuity [01:49:58]

Luisa Rodriguez: So given that it is this weird spectrum, at what points are connected minds a single mind versus several? Is it something about there just being two beings that perceive themselves as distinct? Somehow there are two consciousnesses? How does that even work?

Jeff Sebo: That is a great question, and is similar to questions that we can ask about copies — and actually that philosophers have been asking about both connected minds and about copies for decades, because they can illuminate interesting questions about ethics and personal identity.

So Derek Parfit, in part three of Reasons and Persons, offered a memorable thought experiment about teletransportation. Imagine that you want to travel from one planet to another planet instantly. You can step into a device, press a button, and then all of a sudden you appear on this new planet, at least from your perspective. But what actually happens is your body is destroyed while scanned, and then another body is created as a molecule-for-molecule copy. So from your subjective perspective, you continue, and you travel instantly from one place to another: very convenient. But objectively, one body is destroyed, another body is created.

So this raises a metaphysical question and an ethical question. The metaphysical question is: What happened? Is that you? Did you travel or did you accidentally kill yourself and replace yourself with an imposter? And then the ethical question is: Who cares? Does it matter whether that is technically you or not?

And Parfit said that sometimes it might be you, sometimes it might not be you; it depends on further details. But ethically, what matters is whether and to what extent there is psychological connectedness and continuity with these other minds, these other selves, these other people. If this replicant, whether they are you or not, if they have your beliefs, your desires, your intentions, your memories, your anticipations, your projects, your relationships — if they can carry on for you in that way, then that is about as good as ordinary survival, and you have about the same kind of moral relationship with them that you would have had with your ordinary future self.

If we take that view seriously, then we can apply it to these questions regarding copies and connected minds with AI systems: Is there psychological connectedness and continuity? And if so, to what extent does it make sense to at least treat them as continuous, even if they might be technically different minds or different individuals?

Luisa Rodriguez: Yeah. Can you actually make that connection even more direct between that thought experiment and what we should take from it to learn about how we should think about digital copies, and how we should think about connected artificial minds?

Jeff Sebo: Yeah. I think that thought experiment — and that distinction that Parfit makes between personal identity and what matters in survival and morality — part of why that is helpful is because, with connected minds, there might be cases where they are so connected, where they are so overlapping, that they really do count as the same person or as parts of the same person. But — and this is my own view and not me speaking for all philosophers — where they are still different enough that they still have some moral obligations to each other.

For example, I think that even about our own selves or sides or personalities: even though my work self and my family self and my other selves overlap so much, even though they have so much more in common than not, the mere fact that they disagree about what to do or how to live sometimes is enough, I think, for them to have responsibilities to each other. So when I am deciding what to do, and I am thinking that I disagree with myself in these other moments, I should not simply impose my will on my other selves. I should ask, “What represents a fair compromise between us? What is a way of living and letting live that can be good for all of my selves?”

And so if that could be true even for our own selves and sides in these non-pathological cases of deep psychological connection, then I think it will especially be true with these minds that are overlapping but more clearly distinct than our personalities are. So I do think that they might sometimes be the same person, might sometimes be different people — but to whatever extent they disagree, to whatever extent they have different goals in life, I think that they are going to have to have some kind of moral responsibility to each other.

Luisa Rodriguez: Got it. I see. So the original question was: To what extent are they different people, different beings and distinct? And I think what you’re saying is sometimes they will be the same — there will be one single consciousness — and sometimes there’ll be another. But what matters is that there are, to some extent, things that have kind of different goals, even if they don’t have some multiple emergent consciousnesses.

And the fact that they have different goals but are linked to each other very deeply — for example, because they share all the same memories, or because if you delete some experiences in one AI system, you also delete them in another — their fates are so deeply interconnected that that being, or those beings, whether or not they are one or two, have moral obligations to each other in some sense. I guess in the same way that I might think I have a moral obligation to take seriously the part of me that is very afraid of heights when doing things like skydiving.

Jeff Sebo: Yes. I love the way you put it when you said that what matters is that they have these really intimately connected minds and lives. Because I do think that that matters a lot. Think about the difference, for example, between what you owe your family members or your roommates, versus what you owe strangers or people who live across town. These are all different persons with their own rights, but your duties to them are really different because your relationships with them are really different. With your family members or with your roommates, you constantly have to be taking them into account when deciding whether to turn lights on, whether to turn the music up, because you know that it might affect them.

Now imagine that whenever you did anything, they remembered that first-personally. So imagine that one of your roommates hates horror movies. You might, at present, make sure that they are out of the apartment or in a different room when you watch a horror movie, or you might put headphones on. But now imagine that they inherit all of your experiences. Now you have to ask whether you can even watch horror movies at all, given this foreseeable consequence that it will expose them to this traumatising set of images and sounds.

So in the same sense that our relationships with our family members are much more intimate than our relationships with strangers, and so our moral relationships have to reflect that, that will be true all the more with our relationships with other minds that are connected with ours.

Luisa Rodriguez: Yeah, OK. I actually love that example. We have moral obligations to people who are affected by our actions. And basically what we’re saying is the degree to which individuals might be deeply affected by other individuals might be about to potentially explode with this possibility that AI systems can have much, much more connection. And we’re going to need a whole new set of moral concepts to deal with that, in the same way that we would need a bunch of moral concepts to deal with a world where everyone was a conjoined twin and everyone had to figure out what to do if they had a conjoined twin. They liked ketchup and wanted to eat ketchup, but their conjoined twin hated ketchup and never wanted to eat ketchup.

So I think I’m fully grasping here the need for the new moral concepts in this case. It’s basically just that there are going to be beings that impact other beings so intensely, much more intensely than is possible now, that we’re going to need to think about how to protect and govern and regulate the things that you can do in these cases.

Jeff Sebo: Yes, I think that is exactly right.

Assigning responsibility to connected minds [01:58:41]

Luisa Rodriguez: So this is bringing to mind other potential moral questions that you have for these cases of connected minds. We’ve kind of covered what does one being that can heavily influence another being through this connectedness, what obligations do they have? What kind of moral rules should they have for themselves, and what should we place on them?

But there are other questions we have to ask. Like if one mind acts, is the other also responsible? So if you imagine conjoined twins, and they have a sibling, if one conjoined twin hits their sibling, is the other twin responsible? How do you assign responsibility when there’s this real blur of individuality?

Jeff Sebo: Yeah, that is a really interesting question. I think these kinds of cases are going to force us to distinguish different senses of responsibility or accountability or liability that we sometimes conflate together because they sometimes come together.

So we might to some extent hold someone responsible for an action because we think that they are blameworthy for that action: they knowingly and willingly performed that action. We might also hold somebody accountable indirectly, if we think that they facilitated an action or in some other way were indirectly responsible for or complicit in an action. Or we might hold somebody accountable in the sense of holding them liable. Like if you have a child and they throw a baseball through a window, I might rightly ask you to pay for it, even though I know you are not to blame for the window being broken. Or when a president represents a nation, we might rightly expect that president to apologise on behalf of the nation for actions of the predecessors, even though this president is not the one who made those decisions.

There are all of these situations where we might rightly hold someone accountable or liable in some sense, even if we are not thinking that they are blameworthy for the action. I think that kind of analysis might be important in these cases, because there might be one conjoined twin who performs a harmful action, and the other one might not be blameworthy for it, but maybe they were negligent because they could have predicted it and they failed to stop it. Or maybe we need to hold them liable just in the sense that, practically speaking, if we have to punish one, that also means punishing the other, and we need to protect the community and so on and so forth.

So I think that we will have to get better at having a richer vocabulary for responsibility that can, in a more fine-grained way, make it clear in what sense we are holding someone responsible, and then we need to make sure we honour that with our norms of responsibility. Like if someone is liable but not blameworthy, yes, we should expect them to apologise, but should not have invective being thrown at them when doing that. So I think that will be the future of responsibility. And again, this was the case all along. It was harder to notice it though, when they were coming together as much as they were with our minds.

Luisa Rodriguez: Cool. That’s all super interesting. And it’s clear from what you’re saying that there are lots of related cases — cases where relationships are interconnected in such a way that blame and responsibility is already a bit complicated, and punishment is already a bit complicated. So maybe it’s a nice thing that we don’t need to invent entirely new moral concepts, but we’re going to need to figure out how to apply them in cases that are pretty different to what we’ve got now. And that’s going to be a fascinating and important challenge.

Counting the wellbeing of connected minds [02:02:36]

Luisa Rodriguez: Another moral question that comes to mind is more just… I guess I’m utilitarian-sympathetic, so it has to do more for me with how we count up wellbeing — where I care more about having many people be happy than a smaller number of people being happy. I guess I’m curious how you count the wellbeing of minds that are connected?

Jeff Sebo: Yeah, that is a great question, and I think from a utilitarian perspective, one of the most important questions raised by the prospect of connected minds.

So picture a situation where there are two subjects of experience who do have connected minds, in the sense that they can access some of the same mental states. And suppose that some of those mental states are positively or negatively valenced experiential states, like pleasure or pain. So you have a particular pleasure or pain state that two minds can access.

Now, the question is, if I am adding up all the pleasures and pains in the world to decide what to do, do I count that as one, because there is one pleasure state being accessed by two beings? Or do I count it at two, because this is one pleasure state being accessed by two beings? And so much could depend on the answer to that question in terms of what actions or policies are best in the aggregate.

Luisa Rodriguez: Right. That does feel actually super important to me. My intuition is that if two consciousnesses have access to the same pleasure state, that is double the happiness relative to just one. What’s your intuition?

Jeff Sebo: I think my intuitions vary depending on how I describe the case in more detail. If I am describing the case as two minds that are almost fully overlapping — like they share 99% of the same mental states with only minor variations — and then I imagine a single pain state occurs, then I have the intuition that there is one pain experience here. But if I imagine two minds that I perceive as distinct, and they both, to a significant extent, have their own beliefs and desires and intentions, and then a pain state exists that they both can access and it reverberates in both their psychologies, then I have the intuition that there are two pain experiences and it should be double counted.

So for me, what that suggests is that a lot depends on those details. And in particular, we can keep in mind that things like pleasure and pain are sort of holistic experiences. So if there’s a single experiential state that then is experienced by these two quite distinct minds — and it interacts with their distinct mental states in a way that causes further types of emotional suffering and so on that are distinct — then yes, maybe there is still only that one pain experience, but it then triggered negative further experiences in these separate minds. And maybe that explains the intuition that we should count two bad experiences here.

Luisa Rodriguez: Yeah, it feels like it could get just extremely complicated. I’m picturing two minds. I’m envisioning them as kind of a Venn diagram. And let’s say the overlap between the Venn diagram is very, very big, such that they share a lot — they share a lot of memories, they share a lot of the same kinds of systems and information processing. And yeah, there’s a negative experience that causes suffering, let’s say in the shared part of the Venn diagram somehow, so they kind of share that memory over time of that terrible experience, and in some sense they have access to that same memory bank with a bad memory and it’s unpleasant to remember it. And that’s bad, but that’s a single bad, because it’s shared in this sense.

Then it also seems possible, though, that there’s a bad thing that happens and they don’t share the same memory. They each have different memory spaces, and it’s in the part of the Venn diagram that doesn’t overlap. Maybe the experience of the thing itself was shared, but they each have these negative memories that carry on. And then do you count the suffering of each time you remember that negative thing twice, because they’re each having it kind of separately?

I imagine there could be loads of permutations of this, in a way that just seems like it could get near impossible to count up good and bad experiences in the way that I do, at least in a simplistic way, in the world today.

Jeff Sebo: Yeah, I agree with you, first of all, about the analysis. I think that to the extent that it is literally a single, fully shared state, it probably should be single counted. But then to the extent that it reverberates into these separate psychologies and causes further pleasure or pain of various kinds, then those should be counted as separate states. So I agree about that analysis. Then I agree with your further point, that once we accept that, and once we recognise that many minds might be like that — not only now, but especially in the future — then it gets really hard to keep track of all of the positive and negative consequences of actions.

And here too, along with the interactions with ethics and personal identity, is another way that the questions about copies and the questions about connected minds are similar. In both cases, we might have a sense of how to understand it theoretically: yes, the copies all have moral standing, and yes, to the extent the minds are connected, they have the same experiences — and to the extent that they are not, they might have different experiences that should be counted separately.

But in practice, once this world exists — with parts of minds, and whole minds, and groups of minds that can all have moral standing, and then copies of all of those minds that can be made at hundreds or thousands or millions at a time, and then they exist at different levels of reality in the physical world, and then simulations and simulations of simulations — how are we going to be able to navigate our moral and especially legal and political interactions? How can our ways of thinking about how to relate to ourselves and each other — and then the institutions, these blunt instruments that we use in order to govern our interactions — how can they adapt to this reality with any kind of precision or reliability?

Luisa Rodriguez: What an absolute mindfuck. Is there anything else we should cover on connected minds?

Jeff Sebo: The only thing I will add is that my work on the ethics of connected minds in the case of AI is a collaboration with the philosopher Luke Roelofs, who has done a lot of work on the metaphysics of connected minds. And so I would suggest that everybody check out their work if they are interested in this topic.

Luisa Rodriguez: Great. We will link to that.

Legal personhood and political citizenship [02:09:49]

Luisa Rodriguez: Let’s actually talk more about a thing that you mentioned just a moment ago: how legal and political structures might have to change in order to deal with all of this craziness. Just a broad question to start: What kinds of legal and political status should AI systems have?

Jeff Sebo: Yeah, great question. And this too is a question that we already do ask about nonhuman animals, so we can ask some of the same questions about AI systems too, perhaps in slightly different form. We can take as examples what many regard as a fundamental kind of legal status — which is legal personhood — and what many regard as a fundamental kind of political status — which is political citizenship. So should AI systems be regarded as legal persons and political citizens?

Luisa Rodriguez: Yeah, let’s talk about legal personhood first. What are your intuitions about what is going to need to happen there?

Jeff Sebo: The conversation about legal personhood is difficult to have, because the word “personhood” trips a lot of people up. Many of us associate the word “person” with the word “human,” because we use these terms interchangeably in everyday life. So first of all, we should note that, in legal contexts, the word “person” has a different meaning: all it means is you are the type of being who can have legal duties or legal rights as appropriate, given your capacities and relationships. So as long as you can have at least one duty or at least one right, then it is appropriate to describe you as a legal person.

And so if we maintain this binary distinction between persons with duties or rights, and non-persons without duties or rights, then we should absolutely classify not only all potentially sentient nonhuman animals as legal persons, but also all potentially sentient AI systems as legal persons, because they will be the types of beings who can have at least one right.

Now, their rights might be different from ours, because they might have different interests and needs. They might also be stronger or weaker than ours, because they might have stronger or weaker interests and needs. So this is not to say that we should treat them under the law exactly as we should treat each other, but it is to say that we should regard them as subjects, not objects. We should regard them as rights holders, not not rights holders.

Luisa Rodriguez: Right. So I’m just trying to make that even more concrete for myself. If I just kind of make some things up off the top of my head, it’s something like if an AI system is sentient, then I would strongly believe that it had the right to not suffer, to not be made to suffer in the way that I think humans and nonhuman animals do. But then there might also be different ones, like I think I have the right to not be murdered. And you might think that an AI system has the right to not be turned off, so it has a right to power — literal electrical power and CPUs or something. Maybe not, but that’s the kind of different thing maybe you can imagine. Are there other kinds of rights and duties we should be thinking about?

Jeff Sebo: Yeah. The general way to think about personhood and associated rights and duties is that, first of all, at least in my view, our rights come from our sentience and our interests: we have rights as long as we have interests. And then our duties come from our rationality and our ability to perform actions that affect others and to assess our actions.

AI systems, we might imagine, could have the types of welfare interests that generate rights, as well as the type of rational and moral agency that generate duties. So they might have both. Now, which rights and duties do they have? In the case of rights, the standard universal rights might be something like, according to the US Constitution and the political philosophy that inspired it, the right to life and liberty and either property or the pursuit of happiness and so on.

Luisa Rodriguez: To bear arms.

Luisa Rodriguez: Yeah, interesting. So we’re going to have to, on a case-by-case basis, really evaluate the kinds of abilities, the kinds of experiences a system can have, the kinds of wants it has — and from there, be like, let’s say some AI systems are super social, and they want to be connected up to a bunch of other AI systems. So maybe they have a right to not be socially isolated and completely disconnected from other AI systems. That’s a totally random one. Who knows if that would ever happen. But we’ll have to do this kind of evaluation on a case-by-case basis, which sounds incredibly difficult.

Jeff Sebo: Right. And this connects also with some of the political rights that we associate with citizenship, so this might also be an opportunity to mention that. In addition to having rights as persons — and I carry my personhood rights with me everywhere I go: I can travel to other countries, and I ought to still be treated as a person with a basic right to not be harmed or killed unnecessarily — but I also have these political rights within my political community, and that includes a right to reside here, a right to return here if I leave, a right to have my interests represented by the political process, even a right to participate in the political process.

Once again, if AI systems not only have basic welfare interests that warrant basic personhood rights, but also reside in particular political communities and are stakeholders in those communities, then should they, in some sense or to some extent, have some of these further political rights too? And then what kinds of pressures would that put on our attempts to use them or control them in the way that we currently plan to do?

Luisa Rodriguez: So many questions we’ll have to answer are leaping to mind from this. Like, if an AI system is made in the US, is it a citizen of the US, with US-based AI rights? If they get copied and sent to China, is it a Chinese citizen with Chinese AI rights? Will there be political asylum for AI systems in countries that treat their AIs badly? It’s just striking me that it’s many fields of disciplines that will have to be created to deal with what will be an incredibly different world.

Jeff Sebo: Yeah, I agree. I think that it is an open question whether it will make sense to extend concepts like legal personhood and political citizenship to AI systems. I could see those extensions working — in the sense that I could see them having basic legal and political rights in the way that we currently understand those, with appropriate modification given their different interests and needs and so on.

But then when it comes to the kind of legal and political scaffolding that we use in order to enforce those rights, I have a really hard time imagining that working. So, democracy as an institution, courts as an institution: forget about AI systems; once nonhuman animals, once the quadrillions of insects who live within our borders are treated as having legal and political rights — which I also think ought to be the case — even that makes it difficult to understand how democracy would function, how the courts would function. But especially once we have physical realities, simulated realities, copies and copies, no sense of borders, in an era where the internet makes identity extend across geographical territories… At that point, if democracy can survive, or if courts can survive, we will have to, at the very least, realise them in very different ways than we do right now.

Luisa Rodriguez: Yeah. Can you talk about why those concepts might break down in a bit more detail? What’s your best guess at why democracy stops applying, or why it stops being useful in the way that it is now?

Jeff Sebo: So as I think we all know, having lived through recent elections in the United States and elsewhere, democracy is already struggling, even when we regard only our fellow humans as stakeholders in and participants in our democratic systems. Once we allow that nonhumans can also be stakeholders and participants in our democratic systems, then the problems that we already face are going to be amplified. And these include really basic problems, like who has the right to vote, and how can we ensure that they all have the ability to vote?

And once I can start making hundreds or thousands or millions of copies of myself right before an election, then how is that going to affect electoral procedures and electoral outcomes? But then how could that problem be avoided without disenfranchising lots and lots of individuals who, through no fault of their own, were created in this way — but they still exist and they still have interests and they still want to express their interests? Those are questions to which I cannot find the answers right now, and that concerns me a lot.

Luisa Rodriguez: Yeah, interesting. I guess spitballing even more: If we have AI systems, but also you’re bringing up insects, when you have these beings with different degrees of wants, different degrees of cognitive ability, different degrees of capacity for suffering, when I try to imagine a democracy that incorporates all of them, do they all get equal votes? How do they vote?

Jeff Sebo: Right. Yeah. One issue is exactly who is going to count as a participant versus counting as a stakeholder. Right now, all at least ordinary adult humans count as both participants and stakeholders. But once we have a much vaster number and wider range of minds, then we have to ask how many are we making decisions for, but then how many can also participate in making decisions?

With other animals, that is a live debate. Some think, yes, they should be stakeholders, we should consider them — but we have to consider them; we have to make decisions on their behalf. And others say, no, actually they have voices too. We need to listen to them more. And we actually should bring them in not only as stakeholders, but as participants, and then use the best science we have to interpret their communications and actually take what they have to say into account. So we have to ask that on the AI side too. Now, given that they might have forms of agency and language use that nonhuman animals lack, that might be a little bit less of an issue on the AI side.

But then the other issue that you mentioned is the moral weights issue, which corresponds to a legal and political weights issue. We take it for granted, rightly, that every human stakeholder counts as one and no more than one: that they carry equal weight, they have equal intrinsic value. But if we now share a legal and political community with a multispecies and multisubstrate population — where some beings are much more likely to matter than others, and some beings are likely to matter more than others — then how do we reflect that in, for example, how much weight everyone receives when legislatures make decisions, or when election officials count votes? How much weight should they receive?

Should we give beings less weight when they seem less likely to matter, or likely to matter less? And then will that create perverse hierarchies, where all of the humans are valuing humans more than AI systems, but then all the AI systems are valuing AI systems more than humans? But then if that seems bad, should we give everyone equal weight, even though some actually seem less likely to matter at all, or likely to matter less?

These are going to be really complicated questions too — not only at the level of theory, but also at the level of practice, when it comes to actually how to interact with fellow community members who are really different from you.

Luisa Rodriguez: Totally. And bringing back the connected minds bit: How many votes will minds get when they have access to some of the same experiences or some of the same information?

Jeff Sebo: Exactly. It really gets to what is the purpose of voting and counting, right? Is it that we want to collect as many diverse perspectives as possible so that we can find the truth? Or is it that we simply want to count up all of the preferences, because we think that that is what should decide the outcome?

Luisa Rodriguez: Right. The number of people with a view should have that much proportionate weight in how things are.

Jeff Sebo: Yeah. And if that is how we understand democracy, then it would not matter that you have a bunch of different minds all reasoning in the same exact way and arriving at the same outcome. It might be concerning, in the way that the tyranny of the majority can always be concerning, but it might still be, at least on our current understanding of democracy, what should decide the outcome.

Luisa Rodriguez: Oh, dear. Yeah, this is all just blowing my mind a bit.

Jeff Sebo: Same here.

Luisa Rodriguez: It sounds like we have a lot to figure out. Hopefully we have the time and the wisdom and the capacity to be able to think some of those things through. Which is a great segue to building the field of AI welfare.

Building the field of AI welfare [02:24:03]

Luisa Rodriguez: So you’ve just helped create NYU’s new Mind, Ethics, and Policy Program. Do you want to just start by saying what the vision for that programme is?

Jeff Sebo: Yeah. The NYU Mind, Ethics, and Policy Program launched in fall 2022, and is a research and outreach programme that examines the nature and intrinsic significance of nonhuman minds, with special focus on insects and AI systems. We want to understand what kinds of systems can be conscious or sentient or sapient, and what kinds of moral and legal and political status should they have if they are or might be? So, many of the questions that you and I have been discussing.

We are doing research to address those issues, having events about those issues, bringing together early-career researchers to support their work around these issues — and we would love to hear from people who are working on or interested in working on these issues.

Luisa Rodriguez: Cool. First of all, thank you for doing that. That seems really important, and as you’ve shown me, it seems like there are loads and loads of things to learn. What do you see as the major priorities for AI welfare research right now?

Jeff Sebo: Part of what makes this such an exciting field right now is that there are so many important and interesting questions to ask. But when I think about general priorities, I think about several different categories. The first category is: Which beings matter? Which beings are or might be sentient or otherwise significant? And then how much do particular beings matter? What kind of welfare capacity might they have, and what degree of moral significance might they have? In what ways do particular beings matter? What might they want and need, and what might we owe them in light of those wants and needs?

And then what follows for our actions and policies? Should they be legal persons? Should they be political citizens? How will democracy and liberalism and capitalism and these other basic institutions function in a world where they exist and are treated as stakeholders or participants?

Those are the general categories, but within each of those categories are a bunch of important questions that I hope people can be working on.

Luisa Rodriguez: Great. And I guess as part of that, you’re really building this new field, and you’re figuring out what you want the culture of that field to be, and the principles you want it to act and live by. You’ve actually written up a blog post outlining some of those principles. We can’t go through them all now, but I did want to talk about a few of the more important ones. Which do you think is the most important principle for this field going forward?

Jeff Sebo: I can quickly mention a few that might be obvious, but I think are really important for field-building purposes, which is that a field for AI welfare research should be pluralistic, multidisciplinary, and in a certain sense multi-issue.

What I mean by that is, first of all, it should be pluralistic. We should not anchor to or take for granted any particular theory of value or any particular theory of right action. We should not assume that welfare or moral standing is a function of pleasure and pain. We should allow for the possibility that it might result from other types of states too.

And it should be multidisciplinary in the sense that it involves work in the humanities and the social sciences and the natural sciences. We need work in the humanities to understand what welfare is, what we might owe very different kinds of beings. We need work in the social sciences to figure out how we relate to each other, how we can realistically improve those relations. And we need work in the natural sciences, of course, to understand cognition and behaviour for these minds.

And then multi-issue in the sense that I think it would be a mistake to have this conversation completely separately from conversations about animal ethics, conversations about what we owe each other, conversations about AI ethics and safety and alignment — because each is going to have implications for the others, and each has interactions with the others. So we ideally would be working on these topics together, in a way that takes them all seriously and that finds solutions that can work across contexts.

Luisa Rodriguez: Nice. And for people who are inspired by this and want to contribute, what kinds of backgrounds might be most helpful?

Jeff Sebo: Well, the nice thing about this field, as we are thinking about it, is that it is pluralistic and multidisciplinary and multi-issue enough that we should be working with people with a lot of different backgrounds and interests. So we should have philosophers and sociologists and biologists and cognitive scientists and computer scientists and policymakers. We should have all of those people not only working on the topic, but having conversations with each other. So I think there are a lot of different types of contributions that can be made. And if people have interest in the field, then I would suggest erring on the side of being in touch, or erring on the side of thinking about how you might be able to contribute.

Luisa Rodriguez: Cool. We’ll link to a bunch of different articles and blog posts that you can read if you’re interested in learning more about this topic, and interested in learning more about how you might be able to contribute to it with your career.

What we can learn from improv comedy [02:29:29]

Luisa Rodriguez: But we are out of time, so I should ask a final question. I’m going to ask my favourite final question, which is: If you had to just completely change careers and somehow became totally indifferent to making the world a better place — which I think would be hard for you — what would be the most self-indulgent or most enjoyable career for you to pursue instead?

Jeff Sebo: I love that question. I do love my career, and feel very grateful to be able to do what I do. And I will say that when I was graduating from college, I was considering this as one of three possible careers. The other two were law and TV comedy. When I was in college, I had an internship in TV comedy, and I was really close to pursuing that professionally instead of going to grad school. And then even when I went to grad school — I went to NYU, where I now work — I took the opportunity to study improv comedy at the Upright Citizens Brigade theatre, and then I performed improv comedy in the city for several years while I was a grad student.

Luisa Rodriguez: That’s amazing.

Jeff Sebo: And yeah, I loved it. I loved every part of it, honestly. And if I was pursuing a career solely for the joy that I receive when engaged in an activity, I think the creative expression that you have with a group of people when making up a show on the fly, or when doing comedy or doing art of any kind, I think that that would probably compel me to pursue some kind of career in TV or film or theatre. So again, glad that I chose this career, but that might have been the alternative career had things gone a little bit differently.

Luisa Rodriguez: I love that. In another universe, hopefully I get to watch you do improv. A bit of a rogue question: Is there anything that improv comedy can teach us about doing good in the world?

Jeff Sebo: Honestly, I think there are a lot of things that it can teach us about doing good. I was not taking those classes and performing improv specifically to learn life lessons that I could apply to my work and to my attempts at altruism, but I did learn some of those lessons. I can say one general one and then maybe a couple of specific ones.

So one general lesson is that philosophy and comedy are actually a lot alike. They both force us to confront cognitive dissonance, contradictions in ideas. They might do it a little bit differently. For example, a philosopher might say, “Consider this thought. Now consider this thought. See how they contradict. How will we resolve this contradiction?” Whereas a comedian might present the contradiction in an amusing manner, in a way that invites you to sit with that discomfort, and to, yes, find it amusing — but then to also reflect on it.

And good comedy can be a vehicle for social change in the same way that good philosophy can be for that reason. And sometimes it can even be a better vehicle for social change, because it operates in this more playful space that helps you to let your guard down and be a little bit more adventurous, a little bit more open minded, a little bit more receptive to novel ideas or resolutions to contradictions. So I think that can be a reminder that, even though we engage in really serious topics, it helps to have a little bit of a playful mindset sometimes so that you can have that same kind of open mindedness.

Luisa Rodriguez: Totally, yeah. You said there are a few specific ones. What were those?

Jeff Sebo: Yeah, there are also a bunch that come from improv. I can maybe mention a couple in the interest of time. One is that, of course, a foundational principle for improv comedy is “Yes, and…” So you might enter a show or a scene with a strong idea about how you want the show or the scene to go, but then a scene partner might initiate a completely different idea, and you have to be prepared to set your idea down and enthusiastically embrace this new idea and then add to it and build this new scene with your partners.

So when you practice at improv comedy and you perform improv comedy, you really have to train yourself to not impose your will on what is happening, but rather be collaborative and open minded, to work together with others as a team, to build something together, and then to be opportunistic and to be improvisational, to keep adapting to updated circumstances. And obviously, I think that is a virtue and a mindset that translates really well to, for instance, building a career or attempting to do good with your life.

Luisa Rodriguez: Cool. Any others?

Jeff Sebo: Yeah, another is that it trains you to identify links between things that initially seem to have nothing to do with each other. So the way that you build an improv show is you start with scenes that have nothing to do with each other, and then you do second beats of those scenes. And then as the show comes to a close, you start to tie the scenes together, and then it all weaves together and it ends in this gratifying, unified state.

And similarly, I think it can help in life or in a career to have all of these different interests, all of these different aspirations, and not necessarily impose unity or integration on them right away. To let them be what they are, and then you can gradually identify surprising and interesting connections between them over time. And if you eventually develop an area of expertise, it might exist at the intersection of all of these different random interests that you had. So allowing yourself that space for curiosity, the pursuit of seemingly unrelated things, and then discovering those connections and bringing them all together in the middle of or later on in your career, I think is a really wonderful experience, and one that was reinforced for me when I studied improv.

Another one is that if you want to construct a good improv scene, then in addition to listening to your teammates, and saying “Yes, and…” and building a scene collaboratively, it helps to put the direct goal of the scene in the show out of your mind. The direct goal would be finding the “game” of the scene — so finding a joke that can be built out over the course of the scene and can be repeated in later scenes. But if you are thinking about that goal every time you say a line, every time you make a choice — if you are thinking “I must be funny, I must be funny; I must find the game, I must find the game” — that is a recipe for not being funny and not finding the game.

But if you instead trust the process and trust your practice — if you trust all of those hours of work that you put in to doing improv with your teammates, and you simply exist within the scene, you play your characters, you pay attention to the situation — you eventually will find something genuinely organically funny, and then it will actually be funny because you were pursuing the goal indirectly rather than pursuing it directly.

And I think that has clear implications for how we live our lives and how we try to do good works too. For effective altruists or utilitarians, if we spend all of our time thinking, “How can I do the most good possible? How can I do the most good possible? How can I maximise utility? How can I maximise utility?” that is a recipe for not doing the most good possible, for not maximising utility. You would never get out of bed in the morning, because you would be calculating the long-term consequences of which sock you put on your feet first.

Instead, if you cultivate virtuous character traits, if you build structures that incentivise and pull good actions out of you, if you find general social and professional roles in life through which you can do the most good — and then if you spend most of your time in everyday life simply playing those roles within those structures, expressing those character traits — then you will do much more good, much more effectively and sustainably in the long run, than you would have done otherwise.

Luisa Rodriguez: Nice. I love that. Let’s wrap it up there. My guest today has been Jeff Sebo. Thank you so much for coming on the podcast, Jeff. It’s been really a pleasure.

Jeff Sebo: Thanks so much for having me. This has been a great conversation.

Luisa’s outro [02:37:19]

Luisa Rodriguez: If you enjoyed this episode, I highly recommend our interview with Robert Long on why large language models like GPT (probably) aren’t conscious.

All right, The 80,000 Hours Podcast is produced and edited by Keiran Harris.

The audio engineering team is led by Ben Cordell, with mastering and technical editing by Dominic Armstrong and Milo McGuire.

Additional content editing by myself and Katy Moore, who also puts together full transcripts and an extensive collection of links to learn more — those are available on our site.

Thanks for joining, talk to you again soon.

Learn more

Moral status of digital minds

AI governance and policy

‘S-risks’

The 80,000 Hours Podcast on Artificial Intelligence and related topics

Related episodes

March 14, 2023

#146 – Robert Long on why large language models like GPT (probably) aren't conscious

Listen now

May 12, 2023

#151 – Ajeya Cotra on accidentally teaching AI models to deceive us

Listen now

July 10, 2023

#156 – Markus Anderljung on how to regulate cutting-edge AI models

Listen now

May 5, 2023

#150 – Tom Davidson on how quickly AI could transform the world

Listen now

December 13, 2022

#141 – Richard Ngo on large language models, OpenAI, and striving to make the future go well

Listen now

December 16, 2019

#67 – David Chalmers on the nature and ethics of consciousness

Listen now

September 30, 2022

#138 – Sharon Hewitt Rawlette on why pleasure and pain are the only things that intrinsically matter

Listen now

January 19, 2018

#17 – Will MacAskill on moral uncertainty, utilitarianism & how to avoid being a moral monster

Listen now

About the show

The 80,000 Hours Podcast features unusually in-depth conversations about the world's most pressing problems and how you can use your career to solve them. We invite guests pursuing a wide range of career paths — from academics and activists to entrepreneurs and policymakers — to analyse the case for and against working on different issues and which approaches are best for solving them.

Get in touch with feedback or guest suggestions by emailing [email protected].

What should I listen to first?

We've carefully selected 10 episodes we think it could make sense to listen to first, on a separate podcast feed:

Check out 'Effective Altruism: An Introduction'

Subscribe here, or anywhere you get podcasts:

If you're new, see the podcast homepage for ideas on where to start, or browse our full episode archive.

On this page:

Highlights

When to extend moral consideration to AI systems

What are the odds AI will be sentient by 2030?

The Rebugnant Conclusion

Sleepwalking into causing massive amounts of harm to AI systems

Similarities and differences between the exploitation of nonhuman animals vs AI systems

Rights, duties, and personhood

What kinds of political representation should we give AI systems?

Articles, books, and other media discussed in the show

Transcript

Cold open [00:00:00]

Luisa’s intro [00:01:00]

The interview begins [00:02:45]

We should extend moral consideration to some AI systems by 2030 [00:06:41]

A one-in-1,000 threshold [00:15:23]

What does moral consideration mean? [00:24:36]

Hitting the threshold by 2030 [00:27:38]

Is the threshold too permissive? [00:38:24]

The Rebugnant Conclusion [00:41:00]

A world where AI experiences could matter more than human experiences [00:52:33]

Should we just accept this argument? [00:55:13]

Searching for positive-sum solutions [01:05:41]

Are we going to sleepwalk into causing massive amounts of harm to AI systems? [01:13:48]

Discourse and messaging [01:27:17]

What will AI systems want and need? [01:31:17]

Copies of digital minds [01:33:20]

Connected minds [01:40:26]

Psychological connectedness and continuity [01:49:58]

Assigning responsibility to connected minds [01:58:41]

Counting the wellbeing of connected minds [02:02:36]

Legal personhood and political citizenship [02:09:49]

Building the field of AI welfare [02:24:03]

What we can learn from improv comedy [02:29:29]

Luisa’s outro [02:37:19]

Learn more

Moral status of digital minds

AI governance and policy

‘S-risks’

The 80,000 Hours Podcast on Artificial Intelligence and related topics

Related episodes

About the show

What should I listen to first?