Gradual disempowerment

Summary
The proliferation of advanced AI systems may lead to the gradual disempowerment of humanity, even if efforts to prevent them from becoming power-seeking or scheming are successful. Humanity may be incentivised to hand over increasing amounts of control to AIs, giving them power over the economy, politics, culture, and more. Over time, humanity’s interests may be sidelined and our control over the future undermined, potentially constituting an existential catastrophe.
There’s disagreement over how serious a problem this is and how it relates to other concerns about AI alignment. It’s also unclear, if this is a genuine risk, what we could do about it. But we think it’s potentially very important, and more people should work on clarifying the issue and perhaps figuring out how to address it.
Profile depth
Exploratory
This is one of many profiles we've written to help people find the most pressing problems they can solve with their careers. Learn more about how we compare different problems and see how this problem compares to the others we've considered so far.
Why might gradual disempowerment be an especially pressing problem?
Advancing technology has historically benefited humanity. The inventions of fire, air conditioning, and antibiotics have all come with some downsides, but overall they’ve helped humans live healthier, happier, and more comfortable lives.
But this trend isn’t guaranteed to continue.
We’ve written about how the development of advanced AI technology poses existential risks. One prominent and particularly concerning threat model is that as AI systems get more powerful, they’ll develop interests that are not aligned with humanity. They may, unbeknownst to their creators, become power-seeking. They may intentionally deceive us about their intentions and use their superior intelligence and advanced planning capabilities to disempower humanity or drive us to extinction.
It’s possible, though, that the development of AI systems could lead to human disempowerment and extinction even if we succeed in preventing AI systems from becoming power-seeking and scheming against us.
In a recent paper, Jan Kulveit and his co-authors call this threat model gradual disempowerment. They argue for the following six claims:
- Large societal systems, such as economies and governments, tend to be roughly aligned to human interests.1
- This rough alignment of societal systems is maintained by multiple factors, including voting systems, consumer demand signals, and reliance on human labour and thinking.
- Societal systems that rely less on human labour and thinking — and rely more on increasingly advanced and powerful AI systems — will be less aligned with human interests.
- AI systems may indeed outcompete human labour for key roles in societal systems in part because they can more ruthlessly pursue the directions they’re given. And this may cause the systems to be even less aligned with human interests.
- If one societal system becomes misaligned with human interests, like a national economy, it may increase the chance that other systems become misaligned. Powerful economic actors have historically wielded influence over national governments, for example.
- Humans could gradually become disempowered, perhaps permanently, as AIs increasingly control societal systems and these systems become increasingly misaligned from human interests. In the extreme case, it could lead to human extinction.
Kulveit et al. discuss how AI systems could come to dominate the economy, national governments, and even culture in ways that act against humanity’s interests.
It may be hard to imagine how humans would let this happen, because in this scenario, the AI systems aren’t being actively deceptive. Instead, they follow human directions.
The trouble is that due to competitive pressures, we may find ourselves narrowly incentivised to hand over more and more control to the AI systems themselves. Some human actors — corporations, governments, or other institutions — will initially gain significant power through AI deployment, using these systems to advance their interests and missions.
Here’s how it might happen:
- First, economic and political leaders adopt AI systems that enhance their existing advantages. A financial firm deploys AI trading systems that outcompete human traders. Politicians use AI advisers to win elections and keep voters happy. These initial adopters don’t experience disempowerment — they experience success, which encourages their competitors to also adopt AI.
- As time moves on, humans have less control. Corporate boards might try to change direction against the advice of their AIs, only to find share prices plummeting because the AIs had a far better business strategy. Government officials may realise they don’t understand the AI systems running key services well enough to successfully change what those systems are doing.
- Only later, as AI systems become increasingly powerful, might there be signs that the systems are drifting out of alignment with human interests — not because they are trying to, but because they are advancing proxies of success that don’t quite line up with what’s actually good for people.
- In the cultural sphere, for example, media companies might deploy AI to create increasingly addictive content, reshaping human preferences. What begins as entertainment evolves into persuasion technology that can shape political outcomes, diminishing democratic control.
Once humans start losing power in these ways, they may irreversibly have less and less ability to influence the future course of events. Eventually, their needs may not be addressed at all by the most powerful global actors. In the most extreme case, the species as we know it may not survive.
Many other scenarios are possible.
There are some versions of apparent “disempowerment” that could look like a utopia: humans flourishing and happy in a society expertly managed and fundamentally controlled by benevolent AI systems. Or maybe one day, humanity will decide it’s happy to cede the future to AI systems that we consider worthy descendants.
But the risk is that humanity could “hand over” control unintentionally and in a way that few of us would endorse. We might be gradually replaced by AI systems with no conscious experiences, or the future may eventually be dominated by fierce Darwinian competition between various digital agents. That could mean the future is sapped of most value — a catastrophic loss.
We want to better understand these dynamics and risks to increase the prospects that the future goes well.
How pressing is this issue?
We feel very uncertain about how likely various gradual disempowerment scenarios are. It is difficult to disentangle the possibilities from related risks of power-seeking AI systems and questions about the moral status of digital minds, which are also hard to be certain about.
Because the area is steeped in uncertainty, it’s unclear what the best interventions are. We think more work should be done to understand this problem and its potential solutions at least — and it’s likely some people should be focusing on it.
What are the arguments against this being a pressing problem?
There are several reasons you might not think this problem is very pressing:
- You might think it will be solved by default, because if we avoid other risks from AI, advanced AI systems will help us navigate these problems.
- You might think it’s very unlikely that AI systems, if not actively scheming against us, will end up contributing to an existential catastrophe for humanity — even if there are some problems of disempowerment. This might make you think this is a real issue, but not nearly as big as other existential risks from AI.
- You might think there just aren’t good solutions to this problem.
- You might think the gradual disempowerment of humanity wouldn’t constitute an existential catastrophe. For example, perhaps such a future would be good, or nearly as good as the alternatives.
What can you do to help?
Given the relatively limited state of our knowledge on this topic, we’d guess the best way to help with this problem is to carry out more research to understand it better. (Read more about research skills.)
Backgrounds in philosophy, history, economics, sociology, and political science — in addition to machine learning and AI — may be particularly relevant.
You might want to work in academia, think tanks, or at nonprofit research institutions.
At some point, if we have a better understanding of threat models and potential solutions, it will likely be important to have people working in AI governance and policy who are focused on reducing these risks. So pursuing a career in AI governance, while building an understanding of this emerging area of research as well as the other major AI risks, may be a promising strategy for eventually helping to reduce the risk of gradual disempowerment.
Kulveit et al. suggest some approaches to mitigating the risk of gradual disempowerment, including:
- Measuring and monitoring:
  - Develop metrics to track human and AI influence in economic, cultural, and political systems
  - Make plans to identify warning signs of potential disempowerment
- Preventing excessive AI influence:
  - Implement regulatory frameworks requiring human oversight
  - Apply progressive taxation on AI-generated revenues
  - Establish cultural norms supporting human agency
- Strengthening human control:
  - Create more robust democratic processes
  - Ensure that AI systems remain understandable to humans
  - Develop AI delegates that represent human interests while remaining competitive
- System-wide alignment:
  - Research “ecosystem alignment” that maintains human values within complex socio-technical systems
  - Develop frameworks for aligning civilisation-wide interactions between humans and AI
Key organisations in this space
We don’t know of organisations that focus fully on this particular problem, though some organisations working on related questions in AI safety and governance may do relevant research.
You can also explore roles at other organisations that work on AI safety and policy.
Our job board features opportunities in AI safety and policy.
Learn more
- Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development by Jan Kulveit et al.
- Two Types of AI Existential Risk: Decisive and Accumulative by Atoosa Kasirzadeh
- What failure looks like by Paul Christiano
- Will Humanity Choose Its Future? by Guive Assadi
- Natural Selection Favors AIs over Humans by Dan Hendrycks
- TASRA: a Taxonomy and Analysis of Societal-Scale Risks from AI by Andrew Critch and Stuart Russell
- Our interview with Carl Shulman on the economy and national security after AGI, which talks about why humanity seems likely to hand over more control to AI systems
- Our interview with Will MacAskill on AI causing a “century in a decade” — and how we’re completely unprepared
- Our article on longtermism
Read next: Explore other pressing world problems
Want to learn more about global issues we think are especially pressing? See our list of issues that are large in scale, solvable, and neglected, according to our research.
Notes and references
- There are many cases where societal systems produce outcomes that are clearly bad for many humans, such as carrying out wars or causing harmful pollution. But overall, humanity has so far been able to greatly expand its population, become richer, and extend the average life span because societal systems tend to serve human interests on net.↩