We’re excited to announce that 80,000 Hours has completed its spin-out from Effective Ventures (EV) and is now operating as an independent organisation. We announced this decision here in December 2023. We’re deeply grateful to the Effective Ventures leadership and team for their support, especially during the complex transition process over the past year.
Our new structure
We’ve established two new UK entities, each with their own board:
80,000 Hours Limited — this is a nonprofit entity that houses our website, podcast, job board, one-on-one service, and our operations.
80,000 Hours Foundation — this is a registered charity that will facilitate donations and own the 80k intellectual property.
Our new boards
Board of Directors (80,000 Hours Limited):
Konstantin Sietzy — Deputy Director of Talent and Operations at UK AISI
Alex Lawsen — Senior Program Associate at Open Philanthropy and former 80,000 Hours Advising Manager
Anna Weldon — COO at the Centre for Effective Altruism (CEA) and former EV board member
What if there’s something it’s like to be a shrimp — or a chatbot?
For centuries, humans have debated the nature of consciousness, often placing ourselves at the very top. But what about the minds of others — both the animals we share this planet with and the artificial intelligences we’re creating?
We’ve pulled together clips from past conversations with researchers and philosophers who’ve spent years trying to make sense of animal consciousness, artificial sentience, and moral consideration under deep uncertainty.
You’ll hear from:
Robert Long on how we might accidentally create artificial sentience (from episode #146)
Jeff Sebo on when we should extend moral consideration to digital beings — and what that would even look like (#173)
Jonathan Birch on what we should learn from the cautionary tale of newborn pain, and other “edge cases” of sentience (#196)
Andrés Jiménez Zorrilla on what it’s like to be a shrimp (80k After Hours)
Meghan Barrett on challenging our assumptions about insects’ experiences (#198)
David Chalmers on why artificial consciousness is entirely possible (#67)
Holden Karnofsky on how we’ll see digital people as… people (#109)
Sébastien Moro on the surprising sophistication of fish cognition and behaviour (#205)
Bob Fischer on how to compare the moral weight of a chicken to that of a human (#182)
Cameron Meyer Shorb on the vast scale of potential wild animal suffering (#210)
Lewis Bollard on how animal advocacy has evolved in response to sentience research (#185)
Anil Seth on the neuroscientific theories of consciousness (#206)
Peter Godfrey-Smith on whether we could upload ourselves to machines (#203)
Buck Shlegeris on whether AI control strategies make humans the bad guys (#214)
Stuart Russell on the moral rights of AI systems (#80)
Will MacAskill on how to integrate digital beings into society (#213)
Carl Shulman on collaboratively sharing the world with digital minds (#191)
Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong
Additional content editing: Katy Moore and Milo McGuire
Transcriptions and web: Katy Moore
OpenAI’s recent announcement that its nonprofit would “retain control” of its for-profit business sounds reassuring. But this seemingly major concession, celebrated by so many, is in itself largely meaningless.
Litigator Tyler Whitmer is a coauthor of a newly published letter that describes this attempted sleight of hand and directs regulators on how to stop it.
As Tyler explains, the plan both before and after this announcement has been to convert OpenAI into a Delaware public benefit corporation (PBC) — and this alone will dramatically weaken the nonprofit’s ability to direct the business in pursuit of its charitable purpose: ensuring AGI is safe and “benefits all of humanity.”
Right now, the nonprofit directly controls the business. But were OpenAI to become a PBC, the nonprofit, rather than having its “hand on the lever,” would merely help decide who does.
Why does this matter? Today, if OpenAI’s commercial arm were about to release an unhinged AI model that might make money but be bad for humanity, the nonprofit could directly intervene to stop it. In the proposed new structure, it likely couldn’t do much at all.
But it’s even worse than that: even if the nonprofit could select the PBC’s directors, those directors would have fundamentally different legal obligations from those of the nonprofit. A PBC director must balance public benefit with the interests of profit-driven shareholders — by default, they cannot legally prioritise public interest over profits, even if they and the controlling shareholder that appointed them want to do so.
As Tyler points out, there isn’t a single reported case of a shareholder successfully suing to enforce a PBC’s public benefit mission in the 10+ years since the Delaware PBC statute was enacted.
This extra step from the nonprofit to the PBC would also mean that the attorneys general of California and Delaware — who today are empowered to ensure the nonprofit pursues its mission — would find themselves powerless to act. These are probably not side effects, but rather a Trojan horse that for-profit investors are trying to slip past regulators.
Fortunately this can all be addressed — but it requires either the nonprofit board or the attorneys general of California and Delaware to promptly put their foot down and insist on watertight legal agreements that preserve OpenAI’s current governance safeguards and enforcement mechanisms.
As Tyler explains, the same arrangements that currently bind the OpenAI business have to be written into a new PBC’s certificate of incorporation — something that won’t happen by default and that powerful investors have every incentive to resist.
Without these protections, OpenAI’s proposed new structure wouldn’t “fix” anything. It would be a ruse that preserved the appearance of nonprofit control while gutting its substance.
Listen to our conversation with Tyler Whitmer to understand what’s at stake, and what the AGs and board members must do to ensure OpenAI remains committed to developing artificial general intelligence that benefits humanity rather than just investors.
This episode was originally recorded on May 13, 2025.
Video editing: Simon Monsour and Luke Monsour
Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong
Music: Ben Cordell
Transcriptions and web: Katy Moore
When attorneys general intervene in corporate affairs, it usually means something has gone seriously wrong. In OpenAI’s case, their intervention appears to have forced a dramatic reversal of the company’s plans to sideline its nonprofit foundation — a reversal announced in a blog post that made headlines worldwide.
The company’s sudden announcement that its nonprofit will “retain control” credits “constructive dialogue” with the attorneys general of California and Delaware — corporate-speak for what was likely a far more consequential confrontation behind closed doors. A confrontation perhaps driven by public pressure from Nobel Prize winners, past OpenAI staff, and community organisations.
But whether this change will help depends entirely on the details of implementation — details that remain worryingly vague in the company’s announcement.
Return guest Rose Chan Loui, nonprofit law expert at UCLA, sees potential in OpenAI’s new proposal, but emphasises that “control” must be carefully defined and enforced: “The words are great, but what’s going to back that up?” Without explicitly defining the nonprofit’s authority over safety decisions, the shift could be largely cosmetic.
Why have state officials taken such an interest so far? Host Rob Wiblin notes, “OpenAI was proposing that the AGs would no longer have any say over what this super momentous company might end up doing. … It was just crazy how they were suggesting that they would take all of the existing money and then pursue a completely different purpose.”
Now that they’re in the picture, the AGs have leverage to ensure the nonprofit maintains genuine control over issues of public safety as OpenAI develops increasingly powerful AI.
Rob and Rose explain three key areas where the AGs can make a huge difference to whether this plays out in the public’s best interest:
Ensuring that the contractual agreements giving the nonprofit control over the new Delaware public benefit corporation are watertight, and don’t accidentally shut the AGs out of the picture.
Insisting that a majority of board members are truly independent by prohibiting indirect as well as direct financial stakes in the business.
Insisting that the board is empowered with the money, independent staffing, and access to information it needs to do its job.
This episode was originally recorded on May 6, 2025.
Video editing: Simon Monsour and Luke Monsour
Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong
Music: Ben Cordell
Transcriptions and web: Katy Moore
Want to get up to speed on the state of AI development and the risks it poses? Our site provides an overview of key topics in this area, but obviously there’s a lot more to learn.
We recommend starting with the following blog posts and research papers. (Note: we don’t necessarily agree with all the claims the authors make, but still think they’re great resources.)
The article concisely explains how AI has gotten better in recent years primarily by scaling up existing systems rather than by making more fundamental scientific advances.
Read this to understand why it’s plausible that AI systems could pose a threat to humanity if they were powerful enough and doing so would further their goals.
It’s important to understand why there’s enthusiasm for building powerful AI systems, despite the risks. This post from an AI company CEO paints a positive vision for powerful AI.
When you have a system where ministers almost never understand their portfolios, civil servants change jobs every few months, and MPs don’t grasp parliamentary procedure even after decades in office — is the problem the people, or the structure they work in?
Today’s guest, political journalist Ian Dunt, studies the systemic reasons governments succeed and fail.
And in his book How Westminster Works …and Why It Doesn’t, he argues that Britain’s government dysfunction and multi-decade failure to solve its key problems stem primarily from bad incentives and bad processes. Even brilliant, well-intentioned people are set up to fail by a long list of institutional absurdities.
For instance:
Ministerial appointments in complex areas like health or defence typically go to whoever can best shore up the prime minister’s support within their own party and prevent a leadership challenge, rather than to people with any experience in the area.
On average, ministers are removed after just two years, so the few who manage to learn their brief are typically gone just as they’re becoming effective. In the middle of a housing crisis, Britain went through 25 housing ministers in 25 years.
Ministers are expected to make some of their most difficult decisions by reading paper memos out of a ‘red box’ while exhausted, at home, after dinner.
Tradition demands that the country be run from a cramped Georgian townhouse: 10 Downing Street. Few staff fit and teams are split across multiple floors. Meanwhile, the country’s most powerful leaders vie to control the flow of information to and from the prime minister via ‘professionalised loitering’ outside their office.
Civil servants are paid too little to retain those with technical skills, who can earn several times as much in the private sector. For those who do want to stay, the only way to get promoted is to move departments — abandoning any area-specific knowledge they’ve accumulated.
As a result, senior civil servants handling complex policy areas have a median time in role as low as 11 months. Turnover in the Treasury has regularly been 25% annually — comparable to a McDonald’s restaurant.
MPs are chosen by local party members overwhelmingly on the basis of being ‘loyal party people,’ while the question of whether they are good at understanding or scrutinising legislation (their supposed constitutional role) simply never comes up.
The end result is that very few of the most powerful people in British politics have much idea what they’re actually doing. As Ian puts it, the country is at best run by a cadre of “amateur generalists.”
While some of these are uniquely British failings, many others are recurring features of governments around the world, and similar dynamics can arise in large corporations as well.
But as Ian also lays out, most of these absurdities have natural fixes, and in every case some countries have found structural solutions that help ensure decisions are made by the right people, with the information they need, and that success is rewarded.
This episode was originally recorded on January 30, 2025.
Video editing: Simon Monsour
Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong
Music: Ben Cordell
Camera operator: Jeremy Chevillotte
Transcriptions and web: Katy Moore
How do you navigate a career path when the future of work is uncertain? How important is mentorship versus immediate impact? Is it better to focus on your strengths or on the world’s most pressing problems? Should you specialise deeply or develop a unique combination of skills?
From embracing failure to finding unlikely allies, we bring you 16 diverse perspectives from past guests who’ve found unconventional paths to impact and helped others do the same.
You’ll hear from:
Michael Webb on using AI as a career advisor and the human skills AI can’t replace (from episode #161)
Holden Karnofsky on kicking ass in whatever you do, and which weird ideas are worth betting on (#109, #110, and #158)
Chris Olah on how intersections of particular skills can be a wildly valuable niche (#108)
Michelle Hutchinson on understanding what truly motivates you (#75)
Benjamin Todd on how to make tough career decisions and deal with rejection (#71 and 80k After Hours)
Jeff Sebo on what improv comedy teaches us about doing good in the world (#173)
Spencer Greenberg on recognising toxic people who could derail your career (#183)
Dean Spears on embracing randomness and serendipity (#186)
Karen Levy on finding yourself through travel (#124)
Leah Garcés on finding common ground with unlikely allies (#99)
Hannah Ritchie on being selective about whose advice you follow (#160)
Pardis Sabeti on prioritising physical health (#104)
Sarah Eustis-Guthrie on knowing when to pivot from your current path (#207)
Danny Hernandez on setting triggers for career decisions (#78)
Varsha Venugopal on embracing uncomfortable situations (#113)
Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong
Content editing: Katy Moore and Milo McGuire
Transcriptions and web: Katy Moore
Problem profile by Cody Fenwick · Published April 24th, 2025
Advanced AI technology may enable its creators, or others who control it, to attempt and achieve unprecedented societal power grabs. Under certain circumstances, they could use these systems to take control of whole economies, militaries, and governments.
This kind of power grab by a single person or small group would pose a major threat to the rest of humanity.
Since we started investing much more in growth in 2022, we’ve increased the hours that people spend engaging with our content by 6.5x, reached millions of new users across different platforms, and now have over 500,000 newsletter subscribers. We’re also the largest single source of people getting involved in the effective altruism community, according to the most recent EA Survey.
Even so, it seems like there’s considerable room to reach more people — and there are many exciting growth projects we’re unable to take on because of low capacity on our team. So, we’re looking for a new Engagement Specialist to help us ambitiously increase engagement with our advice — and, ultimately, our impact.
We anticipate that the right person in this role could help us massively increase our readership, and lead to hundreds or thousands of additional people pursuing high-impact careers.
As some indication of what success in the role might look like, over the next couple of years you might have:
Cost-effectively deployed $5 million to reach people in our target audience.
Reached hundreds of millions of people on social media with key messages.
Partnered with some of the largest and most well-regarded YouTube channels (for instance, we have run sponsorships with Veritasium,
Podcast by Robert Wiblin · Published April 16th, 2025
Throughout history, technological revolutions have fundamentally shifted the balance of power in society. The Industrial Revolution created conditions where democracies could dominate for the first time — as nations needed educated, informed, and empowered citizens to deploy advanced technologies and remain competitive.
Unfortunately there’s every reason to think artificial general intelligence (AGI) will reverse that trend.
In a new paper published today, Tom Davidson — senior research fellow at the Forethought Centre for AI Strategy — argues that advanced AI systems will enable unprecedented power grabs by tiny groups of people, primarily by removing the need for other human beings to participate.
When a country’s leaders no longer need citizens for economic production, or to serve in the military, there’s much less need to share power with them. “Over the broad span of history, democracy is more the exception than the rule,” Tom points out. “With AI, it will no longer be important to a country’s competitiveness to have an empowered and healthy citizenship.”
Citizens in established democracies are not typically that concerned about coups. We doubt anyone will try, and if they do, we expect human soldiers to refuse to join in. Unfortunately, the AI-controlled military systems of the future will lack those inhibitions. As Tom lays out, “Human armies today are very reluctant to fire on their civilians. If we get instruction-following AIs, then those military systems will just fire.”
Why would AI systems follow the instructions of a would-be tyrant? One answer is that, as militaries worldwide race to incorporate AI to remain competitive, they risk leaving the door open for exploitation by malicious actors in a few ways:
AI systems could be programmed to simply follow orders from the top of the chain of command, without any checks on that power — potentially handing total power indefinitely to any leader willing to abuse that authority.
Superior cyber capabilities could enable small groups to hack into and take full control of AI-operated military infrastructure.
It’s also possible that the companies with the most advanced AI, if it conveyed a significant enough advantage over competitors, could quickly develop armed forces sufficient to overthrow an incumbent regime. History suggests that as few as 10,000 obedient military drones could be sufficient to kill competitors, take control of key centres of power, and make your success a fait accompli.
Without active effort spent mitigating risks like these, it’s reasonable to fear that AI systems will destabilise the current equilibrium that enables the broad distribution of power we see in democratic nations.
In this episode, host Rob Wiblin and Tom discuss new research on the question of whether AI-enabled coups are likely, and what we can do about it if they are, as well as:
Whether preventing coups and preventing ‘rogue AI’ require opposite interventions, leaving us in a bind
Whether open sourcing AI weights could be helpful, rather than harmful, for advancing AI safely
Why risks of AI-enabled coups have been relatively neglected in AI safety discussions
How persuasive AGI will really be
How many years we have before these risks become acute
The minimum number of military robots needed to stage a coup
This episode was originally recorded on January 20, 2025.
Video editing: Simon Monsour
Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong
Camera operator: Jeremy Chevillotte
Transcriptions and web: Katy Moore
Help make spectacular videos that reach a huge audience.
We’re looking for a contract video editor who can quickly learn our style and make our videos successful on shortform video platforms. We want these videos to start changing and informing the conversation about transformative AI and AGI.
Why this role?
In 2025, 80,000 Hours is planning to focus especially on helping explain why and how our audience can help society safely navigate a transition to a world with transformative AI. Right now not nearly enough people are talking about these ideas and their implications.
A great video program could change this. Time spent on the internet is increasingly spent watching video, and for many people in our target audience, video is the main way that they both find entertainment and learn about topics that matter to them.
To get our video program off the ground, we need great editors who understand our style and vision and can work quickly and to a high standard.
Responsibilities
Be able to work at least 10 hours a week
Be able to turn around drafts of edited shortform videos in 24-48 hours
Take feedback well
Learn our style and adapt to it quickly
About you
We’re looking for someone who ideally has:
Experience making shortform videos
Experience with CapCut, Descript, or similar
Good taste in shortform video
Knowledge of the current trends and what succeeds in shortform video
The ability to work quickly and take feedback well
Familiarity with AI safety
If you don’t have experience here but think you’d be a great fit,
Many people — with a diverse range of skills and experience — are urgently needed to help mitigate these risks.
I think you should consider making this the focus of your career.
This article explains why.
1) World-changing AI systems could come much sooner than people expect
In an earlier article, I explained why there’s a significant chance that AI could contribute to scientific research or automate many jobs by 2030. Current systems can already do a lot, and there are clear ways to continue improving them in the coming years. Forecasters and experts widely agree that the probability of widespread disruption is much higher than it was even just a couple of years ago.
AI systems are rapidly becoming more autonomous, as measured by the METR time horizon benchmark. The most recent models, such as o3, seem to be on an even faster trend that started in 2024.
2) The impact on society could be explosive
People say AI will be transformative, but few really get just how wild it could be.
What happens when your desire to do good starts to undermine your own wellbeing?
Over the years, we’ve heard from therapists, charity directors, researchers, psychologists, and career advisors — all wrestling with how to do good without falling apart. Today’s episode brings together insights from 16 past guests on the emotional and psychological costs of pursuing a high-impact career to improve the world — and how to best navigate the all-too-common guilt, burnout, perfectionism, and imposter syndrome along the way.
You’ll hear from:
80,000 Hours’ former CEO on managing anxiety, self-doubt, and a chronic sense of falling short (from episode #100)
Randy Nesse on why we evolved to be anxious and depressed (episode #179)
Hannah Boettcher on how ‘optimisation framing’ can quietly distort our sense of self-worth (from our 80k After Hours feed)
Mental Health Navigator is a service that simplifies finding and accessing mental health information and resources all over the world — built specifically for the effective altruism community.
Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong
Content editing: Katy Moore and Milo McGuire
Transcriptions and web: Katy Moore
Podcast by Robert Wiblin · Published April 4th, 2025
Most AI safety conversations centre on alignment: ensuring AI systems share our values and goals. But despite progress, we’re unlikely to know we’ve solved the problem before the arrival of human-level and superhuman systems in as little as three years.
So some are developing a backup plan to safely deploy models we fear are actively scheming to harm us — so-called “AI control.” While this may sound mad, given the reluctance of AI companies to delay deploying anything they train, not developing such techniques is probably even crazier.
Today’s guest — Buck Shlegeris, CEO of Redwood Research — has spent the last few years developing control mechanisms, and for human-level systems they’re more plausible than you might think. He argues that given companies’ unwillingness to incur large costs for security, accepting the possibility of misalignment and designing robust safeguards might be one of our best remaining options.
Buck asks us to picture a scenario where, in the relatively near future, AI companies are employing 100,000 AI systems running 16 times faster than humans to automate AI research itself. These systems would need dangerous permissions: the ability to run experiments, access model weights, and push code changes. As a result, a misaligned system could attempt to hack the data centre, exfiltrate weights, or sabotage research — making misalignment in such a world extremely dangerous.
But in the absence of a method for reliably aligning frontier AIs, Buck argues for implementing practical safeguards to prevent catastrophic outcomes. His team has been developing and testing a range of straightforward, cheap techniques to detect and prevent risky behaviour by AIs — such as auditing AI actions with dumber but trusted models, replacing suspicious actions, and asking questions repeatedly to catch randomised attempts at deception.
Most importantly, these methods are designed to be cheap and shovel-ready. AI control focuses on harm reduction using practical techniques — techniques that don’t require new, fundamental breakthroughs before companies could reasonably implement them, and that don’t ask us to forgo the benefits of deploying AI.
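To make the idea concrete, here’s a minimal sketch of the kind of trusted-monitoring loop described above: a cheaper, trusted model scores each action proposed by the powerful but untrusted model, suspicious actions are replaced and logged for human audit, and a small random sample of ordinary-looking actions is audited too, so randomised attempts at deception still risk being caught. All names, thresholds, and stubbed model calls here are hypothetical illustrations, not Redwood Research’s actual implementation.

```python
# Hypothetical sketch of a trusted-monitoring control loop.
# The model calls are stubbed out; thresholds are made-up values for illustration.

import random

SUSPICION_THRESHOLD = 0.7  # assumed cutoff; a real system would tune this empirically
AUDIT_RATE = 0.02          # assumed fraction of routine actions sent for human review

def untrusted_propose(task: str) -> str:
    """Stand-in for the powerful, untrusted model proposing its next action."""
    return f"run_experiment({task!r})"

def trusted_suspicion_score(action: str) -> float:
    """Stand-in for the dumber, trusted model rating how suspicious an action looks (0-1)."""
    return 0.9 if "exfiltrate" in action else 0.1

def trusted_fallback(task: str) -> str:
    """Stand-in for a conservative replacement action proposed by the trusted model."""
    return f"request_human_review({task!r})"

def control_step(task: str, audit_log: list) -> str:
    """Decide what actually gets executed for one step of the task."""
    action = untrusted_propose(task)
    score = trusted_suspicion_score(action)

    if score > SUSPICION_THRESHOLD:
        # Block and replace suspicious actions, and record them for human auditing.
        audit_log.append((task, action, score))
        return trusted_fallback(task)

    if random.random() < AUDIT_RATE:
        # Audit a random sample of normal-looking actions as well.
        audit_log.append((task, action, score))

    return action

if __name__ == "__main__":
    log = []
    print(control_step("tune hyperparameters", log))
    print(control_step("exfiltrate model weights", log))
    print("flagged for audit:", log)
```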
As Buck puts it:
Five years ago I thought of misalignment risk from AIs as a really hard problem that you’d need some really galaxy-brained fundamental insights to resolve. Whereas now, to me the situation feels a lot more like we just really know a list of 40 things where, if you did them — none of which seem that hard — you’d probably be able to not have very much of your problem.
Of course, even if Buck is right, we still need to do those 40 things — which he points out we’re not on track for. And AI control agendas have their limitations: they aren’t likely to work once AI systems are much more capable than humans, since greatly superhuman AIs can probably work around whatever limitations we impose.
Still, AI control agendas seem to be gaining traction within AI safety. Buck and host Rob Wiblin discuss all of the above, plus:
Why he’s more worried about AI hacking its own data centre than escaping
What to do about “chronic harm,” where AI systems subtly underperform or sabotage important work like alignment research
Why he might want to use a model he thought could be conspiring against him
Why he would feel safer if he caught an AI attempting to escape
Why many control techniques would be relatively inexpensive
How to use an untrusted model to monitor another untrusted model
What the minimum viable intervention in a “lazy” AI company might look like
How even small teams of safety-focused staff within AI labs could matter
The moral considerations around controlling potentially conscious AI systems, and whether it’s justified
This episode was originally recorded on February 21, 2025.
Video: Simon Monsour and Luke Monsour
Audio engineering: Ben Cordell, Milo McGuire, and Dominic Armstrong
Transcriptions and web: Katy Moore
Problem profile by Cody Fenwick · Published April 4th, 2025
The proliferation of advanced AI systems may lead to the gradual disempowerment of humanity, even if efforts to prevent them from becoming power-seeking or scheming are successful. Humanity may be incentivised to hand over increasing amounts of control to AIs, giving them power over the economy, politics, culture, and more. Over time, humanity’s interests may be sidelined and our control over the future undermined, potentially constituting an existential catastrophe.
There’s disagreement over how serious a problem this is and how it relates to other concerns about AI alignment. It’s also unclear, if this is a genuine risk, what we could do about it. But we think it’s potentially very important, and more people should work on clarifying the issue and perhaps figuring out how to address it.
Since 2016, we’ve ranked ‘risks from artificial intelligence’ as our top pressing problem. Whilst we’ve provided research and support on how to work on reducing AI risks since that point (and before!), our level of investment has varied over time and across programmes.
We think we should consolidate our effort and focus because:
We think that AGI by 2030 is plausible — and this is much sooner than most of us would have predicted five years ago. This is far from guaranteed, but we think the view is compelling based on analysis of the current flow of inputs into AI development and the speed of recent AI progress. You can read about this argument in far more detail in this new article.
We are in a window of opportunity to influence AGI, before laws and norms are set in place.
80,000 Hours has an opportunity to help more people take advantage of this window. We want our strategy to be responsive to changing events in the world, and we think that prioritising reducing risks from AI is probably the best way to achieve our high-level, cause-impartial goal of doing the most good for others over the long term by helping people have high-impact careers. We expect the landscape to move faster in the coming years, so we’ll need a faster-moving culture to keep up.
What happens when a USB cable can secretly control your system? Are we hurtling toward a security nightmare as critical infrastructure connects to the internet? Is it possible to secure AI model weights from sophisticated attackers? And could AI actually make computer security better rather than worse?
With AI security concerns becoming increasingly urgent, we bring you insights from 15 top experts across information security, AI safety, and governance, examining the challenges of protecting our most powerful AI models and digital infrastructure — including a sneak peek from a not-yet-released episode with Tom Davidson, in which he explains why we should be more worried about “secret loyalties” in AI agents.
You’ll hear:
Holden Karnofsky on why every good future relies on strong infosec, and how hard it’s been to hire security experts (from episode #158)
Tantum Collins on why infosec might be the rare issue everyone agrees on (episode #166)
Nick Joseph on whether AI companies can develop frontier models safely with the current state of information security (episode #197)
Sella Nevo on why AI model weights are so valuable to steal, the weaknesses of air-gapped networks, and the risks of USBs (episode #195)
Kevin Esvelt on what cryptographers can teach biosecurity experts (episode #164)
Lennart Heim on Rob’s computer security nightmares (episode #155)
Zvi Mowshowitz on the insane lack of security mindset at some AI companies (episode #184)
Nova DasSarma on the best current defences against well-funded adversaries, politically motivated cyberattacks, and exciting progress in infosecurity (episode #132)
Bruce Schneier on whether AI could eliminate software bugs for good, and why it’s bad to hook everything up to the internet (episode #64)
Nita Farahany on the dystopian risks of hacked neurotech (episode #174)
Vitalik Buterin on how cybersecurity is the key to defence-dominant futures (episode #194)
Nathan Labenz on how even internal teams at AI companies may not know what they’re building (episode #176)
Allan Dafoe on backdooring your own AI to prevent theft (episode #212)
Tom Davidson on how dangerous “secret loyalties” in AI models could be (episode to be released!)
Carl Shulman on the challenge of trusting foreign AI models (episode #191, part 2)
Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong
Content editing: Katy Moore and Milo McGuire
Transcriptions and web: Katy Moore
We expect there will be substantial progress in AI in the coming years, potentially even to the point where machines come to outperform humans in many, if not all, tasks. This could have enormous benefits, helping to solve currently intractable global problems, but could also pose severe risks. These risks could arise accidentally (for example, if we don’t find technical solutions to concerns about the safety of AI systems), or deliberately (for example, if AI systems worsen geopolitical conflict). We think more work needs to be done to reduce these risks.
Some of these risks from advanced AI could be existential — meaning they could cause human extinction, or an equally permanent and severe disempowerment of humanity. There have not yet been any satisfying answers to concerns — discussed below — about how this rapidly approaching, transformative technology can be safely developed and integrated into our society. Finding answers to these concerns is neglected and may well be tractable. We estimated that there were around 400 people worldwide working directly on this in 2022, though we believe that number has grown. As a result, the possibility of AI-related catastrophe may be the world’s most pressing problem — and the best thing to work on for those who are well-placed to contribute.
Promising options for working on this problem include technical research on how to create safe AI systems, strategy research into the particular risks AI might pose, and policy research into ways in which companies and governments could mitigate these risks. As policy approaches continue to be developed and refined, we need people to put them in place and implement them. There are also many opportunities to have a big impact in a variety of complementary roles, such as operations management, journalism, earning to give, and more — some of which we list below.
AI guide by Benjamin Todd · Published March 21st, 2025
In recent months, the CEOs of leading AI companies have grown increasingly confident about rapid progress:
OpenAI’s Sam Altman: Shifted from saying in November “the rate of progress continues” to declaring in January “we are now confident we know how to build AGI”
Anthropic’s Dario Amodei: Stated in January “I’m more confident than I’ve ever been that we’re close to powerful capabilities… in the next 2-3 years”
Google DeepMind’s Demis Hassabis: Changed from “as soon as 10 years” in autumn to “probably three to five years away” by January.
What explains the shift? Is it just hype? Or could we really have Artificial General Intelligence (AGI) by 2030?
In this article, I look at what’s driven recent progress, estimate how far those drivers can continue, and explain why they’re likely to continue for at least four more years.
In particular, while in 2024 progress in LLM chatbots seemed to slow, a new approach started to work: teaching the models to reason using reinforcement learning.
In just a year, this let them surpass human PhDs at answering difficult scientific reasoning questions, and achieve expert-level performance on one-hour coding tasks.
We don’t know how capable AI will become, but extrapolating the recent rate of progress suggests that, by 2028, we could reach AI models with beyond-human reasoning abilities, expert-level knowledge in every domain,