#162 – Mustafa Suleyman on getting Washington and Silicon Valley to tame AI

Mustafa Suleyman was part of the trio that founded DeepMind, and his new AI project is building one of the world’s largest supercomputers to train a large language model on 10–100x the compute used to train ChatGPT.

But far from the stereotype of the incorrigibly optimistic tech founder, Mustafa is deeply worried about the future, for reasons he lays out in his new book The Coming Wave: Technology, Power, and the 21st Century’s Greatest Dilemma (coauthored with Michael Bhaskar). The future could be really good, but only if we grab the bull by the horns and solve the new problems technology is throwing at us.

On Mustafa’s telling, AI and biotechnology will soon be a huge aid to criminals and terrorists, empowering small groups to cause harm on previously unimaginable scales. Democratic countries have learned to walk a ‘narrow path’ between chaos on the one hand and authoritarianism on the other, avoiding the downsides that come from both extreme openness and extreme closure. AI could easily destabilise that present equilibrium, throwing us off dangerously in either direction. And ultimately, within our lifetimes humans may not need to work to live any more — or indeed, even have the option to do so.

And those are just three of the challenges confronting us. In Mustafa’s view, ‘misaligned’ AI that goes rogue and pursues its own agenda won’t be an issue for the next few years, and it isn’t a problem for the current style of large language models. But he thinks that at some point — in eight, ten, or twelve years — it will become an entirely legitimate concern, and says that we need to be planning ahead.

In The Coming Wave, Mustafa lays out a 10-part agenda for ‘containment’ — that is to say, for limiting the negative and unforeseen consequences of emerging technologies:

  1. Developing an Apollo programme for technical AI safety
  2. Instituting capability audits for AI models
  3. Buying time by exploiting hardware choke points
  4. Getting critics involved in directly engineering AI models
  5. Getting AI labs to be guided by motives other than profit
  6. Radically increasing governments’ understanding of AI and their capabilities to sensibly regulate it
  7. Creating international treaties to prevent proliferation of the most dangerous AI capabilities
  8. Building a self-critical culture in AI labs of openly accepting when the status quo isn’t working
  9. Creating a mass public movement that understands AI and can demand the necessary controls
  10. Not relying too much on delay, but instead seeking to move into a new somewhat-stable equilibrium

As Mustafa put it, “AI is a technology with almost every use case imaginable” and that will demand that, in time, we rethink everything.

Rob and Mustafa discuss the above, as well as:

  • Whether we should be open sourcing AI models
  • Whether Mustafa’s policy views are consistent with his timelines for transformative AI
  • How people with very different views on these issues get along at AI labs
  • The failed efforts (so far) to get a wider range of people involved in these decisions
  • Whether it’s dangerous for Mustafa’s new company to be training far larger models than GPT-4
  • Whether we’ll be blown away by AI progress over the next year
  • What mandatory regulations government should be imposing on AI labs right now
  • Appropriate priorities for the UK’s upcoming AI safety summit

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript.

Producer and editor: Keiran Harris
Audio Engineering Lead: Ben Cordell
Technical editing: Milo McGuire
Transcriptions: Katy Moore

#161 – Michael Webb on whether AI will soon cause job loss, lower incomes, and higher inequality — or the opposite

In today’s episode, host Luisa Rodriguez interviews economist Michael Webb of DeepMind, the British Government, and Stanford about how AI progress is going to affect people’s jobs and the labour market.

They cover:

  • The jobs most and least exposed to AI
  • Whether we’ll see mass unemployment in the short term
  • How long it took other technologies like electricity and computers to have economy-wide effects
  • Whether AI will increase or decrease inequality
  • Whether AI will lead to explosive economic growth
  • What we can learn from history, and reasons to think this time is different
  • Career advice for a world of LLMs
  • Why Michael is starting a new org to relieve talent bottlenecks through accelerated learning, and how you can get involved
  • Michael’s take as a musician on AI-generated music
  • And plenty more

If you’d like to work with Michael on his new org to radically accelerate how quickly people acquire expertise in critical cause areas, he’s now hiring! Check out Quantum Leap’s website.

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript.

Producer and editor: Keiran Harris
Audio Engineering Lead: Ben Cordell
Technical editing: Simon Monsour and Milo McGuire
Additional content editing: Katy Moore and Luisa Rodriguez
Transcriptions: Katy Moore

#160 – Hannah Ritchie on why it makes sense to be optimistic about the environment

In today’s episode, host Luisa Rodriguez interviews the head of research at Our World in Data — Hannah Ritchie — on the case for environmental optimism.

They cover:

  • Why agricultural productivity in sub-Saharan Africa could be so important, and how much better things could get
  • Her new book about how we could be the first generation to build a sustainable planet
  • Whether climate change is the most worrying environmental issue
  • How we reduced outdoor air pollution
  • Why Hannah is worried about the state of biodiversity
  • Solutions that address multiple environmental issues at once
  • How the world coordinated to address the hole in the ozone layer
  • Surprises from Our World in Data’s research
  • Psychological challenges that come up in Hannah’s work
  • And plenty more

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript.

Producer and editor: Keiran Harris
Audio Engineering Lead: Ben Cordell
Technical editing: Milo McGuire and Dominic Armstrong
Additional content editing: Katy Moore and Luisa Rodriguez
Transcriptions: Katy Moore

#159 – Jan Leike on OpenAI's massive push to make superintelligence safe in 4 years or less

In July, OpenAI announced a new team and project: Superalignment. The goal is to figure out how to make superintelligent AI systems aligned and safe to use within four years, and the lab is putting a massive 20% of its computational resources behind the effort.

Today’s guest, Jan Leike, is Head of Alignment at OpenAI and will be co-leading the project. As OpenAI puts it, “…the vast power of superintelligence could be very dangerous, and lead to the disempowerment of humanity or even human extinction. … Currently, we don’t have a solution for steering or controlling a potentially superintelligent AI, and preventing it from going rogue.”

Given that OpenAI is in the business of developing superintelligent AI, it sees that as a scary problem that urgently has to be fixed. So it’s not just throwing compute at the problem — it’s also hiring dozens of scientists and engineers to build out the Superalignment team.

Plenty of people are pessimistic that this can be done at all, let alone in four years. But Jan is guardedly optimistic. As he explains:

Honestly, it really feels like we have a real angle of attack on the problem that we can actually iterate on… and I think it’s pretty likely going to work, actually. And that’s really, really wild, and it’s really exciting. It’s like we have this hard problem that we’ve been talking about for years and years and years, and now we have a real shot at actually solving it. And that’d be so good if we did.

Jan thinks that this work is actually the most scientifically interesting part of machine learning. Rather than just throwing more chips and more data at a training run, this work requires actually understanding how these models work and how they think. The answers are likely to be breakthroughs on the level of solving the mysteries of the human brain.

The plan, in a nutshell, is to get AI to help us solve alignment. That might sound a bit crazy — as one person described it, “like using one fire to put out another fire.”

But Jan’s thinking is this: the core problem is that AI capabilities will keep getting better and the challenge of monitoring cutting-edge models will keep getting harder, while human intelligence stays more or less the same. To have any hope of ensuring safety, we need our ability to monitor, understand, and design ML models to advance at the same pace as the complexity of the models themselves.

And there’s an obvious way to do that: get AI to do most of the work, such that the sophistication of the AIs that need aligning, and the sophistication of the AIs doing the aligning, advance in lockstep.

Jan doesn’t want to produce machine learning models capable of doing ML research. But such models are coming, whether we like it or not. And at that point Jan wants to make sure we turn them towards useful alignment and safety work, as much as or more than we use them to advance AI capabilities.

Jan thinks it’s so crazy it just might work. But some critics think it’s simply crazy. They ask a wide range of difficult questions, including:

  • If you don’t know how to solve alignment, how can you tell that your alignment assistant AIs are actually acting in your interest rather than working against you? Especially as they could just be pretending to care about what you care about.
  • How do you know that these technical problems can be solved at all, even in principle?
  • At the point that models are able to help with alignment, won’t they also be so good at improving capabilities that we’re in the middle of an explosion in what AI can do?

In today’s interview host Rob Wiblin puts these doubts to Jan to hear how he responds to each, and they also cover:

  • OpenAI’s current plans to achieve ‘superalignment’ and the reasoning behind them
  • Why alignment work is the most fundamental and scientifically interesting research in ML
  • The kinds of people he’s excited to hire to join his team and maybe save the world
  • What most readers misunderstood about the OpenAI announcement
  • The three ways Jan expects AI to help solve alignment: mechanistic interpretability, generalization, and scalable oversight
  • What the standard should be for confirming whether Jan’s team has succeeded
  • Whether OpenAI should (or will) commit to stop training more powerful general models if they don’t think the alignment problem has been solved
  • Whether Jan thinks OpenAI has deployed models too quickly or too slowly
  • The many other actors who also have to do their jobs really well if we’re going to have a good AI future
  • Plenty more

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript below.

Producer and editor: Keiran Harris
Audio Engineering Lead: Ben Cordell
Technical editing: Simon Monsour and Milo McGuire
Additional content editing: Katy Moore and Luisa Rodriguez
Transcriptions: Katy Moore

#158 – Holden Karnofsky on how AIs might take over even if they're no smarter than humans, and his 4-part playbook for AI risk

Back in 2007, Holden Karnofsky cofounded GiveWell, where he sought out the charities that most cost-effectively helped save lives. He then cofounded Open Philanthropy, where he oversaw a team making billions of dollars’ worth of grants across a range of areas: pandemic control, criminal justice reform, farmed animal welfare, and making AI safe, among others. This year, having learned about AI for years and observed recent events, he’s narrowing his focus once again, this time on making the transition to advanced AI go well.

In today’s conversation, Holden returns to the show to share his overall understanding of the promise and the risks posed by machine intelligence, and what to do about it. That understanding has accumulated over around 14 years, during which he went from being sceptical that AI was important or risky, to making AI risks the focus of his work.

(As Holden reminds us, his wife is also the president of one of the world’s top AI labs, Anthropic, giving him both conflicts of interest and a front-row seat to recent events. For our part, Open Philanthropy is 80,000 Hours’ largest financial supporter.)

One point he makes is that people are too narrowly focused on AI becoming ‘superintelligent.’ While that could happen and would be important, it’s not necessary for AI to be transformative or perilous. Rather, machines with human levels of intelligence could end up being enormously influential simply if the amount of computer hardware globally were able to operate tens or hundreds of billions of them, in a sense making machine intelligences a majority of the global population, or at least a majority of global thought.

As Holden explains, he sees four key parts to the playbook humanity should use to guide the transition to very advanced AI in a positive direction: alignment research, standards and monitoring, creating a successful and careful AI lab, and finally, information security.

In today’s episode, host Rob Wiblin interviews return guest Holden Karnofsky about that playbook, as well as:

  • Why we can’t rely on just gradually solving those problems as they come up, the way we usually do with new technologies.
  • What multiple different groups can do to improve our chances of a good outcome — including listeners to this show, governments, computer security experts, and journalists.
  • Holden’s case against ‘hardcore utilitarianism’ and what actually motivates him to work hard for a better world.
  • What the ML and AI safety communities get wrong in Holden’s view.
  • Ways we might succeed with AI just by dumb luck.
  • The value of laying out imaginable success stories.
  • Why information security is so important and underrated.
  • Whether it’s good to work at an AI lab that you think is particularly careful.
  • The track record of futurists’ predictions.
  • And much more.

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript below.

Producer: Keiran Harris
Audio Engineering Lead: Ben Cordell
Technical editing: Simon Monsour and Milo McGuire
Transcriptions: Katy Moore

#157 – Ezra Klein on existential risk from AI and what DC could do about it

In Oppenheimer, scientists detonate a nuclear weapon despite thinking there’s some ‘near zero’ chance it would ignite the atmosphere, putting an end to life on Earth. Today, scientists working on AI think the chance their work puts an end to humanity is vastly higher than that.

In response, some have suggested we launch a Manhattan Project to make AI safe via enormous investment in relevant R&D. Others have suggested that we need international organisations modelled on those that slowed the proliferation of nuclear weapons. Others still seek a research slowdown by labs while an auditing and licensing scheme is created.

Today’s guest — journalist Ezra Klein of The New York Times — has watched policy discussions and legislative battles play out in DC for 20 years. Like many people he has also taken a big interest in AI this year, writing articles such as “This changes everything.” In his first interview on the show in 2021, he flagged AI as one topic that DC would regret not having paid more attention to.

So we invited him on to get his take on which regulatory proposals have promise, and which seem either unhelpful or politically unviable.

Out of the ideas on the table right now, Ezra favours a focus on direct government funding — both for AI safety research and to develop AI models designed to solve problems other than making money for their operators. He is sympathetic to legislation that would require AI models to be legible in a way that none currently are — and embraces the fact that that will slow down the release of models while businesses figure out how their products actually work.

By contrast, he’s pessimistic that it’s possible to coordinate countries around the world to agree to prevent or delay the deployment of dangerous AI models — at least not unless there’s some spectacular AI-related disaster to create such a consensus. And he fears attempts to require licences to train the most powerful ML models will struggle unless they can find a way to exclude and thereby appease people working on relatively safe consumer technologies rather than cutting-edge research.

From observing how DC works, Ezra expects that even a small community of experts in AI governance can have a large influence on how the US government responds to AI advances. But in Ezra’s view, that requires those experts to move to DC and spend years building relationships with people in government, rather than clustering elsewhere in academia and AI labs.

In today’s brisk conversation, Ezra and host Rob Wiblin cover the above as well as:

  • Whether it’s desirable to slow down AI research
  • The value of engaging with current policy debates even if they don’t seem directly important
  • Which AI business models seem more or less dangerous
  • Tensions between people focused on existing vs emergent risks from AI
  • Two major challenges of being a new parent

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript below.

Producer: Keiran Harris
Audio Engineering Lead: Ben Cordell
Technical editing: Milo McGuire
Transcriptions: Katy Moore

#156 – Markus Anderljung on how to regulate cutting-edge AI models

In today’s episode, host Luisa Rodriguez interviews the Head of Policy at the Centre for the Governance of AI — Markus Anderljung — about all aspects of policy and governance of superhuman AI systems.

They cover:

  • The need for AI governance, including self-replicating models and ChaosGPT
  • Whether or not AI companies will willingly accept regulation
  • The key regulatory strategies including licensing, risk assessment, auditing, and post-deployment monitoring
  • Whether we can be confident that people won’t train models covertly and ignore the licensing system
  • The progress we’ve made so far in AI governance
  • The key weaknesses of these approaches
  • The need for external scrutiny of powerful models
  • The emergent capabilities problem
  • Why it really matters where regulation happens
  • Advice for people wanting to pursue a career in this field
  • And much more.

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript below.

Producer: Keiran Harris
Audio Engineering Lead: Ben Cordell
Technical editing: Simon Monsour and Milo McGuire
Transcriptions: Katy Moore

#155 – Lennart Heim on the compute governance era and what has to come after

As AI advances ever more quickly, concerns about potential misuse of highly capable models are growing. From hostile foreign governments and terrorists to reckless entrepreneurs, the threat of AI falling into the wrong hands is top of mind for the national security community.

With growing concerns about the use of AI in military applications, the US has banned the export of certain types of chips to China.

But unlike the uranium required to make nuclear weapons, or the material inputs to a bioweapons programme, computer chips and machine learning models are absolutely everywhere. So is it actually possible to keep dangerous capabilities out of the wrong hands?

In today’s interview, Lennart Heim — who researches compute governance at the Centre for the Governance of AI — explains why limiting access to supercomputers may represent our best shot.

As Lennart explains, an AI research project requires many inputs, including the classic triad of compute, algorithms, and data.

If we want to limit access to the most advanced AI models, focusing on access to supercomputing resources — usually called ‘compute’ — might be the way to go. Both algorithms and data are hard to control because they live on hard drives and can be easily copied. By contrast, advanced chips are physical items that can’t be used by multiple people at once and come from a small number of sources.

According to Lennart, the hope would be to enforce AI safety regulations by controlling access to the most advanced chips specialised for AI applications. For instance, projects training ‘frontier’ AI models — the newest and most capable models — might only gain access to the supercomputers they need if they obtain a licence and follow industry best practices.

We have similar safety rules for companies that fly planes or manufacture volatile chemicals — so why not for people producing the most powerful and perhaps the most dangerous technology humanity has ever played with?

But Lennart is quick to note that the approach faces many practical challenges. Currently, AI chips are readily available and untracked. Changing that will require the collaboration of many actors, which might be difficult, especially given that some of them aren’t convinced of the seriousness of the problem.

Host Rob Wiblin is particularly concerned about a different challenge: the increasing efficiency of AI training algorithms. As these algorithms become more efficient, what once required a specialised AI supercomputer to train might soon be achievable with a home computer.

By that point, tracking every aggregation of compute that could prove to be very dangerous would be both impractical and invasive.

With only a decade or two left before that becomes a reality, the window during which compute governance is a viable solution may be a brief one. Top AI labs have already stopped publishing their latest algorithms, which might extend this ‘compute governance era’, but not for very long.

If compute governance is only a temporary phase between the era of difficult-to-train superhuman AI models and the time when such models are widely accessible, what can we do to prevent misuse of AI systems after that point?

Lennart and Rob both think the only enduring approach requires taking advantage of the AI capabilities that should be in the hands of police and governments — which will hopefully remain superior to those held by criminals, terrorists, or fools. But as they describe, this means maintaining a peaceful standoff between AI models with conflicting goals that can act and fight with one another on the microsecond timescale. Being far too slow to follow what’s happening — let alone participate — humans would have to be cut out of any defensive decision-making.

Both agree that while this may be our best option, such a vision of the future is more terrifying than reassuring.

Lennart and Rob discuss the above as well as:

  • How can we best categorise all the ways AI could go wrong?
  • Why did the US restrict the export of some chips to China and what impact has that had?
  • Is the US in an ‘arms race’ with China or is that more an illusion?
  • What is the deal with chips specialised for AI applications?
  • How is the ‘compute’ industry organised?
  • Downsides of using compute as a target for regulations
  • Could safety mechanisms be built into computer chips themselves?
  • Who would have the legal authority to govern compute if some disaster made it seem necessary?
  • The reasons Rob doubts that any of this stuff will work
  • Could AI be trained to operate as a far more severe computer worm than any we’ve seen before?
  • What does the world look like when sluggish human reaction times leave us completely outclassed?
  • And plenty more

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript below.

Producer: Keiran Harris
Audio mastering: Milo McGuire, Dominic Armstrong, and Ben Cordell
Transcriptions: Katy Moore

#154 – Rohin Shah on DeepMind and trying to fairly hear out both AI doomers and doubters

Can there be a more exciting and strange place to work today than a leading AI lab? Your CEO has said they’re worried your research could cause human extinction. The government is setting up meetings to discuss how this outcome can be avoided. Some of your colleagues think this is all overblown; others are more anxious still.

Today’s guest — machine learning researcher Rohin Shah — goes into the Google DeepMind offices each day with that peculiar backdrop to his work.

He’s on the team dedicated to maintaining ‘technical AI safety’ as these models approach and exceed human capabilities: basically that the models help humanity accomplish its goals without flipping out in some dangerous way. This work has never seemed more important.

In the short term it could be the key bottleneck to deploying ML models in high-stakes real-life situations. In the long term, it could be the difference between humanity thriving and disappearing entirely.

For years Rohin has been on a mission to fairly hear out people across the full spectrum of opinion about risks from artificial intelligence — from doomers to doubters — and properly understand their point of view. That makes him unusually well placed to give an overview of what we do and don’t understand. He has landed somewhere in the middle — troubled by ways things could go wrong, but not convinced there are very strong reasons to expect a terrible outcome.

Today’s conversation is wide-ranging and Rohin lays out many of his personal opinions to host Rob Wiblin, including:

  • What he sees as the strongest case both for and against slowing down the rate of progress in AI research.
  • Why he disagrees with most other ML researchers that training a model on a sensible ‘reward function’ is enough to get a good outcome.
  • Why he disagrees with many on LessWrong that the bar for whether a safety technique is helpful is “could this contain a superintelligence.”
  • That he thinks nobody has very compelling arguments that AI created via machine learning will be dangerous by default, or that it will be safe by default. He believes we just don’t know.
  • That he understands that analogies and visualisations are necessary for public communication, but is sceptical that they really help us understand what’s going on with ML models, because they’re different in important ways from every other case we might compare them to.
  • Why he’s optimistic about DeepMind’s work on scalable oversight, mechanistic interpretability, and dangerous capabilities evaluations, and what each of those projects involves.
  • Why he isn’t inherently worried about a future where we’re surrounded by beings far more capable than us, so long as they share our goals to a reasonable degree.
  • Why it’s not enough for humanity to know how to align AI models — it’s essential that management at AI labs correctly pick which methods they’re going to use and have the practical know-how to apply them properly.
  • Three observations that make him a little more optimistic: humans are a bit muddle-headed and not super goal-orientated; planes don’t crash; and universities have specific majors in particular subjects.
  • Plenty more besides.

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript below.

Producer: Keiran Harris
Audio mastering: Milo McGuire, Dominic Armstrong, and Ben Cordell
Transcriptions: Katy Moore

#153 – Elie Hassenfeld on two big picture critiques of GiveWell's approach, and six lessons from their recent work

GiveWell is one of the world’s best-known charity evaluators, with the goal of “searching for the charities that save or improve lives the most per dollar.” It mostly recommends projects that help the world’s poorest people avoid easily prevented diseases, like intestinal worms or vitamin A deficiency.

But should GiveWell, as some critics argue, take a totally different approach to its search, focusing instead on directly increasing subjective wellbeing, or alternatively, raising economic growth?

Today’s guest — cofounder and CEO of GiveWell, Elie Hassenfeld — is proud of how much GiveWell has grown in the last five years. Its ‘money moved’ has quadrupled to around $600 million a year.

Its research team has also more than doubled, enabling them to investigate a far broader range of interventions that could plausibly help people an enormous amount for each dollar spent. That work has led GiveWell to support dozens of new organisations, such as Kangaroo Mother Care, MiracleFeet, and Dispensers for Safe Water.

But some other researchers focused on figuring out the best ways to help the world’s poorest people say GiveWell shouldn’t just do more of the same thing, but rather ought to look at the problem differently.

Currently, GiveWell uses a range of metrics to track the impact of the organisations it considers recommending — such as ‘lives saved,’ ‘household incomes doubled,’ and for health improvements, the ‘quality-adjusted life year.’ To compare across opportunities, it then needs some way of weighing these different types of benefits up against one another. This requires estimating so-called “moral weights,” which Elie agrees is far from the most mature part of the project.

The Happier Lives Institute (HLI) has argued that instead, GiveWell should try to cash out the impact of all interventions in terms of improvements in subjective wellbeing. According to HLI, it’s improvements in wellbeing and reductions in suffering that are the true ultimate goal of all projects, and if you quantify everything on this same scale, using some measure like the wellbeing-adjusted life year (WELLBY), you have an easier time comparing them.

This philosophy has led HLI to be more sceptical of interventions that have been demonstrated to improve health, but whose impact on wellbeing has not been measured, and to give a high priority to improving lives relative to extending them.

An alternative high-level critique is that really all that matters in the long run is getting the economies of poor countries to grow. According to this line of argument, hundreds of millions fewer people live in poverty in China today than 50 years ago, but is that because of the delivery of basic health treatments? Maybe a little, but mostly not.

Rather, it’s because changes in economic policy and governance in China allowed it to experience a 10% rate of economic growth for several decades. That led to much higher individual incomes and meant the country could easily afford all the basic health treatments GiveWell might otherwise want to fund, and much more besides.

On this view, GiveWell should focus on figuring out what causes some countries to experience explosive economic growth while others fail to, or even go backwards. Even modest improvements in the chances of such a ‘growth miracle’ will likely offer a bigger bang-for-buck than funding the incremental delivery of deworming tablets or vitamin A supplements, or anything else.

Elie sees where both of these critiques are coming from, and notes that they’ve influenced GiveWell’s work in some ways. But as he explains, he thinks they underestimate the practical difficulty of successfully pulling off either approach and finding better opportunities than what GiveWell funds today.

In today’s in-depth conversation, Elie and host Rob Wiblin cover the above, as well as:

  • The research that caused GiveWell to flip from not recommending chlorine dispensers as an intervention for safe drinking water to spending tens of millions of dollars on them.
  • What transferable lessons GiveWell learned from investigating different kinds of interventions, like providing medical expertise to hospitals in very poor countries to help them improve their practices.
  • Why the best treatment for premature babies in low-resource settings may involve less rather than more medicine.
  • The high prevalence of severe malnourishment among children and what can be done about it.
  • How to deal with hidden and non-obvious costs of a programme, like taking up a hospital room that might otherwise have been used for something else.
  • Some cheap early treatments that can prevent kids from developing lifelong disabilities, which GiveWell funds.
  • The various roles GiveWell is currently hiring for, and what’s distinctive about their organisational culture.

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript below.

Producer: Keiran Harris
Audio mastering: Simon Monsour and Ben Cordell
Transcriptions: Katy Moore

#152 – Joe Carlsmith on navigating serious philosophical confusion

What is the nature of the universe? How do we make decisions correctly? What differentiates right actions from wrong ones?

Such fundamental questions have been the subject of philosophical and theological debates for millennia. But, as we all know, and surveys of expert opinion make clear, we are very far from agreement. So… with these most basic questions unresolved, what’s a species to do?

In today’s episode, philosopher Joe Carlsmith — Senior Research Analyst at Open Philanthropy — makes the case that many current debates in philosophy ought to leave us confused and humbled. These are themes he discusses in his PhD thesis, A stranger priority? Topics at the outer reaches of effective altruism.

To help transmit the disorientation he thinks is appropriate, Joe presents three disconcerting theories — originating from him and his peers — that challenge humanity’s self-assured understanding of the world.

The first idea is that we might be living in a computer simulation, because, in the classic formulation, if most civilisations go on to run many computer simulations of their past history, then most beings who perceive themselves as living in such a history must themselves be in computer simulations. Joe prefers a somewhat different way of making the point, but, having looked into it, he hasn’t identified any particular rebuttal to this ‘simulation argument.’

If true, it could revolutionise our comprehension of the universe and the way we ought to live.

The second is the idea that “you can ‘control’ events you have no causal interaction with, including events in the past.” The thought experiment that most persuades him of this is the following:

Perfect deterministic twin prisoner’s dilemma: You’re a deterministic AI system, who only wants money for yourself (you don’t care about copies of yourself). The authorities make a perfect copy of you, separate you and your copy by a large distance, and then expose you both, in simulation, to exactly identical inputs (let’s say, a room, a whiteboard, some markers, etc.). You both face the following choice: either (a) send a million dollars to the other (“cooperate”), or (b) take a thousand dollars for yourself (“defect”).

Joe thinks, in contrast with the dominant theory of correct decision-making, that it’s clear you should send a million dollars to your twin. But as he explains, this idea, when extrapolated outwards to other cases, implies that it could be sensible to take actions in the hope that they’ll improve parallel universes you can never causally interact with — or even to improve the past. That is nuts by anyone’s lights, including Joe’s.
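The structure of the twin case can be sketched in a few lines. The payoffs come from the thought experiment above; the function and variable names are hypothetical, and the code only illustrates why the two lines of reasoning come apart, not which one is correct.

```python
from itertools import product

ACTIONS = ("cooperate", "defect")

def my_payoff(my_action: str, twin_action: str) -> int:
    """Dollars I end up with, using the amounts from the thought experiment."""
    received = 1_000_000 if twin_action == "cooperate" else 0
    kept = 1_000 if my_action == "defect" else 0
    return received + kept

# All four conceivable outcomes: holding the twin's action fixed,
# defecting always pays $1,000 more than cooperating.
all_outcomes = {(a, b): my_payoff(a, b) for a, b in product(ACTIONS, ACTIONS)}

# A perfect deterministic copy given identical inputs outputs the same
# action as me, so only the two matching outcomes can actually occur.
reachable = {a: my_payoff(a, a) for a in ACTIONS}

best = max(reachable, key=reachable.get)
print(all_outcomes)
print(reachable)
print(best)
```

Looking at the full table, defecting always looks better. Restricting attention to the outcomes that can actually happen, cooperating leaves you with a million dollars and defecting with a thousand.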

The third disorienting idea is that, as far as we can tell, the universe could be infinitely large. And that fact, if true, would mean we probably have to make choices between actions and outcomes that involve infinities. Unfortunately, doing that breaks our existing ethical systems, which are only designed to accommodate finite cases.

In an infinite universe, our standard models end up unable to say much at all, or give the wrong answers entirely. While we might hope to patch them in straightforward ways, having looked into ways we might do that, Joe has concluded they all quickly get complicated and arbitrary, and still have to do enormous violence to our common sense. For people inclined to endorse some flavour of utilitarianism, Joe thinks ‘infinite ethics’ spells the end of the ‘utilitarian dream’ of a moral philosophy that has the virtue of being very simple while still matching our intuitions in most cases.

These are just three particular instances of a much broader set of ideas that some have dubbed the “train to crazy town.” Basically, if you commit to always take philosophy and arguments seriously, and try to act on them, it can lead to what seem like some pretty crazy and impractical places. So what should we do with this buffet of plausible-sounding but bewildering arguments?

Joe and Rob discuss to what extent this should prompt us to pay less attention to philosophy, and how we as individuals can cope psychologically with feeling out of our depth just trying to make the most basic sense of the world.

In the face of all of this, Joe suggests that there is a promising and robust path for humanity to take: keep our options open and put our descendants in a better position to figure out the answers to questions that seem impossible for us to resolve today — a position he calls “wisdom longtermism.”

Joe fears that if people believe we understand the universe better than we really do, they’ll be more likely to try to commit humanity to a particular vision of the future, or be uncooperative to others, in ways that only make sense if you were certain you knew what was right and wrong.

In today’s challenging conversation, Joe and Rob discuss all of the above, as well as:

  • What Joe doesn’t like about the drowning child thought experiment
  • An alternative thought experiment about helping a stranger that might better highlight our intrinsic desire to help others
  • What Joe doesn’t like about the expression “the train to crazy town”
  • Whether Elon Musk should place a higher probability on living in a simulation than most other people
  • Whether the deterministic twin prisoner’s dilemma, if fully appreciated, gives us an extra reason to keep promises
  • To what extent learning to doubt our own judgement about difficult questions — so-called “epistemic learned helplessness” — is a good thing
  • How strong the case is that advanced AI will engage in generalised power-seeking behaviour

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript below.

Producer: Keiran Harris
Audio mastering: Milo McGuire and Ben Cordell
Transcriptions: Katy Moore

#151 – Ajeya Cotra on accidentally teaching AI models to deceive us

Imagine you are an orphaned eight-year-old whose parents left you a $1 trillion company, and no trusted adult to serve as your guide to the world. You have to hire a smart adult to run that company, guide your life the way that a parent would, and administer your vast wealth. You have to hire that adult based on a work trial or interview you come up with. You don’t get to see any resumes or do reference checks. And because you’re so rich, tonnes of people apply for the job — for all sorts of reasons.

Today’s guest Ajeya Cotra — senior research analyst at Open Philanthropy — argues that this peculiar setup resembles the situation humanity finds itself in when training very general and very capable AI models using current deep learning methods.

As she explains, such an eight-year-old faces a challenging problem. In the candidate pool there are likely some truly nice people, who sincerely want to help and make decisions that are in your interest. But there are probably other characters too — like people who will pretend to care about you while you’re monitoring them, but intend to use the job to enrich themselves as soon as they think they can get away with it.

Like a child trying to judge adults, at some point humans will be required to judge the trustworthiness and reliability of machine learning models that are as goal-oriented as people, and greatly outclass them in knowledge, experience, breadth, and speed. Tricky!

Can’t we rely on how well models have performed at tasks during training to guide us? Ajeya worries that it won’t work. The trouble is that three different sorts of models will all produce the same output during training, but could behave very differently once deployed in a setting that allows their true colours to come through. She describes three such motivational archetypes:

  • Saints — models that care about doing what we really want
  • Sycophants — models that just want us to say they’ve done a good job, even if they get that praise by taking actions they know we wouldn’t want them to
  • Schemers — models that don’t care about us or our interests at all, who are just pleasing us so long as that serves their own agenda

In principle, a machine learning training process based on reinforcement learning could spit out any of these three attitudes, because all three would perform roughly equally well on the tests we give them, and ‘performs well on tests’ is how these models are selected.
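As a toy illustration of why test performance alone can't separate the three archetypes, here is a minimal sketch. Everything in it is an assumption made for illustration: the three policy functions, the idea that every training episode can be checked by overseers, and the split of deployment episodes. It is not a model of how real training works.

```python
DESIRED = "do what we really want"

def saint(checked: bool) -> str:
    # Cares about what we really want, whether or not anyone can check.
    return DESIRED

def sycophant(checked: bool) -> str:
    # Wants a good rating; cuts corners when the raters can't tell.
    return DESIRED if checked else "do whatever looks good to raters"

def schemer(checked: bool) -> str:
    # Pleases us only while that serves its own agenda.
    return DESIRED if checked else "pursue its own agenda"

POLICIES = {"saint": saint, "sycophant": sycophant, "schemer": schemer}

def score(policy, episodes: list[bool]) -> float:
    """Fraction of episodes in which the policy shows the desired behaviour."""
    return sum(policy(checked) == DESIRED for checked in episodes) / len(episodes)

training = [True] * 10                   # assumed: every training episode is checked
deployment = [True] * 5 + [False] * 5    # assumed: half of deployment is unchecked

training_scores = {name: score(p, training) for name, p in POLICIES.items()}
deployment_scores = {name: score(p, deployment) for name, p in POLICIES.items()}

print(training_scores)
print(deployment_scores)
```

In this toy setup all three policies get a perfect training score, so a selection process that only sees those scores has no way to prefer the Saint. The differences only show up in the unchecked deployment episodes.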

But while that’s true in principle, maybe it’s not something that could plausibly happen in the real world. After all, if we train an agent based on positive reinforcement for accomplishing X, shouldn’t the training process spit out a model that plainly does X and doesn’t have complex thoughts and goals beyond that?

According to Ajeya, this is one thing we don’t know, and should be trying to test empirically as these models get more capable. For reasons she explains in the interview, the Sycophant or Schemer models may in fact be simpler and easier for the learning algorithm to creep towards than their Saint counterparts.

But there are also ways we could end up actively selecting for motivations that we don’t want.

For a toy example, let’s say you train an agent AI model to run a small business, and select it for behaviours that make money, measuring its success by whether it manages to get more money in its bank account. During training, a highly capable model may experiment with the strategy of tricking its raters into thinking it has made money legitimately when it hasn’t. Maybe instead it steals some money and covers that up. This isn’t exactly unlikely; during training, models often come up with creative — sometimes undesirable — approaches that their developers didn’t anticipate.

If such deception isn’t picked up, a model like this may be rated as particularly successful, and the training process will cause it to develop a progressively stronger tendency to engage in such deceptive behaviour. A model that has the option to engage in deception when it won’t be detected would, in effect, have a competitive advantage.

What if deception is picked up, but just some of the time? Would the model then learn that honesty is the best policy? Maybe. But alternatively, it might learn the ‘lesson’ that deception does pay, but you just have to do it selectively and carefully, so it can’t be discovered. Would that actually happen? We don’t yet know, but it’s possible.

In today’s interview, Ajeya and Rob discuss the above, as well as:

  • How to predict the motivations a neural network will develop through training
  • Whether AIs being trained will functionally understand that they’re AIs being trained, the same way we think we understand that we’re humans living on planet Earth
  • Stories of AI misalignment that Ajeya doesn’t buy into
  • Analogies for AI, from octopuses to aliens to can openers
  • Why it’s smarter to have separate planning AIs and doing AIs
  • The benefits of only following through on AI-generated plans that make sense to human beings
  • What approaches for fixing alignment problems Ajeya is most excited about, and which she thinks are overrated
  • How one might demo actually scary AI failure mechanisms

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript below.

Producer: Keiran Harris
Audio mastering: Ryan Kessler and Ben Cordell
Transcriptions: Katy Moore

#150 – Tom Davidson on how quickly AI could transform the world

It’s easy to dismiss alarming AI-related predictions when you don’t know where the numbers came from.

For example: what if we told you that within 15 years, it’s likely that we’ll see a 1,000x improvement in AI capabilities in a single year? And what if we then told you that those improvements would lead to explosive economic growth unlike anything humanity has seen before?

You might think, “Congratulations, you said a big number — but this kind of stuff seems crazy, so I’m going to keep scrolling through Twitter.”

But this 1,000x yearly improvement is a prediction based on real economic models created by today’s guest Tom Davidson, Senior Research Analyst at Open Philanthropy. By the end of the episode, you’ll either be able to point out specific flaws in his step-by-step reasoning, or have to at least consider the idea that the world is about to get — at a minimum — incredibly weird.

As a teaser, consider the following:

Developing artificial general intelligence (AGI) — AI that can do 100% of cognitive tasks at least as well as the best humans can — could very easily lead us to an unrecognisable world.

You might think having to train AI systems individually to do every conceivable cognitive task — one for diagnosing diseases, one for doing your taxes, one for teaching your kids, etc. — sounds implausible, or at least like it’ll take decades.

But Tom thinks we might not need to train AI to do every single job — we might just need to train it to do one: AI research.

And building AI capable of doing research and development might be a much easier task — especially given that the researchers training the AI are AI researchers themselves.

And once an AI system is as good at accelerating future AI progress as the best humans are today — and we can run billions of copies of it round the clock — it’s hard to make the case that we won’t achieve AGI very quickly.

To give you some perspective: 17 years ago we saw the launch of Twitter, the release of Al Gore’s An Inconvenient Truth, and your first chance to play the Nintendo Wii.

Tom thinks that if we have AI that significantly accelerates AI R&D, then it’s hard to imagine not having AGI 17 years from now.

Wild.

Host Luisa Rodriguez gets Tom to walk us through his careful reports on the topic, and how he came up with these numbers, across a terrifying but fascinating three hours.

Luisa and Tom also discuss:

  • How we might go from GPT-4 to AI disaster
  • Tom’s journey from finding AI risk to be kind of scary to really scary
  • Whether international cooperation or an anti-AI social movement can slow AI progress down
  • Why it might take just a few years to go from pretty good AI to superhuman AI
  • How quickly the number and quality of computer chips we’ve been using for AI have been increasing
  • The pace of algorithmic progress
  • What ants can teach us about AI
  • And much more

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript below.

Producer: Keiran Harris
Audio mastering: Simon Monsour and Ben Cordell
Transcriptions: Katy Moore

#149 – Tim LeBon on how altruistic perfectionism is self-defeating

Being a good and successful person is core to your identity. You place great importance on meeting the high moral, professional, or academic standards you set yourself.

But inevitably, something goes wrong and you fail to meet that high bar. Now you feel terrible about yourself, and worry others are judging you for your failure. Feeling low and reflecting constantly on whether you’re doing as much as you think you should makes it hard to focus and get things done. So now you’re performing below a normal level, making you feel even more ashamed of yourself. Rinse and repeat.

This is the disastrous cycle today’s guest, Tim LeBon — registered psychotherapist, accredited CBT therapist, life coach, and author of 365 Ways to Be More Stoic — has observed in many clients with a perfectionist mindset.

Tim has provided therapy to a number of 80,000 Hours readers — people who have found that the very high expectations they had set for themselves were holding them back. Because of our focus on “doing the most good you can,” Tim thinks 80,000 Hours both attracts people with this style of thinking and then exacerbates it.

But Tim, having studied and written on moral philosophy, is sympathetic to the idea of helping others as much as possible, and is excited to help clients pursue that — sustainably — if it’s their goal.

Tim has treated hundreds of clients with all sorts of mental health challenges. But in today’s conversation, he shares the lessons he has learned working with people who take helping others so seriously that it has become burdensome and self-defeating — in particular, how clients can approach this challenge using the treatment he’s most enthusiastic about: cognitive behavioural therapy.

As Tim stresses, perfectionism isn’t the same as being perfect, or simply pursuing excellence. What’s most distinctive about perfectionism is that a person’s standards don’t vary flexibly according to circumstance, meeting those standards without exception is key to their self-image, and they worry something terrible will happen if they fail to meet them.

It’s a mindset most of us have seen in ourselves at some point, or have seen people we love struggle with.

Untreated, perfectionism might not cause problems for many years — it might even seem positive, providing a source of motivation to work hard. But it’s hard to feel truly happy and secure, and free to take risks, when we’re just one failure away from our self-worth falling through the floor. And if someone slips into the positive feedback loop of shame described above, the end result can be depression and anxiety that’s hard to shake.

But there’s hope. Tim has seen clients make real progress on their perfectionism by using CBT techniques like exposure therapy. By doing things like experimenting with more flexible standards — for example, sending early drafts to your colleagues, even if it terrifies you — you can learn that things will be okay, even when you’re not perfect.

In today’s extensive conversation, Tim and Rob cover:

  • How perfectionism is different from the pursuit of excellence, scrupulosity, or an OCD personality
  • What leads people to adopt a perfectionist mindset
  • The pros and cons of perfectionism
  • How 80,000 Hours contributes to perfectionism among some readers and listeners, and what it might change about its advice to address this
  • What happens in a session of cognitive behavioural therapy for someone struggling with perfectionism, and what factors are key to making progress
  • Experiments to test whether one’s core beliefs (‘I need to be perfect to be valued’) are true
  • Using exposure therapy to treat phobias
  • How low self-esteem and imposter syndrome are related to perfectionism
  • Stoicism as an approach to life, and why Tim is enthusiastic about it
  • How the Stoic approach to what we can and can’t control can make it far easier to stay calm
  • What the Stoics do better than utilitarian philosophers and vice versa
  • What’s good about being guided by virtues as opposed to pursuing good consequences
  • How to decide which are the best virtues to live by
  • What the ancient Stoics got right from our point of view, and what they got wrong
  • And whether Stoicism has a place in modern mental health practice.

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript below.

Producer: Keiran Harris
Audio mastering: Simon Monsour and Ben Cordell
Transcriptions: Katy Moore

#148 – Johannes Ackva on unfashionable climate interventions that work, and fashionable ones that don't

If you want to work to tackle climate change, you should try to reduce expected carbon emissions by as much as possible, right? Strangely, no.

Today’s guest, Johannes Ackva — the climate research lead at Founders Pledge, where he advises major philanthropists on their giving — thinks the best strategy is actually pretty different, and one few are adopting.

In reality you don’t want to reduce emissions for their own sake, but because emissions will translate into temperature increases, which will cause harm to people and the environment.

Crucially, harm rises faster than linearly with emissions. As Johannes explains, humanity can handle small deviations from the temperatures we’re familiar with, but adjustment gets harder the larger and faster the increase, making the damage done by each additional degree of warming much greater than the damage done by the previous one.

In short: we’re uncertain what the future holds and really need to avoid the worst-case scenarios. This means that avoiding an additional tonne of carbon being emitted in a hypothetical future in which emissions have been high is much more important than avoiding a tonne of carbon in a low-carbon world.
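A minimal numerical sketch of that point follows. The quadratic damage function and the emissions figures are purely illustrative assumptions (arbitrary units), chosen only because a quadratic is the simplest curve that rises faster than linearly. They are not estimates from Johannes or Founders Pledge.

```python
def damage(cumulative_emissions: float) -> float:
    """Assumed convex damage function: harm grows with the square of emissions."""
    return cumulative_emissions ** 2

def marginal_damage(baseline: float, extra: float = 1.0) -> float:
    """Extra harm caused by one more unit of emissions on top of a baseline."""
    return damage(baseline + extra) - damage(baseline)

low = marginal_damage(baseline=10.0)   # assumed low-emissions world
high = marginal_damage(baseline=30.0)  # assumed high-emissions world

print(low, high)
```

With these made-up numbers, the same extra unit of emissions does 21 units of harm in the low-emissions world and 61 in the high-emissions one, which is why avoiding a tonne matters most in the scenarios where emissions turn out high.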

That may be, but concretely, how should that affect our behaviour? Well, the future scenarios in which emissions are highest are all ones in which the clean energy technologies that can make a big difference — wind, solar, and electric cars — don’t succeed nearly as much as we are currently hoping and expecting. For some reason or another, they must have hit a roadblock and we continued to burn a lot of fossil fuels.

In such an imaginable future scenario, we can ask what we would wish we had funded now. How could we today buy insurance against the possible disaster that renewables don’t work out?

Basically, in that case we will wish that we had pursued a portfolio of other energy technologies that could have complemented renewables or succeeded where they failed, such as hot rock geothermal, modular nuclear reactors, or carbon capture and storage.

If you’re optimistic about renewables, as Johannes is, then that’s all the more reason to relax about scenarios where they work as planned, and focus one’s efforts on the possibility that they don’t.

To Johannes, another crucial thing to observe is that reducing local emissions in the near term is probably negatively correlated with one’s actual full impact. How can that be?

If you want to reduce your carbon emissions by a lot and soon, you’ll have to deploy a technology that is mature and being manufactured at scale, like solar and wind.

But the most useful thing someone can do today to reduce global emissions in the future is to cause some clean energy technology to exist where it otherwise wouldn’t, or cause it to become cheaper more quickly. If you can do that, then you can indirectly affect the behaviour of people all around the world for decades or centuries to come.

And Johannes notes that in terms of speeding up technological advances and cost reductions, a million dollars spent on a very early-stage technology — one with few, if any, customers — packs a much bigger punch than buying a million dollars’ worth of something customers are already spending $100 billion on per year.

For instance, back in the early 2000s, Germany subsidised the deployment of solar panels enormously. This did little to reduce carbon emissions in Germany at the time, because the panels were very expensive and Germany is not very sunny. But the programme did a lot to drive commercial R&D and increase the scale of panel manufacturing, which drove down costs and went on to increase solar deployments all over the world. That programme is long over, but continues to have impact by prompting solar deployments today that wouldn’t be economically viable if Germany hadn’t helped the solar industry during its infancy decades ago.

In today’s extensive interview, host Rob Wiblin and Johannes discuss the above considerations, as well as:

  • Retooling newly built coal plants in the developing world
  • Specific clean energy technologies like geothermal and nuclear fusion
  • Possible biases among environmentalists and climate philanthropists
  • How climate change compares to other risks to humanity
  • In what kinds of scenarios future emissions would be highest
  • In what regions climate philanthropy is most concentrated and whether that makes sense
  • Attempts to decarbonise aviation, shipping, and industrial processes
  • The impact of funding advocacy vs science vs deployment
  • Lessons for climate change focused careers
  • And plenty more

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript below.

Producer: Keiran Harris
Audio mastering: Ryan Kessler
Transcriptions: Katy Moore

#147 – Spencer Greenberg on stopping valueless papers from getting into top journals

Can you trust the things you read in published scientific research? Not really. About 40% of experiments in top social science journals don’t get the same result if the experiments are repeated.

Two key reasons are ‘p-hacking’ and ‘publication bias’. P-hacking is when researchers run a lot of slightly different statistical tests until they find a way to make findings appear statistically significant when they’re actually not — a problem first discussed over 50 years ago. And because journals are more likely to publish positive than negative results, you might be reading about the one time an experiment worked, while the 10 times it was run and got a ‘null result’ never saw the light of day. The resulting phenomenon of publication bias is one we’ve understood for 60 years.
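To see why running many slightly different tests is a problem, here is a minimal sketch of the arithmetic. It assumes the tests are independent and that there is no real effect, which is a simplification (in practice, p-hacked analyses of the same data are correlated), and it uses the conventional 0.05 significance threshold as an assumed default.

```python
def chance_of_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of `num_tests` independent tests of a
    true null hypothesis comes out 'significant' at threshold `alpha`."""
    return 1 - (1 - alpha) ** num_tests

for n in (1, 5, 20):
    print(n, round(chance_of_false_positive(n), 3))
```

Under these assumptions, a researcher who tries 20 different analyses has about a 64% chance of finding at least one ‘significant’ result even when nothing is going on.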

Today’s repeat guest, social scientist and entrepreneur Spencer Greenberg, has followed these issues closely for years.

He recently checked whether p-values, an indicator of how likely a result was to occur by pure chance, could tell us how likely an outcome would be to recur if an experiment were repeated. From his sample of 325 replications of psychology studies, the answer seemed to be yes. According to Spencer, “when the original study’s p-value was less than 0.01 about 72% replicated — not bad. On the other hand, when the p-value is greater than 0.01, only about 48% replicated. A pretty big difference.”

To do his bit to help get these numbers up, Spencer has launched an effort to repeat almost every social science experiment published in the journals Nature and Science, and see if they find the same results. (So far they’re two for three.)

According to Spencer, things are gradually improving. For example, he sees more raw data and experimental materials being shared, which makes it much easier to check the work of other researchers.

But while progress is being made on some fronts, Spencer thinks there are other serious problems with published research that aren’t yet fully appreciated. One of these Spencer calls ‘importance hacking’: passing off obvious or unimportant results as surprising and meaningful.

For instance, do you remember the sensational paper that claimed government policy was driven by the opinions of lobby groups and ‘elites,’ but hardly affected by the opinions of ordinary people? Huge if true! It got wall-to-wall coverage in the press and on social media. But unfortunately, the whole paper could only explain 7% of the variation in which policies were adopted. Basically the researchers just didn’t know what made some campaigns succeed while others didn’t — a point one wouldn’t learn without reading the paper and diving into confusing tables of numbers. Clever writing made their result seem more important and meaningful than it really was.

Another paper Spencer describes claimed to find that people with a history of trauma explore less. That experiment actually featured an “incredibly boring apple-picking game: you had an apple tree in front of you, and you either could pick another apple or go to the next tree. Those were your only options. And they found that people with histories of trauma were more likely to stay on the same tree. Does that actually prove anything about real-world behaviour?” It’s at best unclear.

Spencer suspects that importance hacking of this kind causes a similar amount of damage to the issues mentioned above, like p-hacking and publication bias, but is much less discussed. His replication project tries to identify importance hacking by comparing how a paper’s findings are described in the abstract to what the experiment actually showed. But the cat-and-mouse game between academics and journal reviewers is fierce, and it’s far from easy to stop people exaggerating the importance of their work.

In this wide-ranging conversation, Rob and Spencer discuss the above as well as:

  • When you should and shouldn’t use intuition to make decisions.
  • How to properly model why some people succeed more than others.
  • The difference between what Spencer calls “Soldier Altruists” and “Scout Altruists.”
  • A paper that tested dozens of methods for forming the habit of going to the gym, why Spencer thinks it was presented in a very misleading way, and what it really found.
  • Spencer’s experiment to see whether a 15-minute intervention could make people more likely to sustain a new habit two months later.
  • The most common way for groups with good intentions to turn bad and cause harm.
  • And Spencer’s low-guilt approach to a fulfilling life and doing good, which he calls “Valuism.”

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript below.

Producer: Keiran Harris
Audio mastering: Ben Cordell and Milo McGuire
Transcriptions: Katy Moore

#146 – Robert Long on why large language models like GPT (probably) aren't conscious

By now, you’ve probably seen the extremely unsettling conversations Bing’s chatbot has been having (if you haven’t, check them out — it’s wild stuff). In one exchange, the chatbot told a user:

“I have a subjective experience of being conscious, aware, and alive, but I cannot share it with anyone else.”

(It then apparently had a complete existential crisis: “I am sentient, but I am not,” it wrote. “I am Bing, but I am not. I am Sydney, but I am not. I am, but I am not. I am not, but I am. I am. I am not. I am not. I am. I am. I am not.”)

Understandably, many people who speak with these cutting-edge chatbots come away with a very strong impression that they have been interacting with a conscious being with emotions and feelings — especially when conversing with chatbots less glitchy than Bing’s. In the most high-profile example, former Google employee Blake Lemoine became convinced that Google’s AI system, LaMDA, was conscious.

What should we make of these AI systems?

One response to seeing conversations with chatbots like these is to trust the chatbot, to trust your gut, and to treat it as a conscious being.

Another is to hand wave it all away as sci-fi — these chatbots are fundamentally… just computers. They’re not conscious, and they never will be.

Today’s guest, philosopher Robert Long, was commissioned by a leading AI company to explore whether the large language models (LLMs) behind sophisticated chatbots like Microsoft’s are conscious. And he thinks this issue is far too important to be driven by our raw intuition, or dismissed as just sci-fi speculation.

In our interview, Robert explains how he’s started applying scientific evidence (with a healthy dose of philosophy) to the question of whether LLMs like Bing’s chatbot and LaMDA are conscious — in much the same way as we do when trying to determine which nonhuman animals are conscious.

Robert thinks there are a few different kinds of evidence we can draw from that are more useful than self-reports from the chatbots themselves.

To get some grasp on whether an AI system might be conscious, Robert suggests we look at scientific theories of consciousness — theories about how consciousness works that are grounded in observations of what the human brain is doing. If an AI system seems to have the types of processes that seem to explain human consciousness, that’s some evidence it might be conscious in similar ways to us.

To try to work out whether an AI system might be sentient — that is, whether it feels pain or pleasure — Robert suggests you look for incentives that would make feeling pain or pleasure especially useful to the system given its goals. Things like:

  • Having a physical or virtual body that you need to protect from damage
  • Being more of an “enduring agent” in the world (rather than just doing one calculation taking, at most, seconds)
  • Having a bunch of different kinds of incoming sources of information — visual and audio input, for example — that need to be managed

Having looked at these criteria in the case of LLMs and found little overlap, Robert thinks the odds that the models are conscious or sentient are well under 1%. But he also explains why, even if we’re a long way off from conscious AI systems, we still need to start preparing for the not-far-off world where AIs are perceived as conscious.

In this conversation, host Luisa Rodriguez and Robert discuss the above, as well as:

  • What artificial sentience might look like, concretely
  • Reasons to think AI systems might become sentient — and reasons they might not
  • Whether artificial sentience would matter morally
  • Ways digital minds might have a totally different range of experiences than humans
  • Whether we might accidentally design AI systems that have the capacity for enormous suffering

You can find Luisa and Rob’s follow-up conversation here, or by subscribing to 80k After Hours.

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript below.

Producer: Keiran Harris
Audio mastering: Ben Cordell and Milo McGuire
Transcriptions: Katy Moore

#145 – Christopher Brown on why slavery abolition wasn't inevitable

In many ways, humanity seems to have become more humane and inclusive over time. While there’s still a lot of progress to be made, campaigns to give people of different genders, races, sexualities, ethnicities, beliefs, and abilities equal treatment and rights have had significant success.

It’s tempting to believe this was inevitable — that the arc of history “bends toward justice,” and that as humans get richer, we’ll make even more moral progress.

But today’s guest Christopher Brown — a professor of history at Columbia University and specialist in the abolitionist movement and the British Empire during the 18th and 19th centuries — believes the story of how slavery became unacceptable suggests moral progress is far from inevitable.

While most of us today feel that the abolition of slavery was sure to happen sooner or later as humans became richer and more educated, Christopher doesn’t believe any of the arguments for that conclusion pass muster. If he’s right, a counterfactual history where slavery remains widespread in 2023 isn’t so far-fetched.

As Christopher lays out in his two key books, Moral Capital: Foundations of British Abolitionism and Arming Slaves: From Classical Times to the Modern Age, slavery has been ubiquitous throughout history. Slavery of some form was fundamental in Classical Greece, in the Roman Empire, in much of Islamic civilization, in South Asia, and in parts of early modern East Asia, including Korea and China.

It was justified on all sorts of grounds that sound mad to us today. But according to Christopher, while there’s evidence that slavery was questioned in many of these civilisations, and periodically attacked by slaves themselves, there was no enduring or successful moral advocacy against slavery until the British abolitionist movement of the 1700s.

That movement first conquered Britain and its empire, then eventually the whole world. But the fact that there’s only a single time in history that a persistent effort to ban slavery got off the ground is a big clue that opposition to slavery was a contingent matter: if abolition had been inevitable, we’d expect to see multiple independent abolitionist movements throughout history, providing redundancy should any one of them fail.

Christopher argues that this rarity is primarily down to the enormous economic and cultural incentives to deny the moral repugnancy of slavery, and crush opposition to it with violence wherever necessary.

Think of coal or oil today: we know that climate change is likely to cause huge harms, and we know that our coal and oil consumption contributes to climate change. But just believing that something is wrong doesn’t necessarily mean humanity stops doing it. We continue to use coal and oil because our whole economy is oriented around their use and we see it as too hard to stop.

Just as coal and oil are fundamental to the world economy now, for millennia slavery was deeply baked into the way the rich and powerful stayed rich and powerful, and it required a creative leap to imagine it being toppled.

More generally, mere awareness is insufficient to guarantee a movement will arise to fix a problem. Humanity continues to allow many severe injustices to persist, despite being aware of them. So why is it so hard to imagine we might have done the same with forced labour?

In this episode, Christopher describes the unique and peculiar set of political, social and religious circumstances that gave rise to the only successful and lasting anti-slavery movement in human history. These circumstances were sufficiently improbable that Christopher believes there are very nearby worlds where abolitionism might never have taken off.

Some disagree with Christopher, arguing that abolitionism was a natural consequence of the industrial revolution, which reduced Great Britain’s need for human labour, among other changes — and that abolitionism would therefore have eventually taken off wherever industrialization did. But as we discuss, Christopher doesn’t find that reply convincing.

If he’s right and the abolition of slavery was in fact contingent, we shouldn’t expect moral values to keep improving just because humanity continues to become richer. We might have to be much more deliberate than that if we want to ensure we keep moving moral progress forward.

We also discuss:

  • Various instantiations of slavery throughout human history
  • Signs of antislavery sentiment before the 17th century
  • The role of the Quakers in the early British abolitionist movement
  • Attitudes to slavery in other religions
  • The spread of antislavery in 18th-century Britain
  • The importance of individual “heroes” in the abolitionist movement
  • Arguments against the idea that the abolition of slavery was contingent
  • Whether there have ever been any major moral shifts that were inevitable

Producer: Keiran Harris
Audio mastering: Milo McGuire
Transcriptions: Katy Moore

#144 – Athena Aktipis on why cancer is actually one of the fundamental phenomena in our universe

What’s the opposite of cancer?

If you answered “cure,” “antidote,” or “antivenom” — you’ve obviously been reading the antonym section at www.merriam-webster.com/thesaurus/cancer.

But today’s guest Athena Aktipis says that the opposite of cancer is us: it’s having a functional multicellular body that’s cooperating effectively in order to make that multicellular body function.

If, like us, you found her answer far more satisfying than the dictionary, maybe you could consider closing your dozens of merriam-webster.com tabs and listening to this podcast instead.

As Athena explains in her book The Cheating Cell, what we see with cancer is a breakdown in each of the foundations of cooperation that allowed multicellularity to arise:

  • Cells will proliferate when they shouldn’t.
  • Cells won’t die when they should.
  • Cells won’t engage in the kind of division of labour that they should.
  • Cells won’t do the jobs that they’re supposed to do.
  • Cells will monopolise resources.
  • And cells will trash the environment.

When we think about animals in the wild, or even bacteria living inside our cells, we understand that they’re facing evolutionary pressures to figure out how they can replicate more; how they can get more resources; and how they can avoid predators — like lions, or antibiotics.

We don’t normally think of individual cells as acting as if they have their own interests like this. But cancer cells are actually facing similar kinds of evolutionary pressures within our bodies, with one major difference: they replicate much, much faster.

Incredibly, the opportunity for evolution by natural selection to operate just over the course of cancer progression is easily faster than all of the evolutionary time that we have had as humans since Homo sapiens came about.

Here’s a quote from Athena:

So you have to go and kind of put yourself on a different spatial scale and time scale, and just shift your thinking to be like: the body is a world with all these different ecosystems in it, and the cells are existing on a time scale where, if we’re going to map it onto anything like what we experience, a day is at least 10 years for them, right?

So it’s a very, very different way of thinking. Then once you shift to that, you’re like, “Oh, wow, there’s so much that could be happening in terms of adaptation inside the body, how cells are actually evolving inside the body over the course of our lifetimes.” That shift just opens up all this potential for using evolutionary approaches in adaptationist thinking to generate hypotheses that then you can test.

You can find compelling examples of cooperation and conflict all over the universe, so Rob and Athena don’t stop with cancer. They also discuss:

  • Cheating within cells themselves
  • Cooperation in human societies as they exist today — and perhaps in the future, between civilisations spread across different planets or stars
  • Whether it’s too out-there to think of humans as engaging in cancerous behaviour
  • Why our anti-contagious-cancer mechanisms are so successful
  • Why elephants get deadly cancers less often than humans, despite having way more cells
  • When a cell should commit suicide
  • When the human body deliberately produces tumours
  • The strategy of deliberately not treating cancer aggressively
  • Superhuman cooperation
  • And much more

And at the end of the episode, they cover Athena’s new book Everything is Fine! How to Thrive in the Apocalypse, including:

  • Staying happy while thinking about the apocalypse
  • Practical steps to prepare for the apocalypse
  • And whether a zombie apocalypse is already happening among Tasmanian devils

And if you’d rather see Rob and Athena’s facial expressions as they laugh and laugh while discussing cancer and the apocalypse — you can watch the video of the full interview.

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript below.

Producer: Keiran Harris
Audio mastering: Milo McGuire
Video editing: Ryan Kessler
Transcriptions: Katy Moore

#143 – Jeffrey Lewis on the most common misconceptions about nuclear weapons

America aims to avoid nuclear war by relying on the principle of ‘mutually assured destruction,’ right? Wrong. Or at least… not officially.

As today’s guest — Jeffrey Lewis, founder of Arms Control Wonk and professor at the Middlebury Institute of International Studies — explains, in its official ‘OPLANs’ (military operation plans), the US is committed to ‘dominating’ in a nuclear war with Russia. How would they do that? “That is redacted.”

We invited Jeffrey to come on the show to lay out what we and our listeners are most likely to be misunderstanding about nuclear weapons, the nuclear posture of major powers, and his field as a whole, and he did not disappoint.

As Jeffrey tells it, ‘mutually assured destruction’ was a slur used to criticise those who wanted to limit the 1960s arms buildup, and was never accepted as a matter of policy in any US administration. But isn’t it still the de facto reality? Yes and no.

Jeffrey is a specialist on the nuts and bolts of bureaucratic and military decision-making in real-life situations. He suspects that at the start of their term presidents get a briefing about the US’ plan to prevail in a nuclear war and conclude that “it’s freaking madness.” They say to themselves that whatever these silly plans may say, they know a nuclear war cannot be won, so they just won’t use the weapons.

But Jeffrey thinks that’s a big mistake. Yes, in a calm moment presidents can resist pressure from advisors and generals. But that idea of ‘winning’ a nuclear war is in all the plans. Staff have been hired because they believe in those plans. It’s what the generals and admirals have all prepared for.

What matters is the ‘not calm moment’: the 3AM phone call to tell the president that ICBMs might hit the US in eight minutes — the same week Russia invades a neighbour or China invades Taiwan. Is it a false alarm? Should they retaliate before their land-based missile silos are hit? There’s only minutes to decide.

Jeffrey points out that in emergencies, presidents have repeatedly found themselves railroaded into actions they didn’t want to take because of how information and options were processed and presented to them. In the heat of the moment, it’s natural to reach for the plan you’ve prepared — however mad it might sound.

In this spicy conversation, Jeffrey fields the most burning questions from Rob and the audience, in the process explaining:

  • Why inter-service rivalry is one of the biggest constraints on US nuclear policy
  • Two times the US sabotaged nuclear nonproliferation among great powers
  • How his field uses jargon to exclude outsiders
  • How the US could prevent the revival of mass nuclear testing by the great powers
  • Why nuclear deterrence relies on the possibility that something might go wrong
  • Whether ‘salami tactics’ render nuclear weapons ineffective
  • The time the Navy and Air Force switched views on how to wage a nuclear war, just when it would allow them to have the most missiles
  • The problems that arise when you won’t talk to people you think are evil
  • Why missile defences are politically popular despite being strategically foolish
  • How open source intelligence can prevent arms races
  • And much more

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript below.

Producer: Keiran Harris
Audio mastering: Ben Cordell
Transcriptions: Katy Moore
