11 essential readings on AI safety, risk, and alignment

If you’re overwhelmed trying to understand how advanced AI could impact our lives, you’re not alone. There’s a lot of noise out there.

We’ve boiled it down to 11 essential resources for getting up to speed on the risks of AGI.

We’ve chosen these because they represent the most influential ideas shaping the current debate. Though we don’t agree with everything the authors say, we think they’re well worth reading.

AI risk reading list

  1. Preparing for the Intelligence Explosion by William MacAskill and Fin Moorhouse
  2. AI 2027 by Daniel Kokotajlo, Scott Alexander, Thomas Larsen, Eli Lifland, and Romeo Dean
  3. Situational Awareness: The Decade Ahead by Leopold Aschenbrenner
  4. The case for multi-decade AI timelines by Ege Erdil
  5. The Most Important Century by Holden Karnofsky
  6. Does AI Progress Have a Speed Limit? by Ajeya Cotra and Arvind Narayanan
  7. Can AI scaling continue through 2030? by Jaime Sevilla et al.
  8. Is Power-Seeking AI an Existential Risk? by Joe Carlsmith
  9. Gradual Disempowerment by Jan Kulveit, Raymond Douglas, Nora Ammann, Deger Turan, David Krueger, and David Duvenaud
  10. Taking AI Welfare Seriously by Robert Long, Jeff Sebo, et al.
  11. Machines of Loving Grace — How AI Could Transform the World for the Better by Dario Amodei

Summary of each resource

1. Preparing for the Intelligence Explosion by William MacAskill and Fin Moorhouse at Forethought Research (March 2025)

Researchers at Forethought argue that an “intelligence explosion” could compress a century of technological progress into a decade, creating numerous grand challenges that humanity must prepare for now. You can listen to Will MacAskill discuss this piece on our podcast.

2. AI 2027 by Daniel Kokotajlo, Scott Alexander, Thomas Larsen, Eli Lifland, and Romeo Dean (April 2025)

An analysis of a concrete scenario in which AGI arrives soon via the automation of AI research. The AI 2027 team also provides its own forecasts of several key outcomes in the accompanying research. Many people (including us) think it’s unlikely things will unfold this fast, but in any case it has become one of the most discussed pieces of research in the field.

3. Situational Awareness: The Decade Ahead by Leopold Aschenbrenner (June 2024)

Former OpenAI employee Leopold Aschenbrenner makes a compelling case, across five in-depth chapters, that AGI is coming much sooner than many expect and that few realise just how much it will change the world. We think this piece might underplay the challenge of aligning AGI with human interests and the need for international coordination on AI risks. However, its quantitative predictions about AI development have roughly borne out so far.

4. The case for multi-decade AI timelines by Ege Erdil from Epoch AI (April 2025)

Researcher Ege Erdil makes one of the most influential arguments against the idea that we may have AGI by 2030 — including his doubts about the prospect of a rapid ‘intelligence explosion,’ and why he expects current revenue trends in AI development to slow down. Ege and Tamay Besiroglu discussed these ideas on the Dwarkesh podcast.

5. The Most Important Century by Holden Karnofsky and other authors (2021)

Holden Karnofsky’s series from 2021 argues that transformative AI could make the upcoming decades the most important in history. Some of it is now out of date, but it contains several useful articles, including How we could stumble into AI catastrophe, AI could defeat all of us combined, Why AI alignment could be hard with modern deep learning by guest author Ajeya Cotra, and Jobs for helping with the most important century.

6. Does AI Progress Have a Speed Limit? by Ajeya Cotra and Arvind Narayanan (April 2025)

Experts Ajeya Cotra and Arvind Narayanan discuss the factors behind the pace of AI development. They present contrasting views about the likely speed of progress in AI and its societal effects, offering useful insights into the state of the debate.

7. Can AI scaling continue through 2030? by Jaime Sevilla et al. at Epoch AI (August 2024)

Projections by researchers at Epoch suggest that AI companies can continue scaling their systems through 2030, with power availability and chip manufacturing capacity as the main constraints.

8. Is Power-Seeking AI an Existential Risk? by Joe Carlsmith (June 2022)

Joe Carlsmith’s ‘Is Power-Seeking AI an Existential Risk?’ is one of the central papers laying out the argument that extremely capable AI systems could pose an existential threat to humanity. The idea: future AIs might be motivated to disempower humans — and they could become smart enough to succeed.

9. Gradual Disempowerment by Jan Kulveit, Raymond Douglas, Nora Ammann, Deger Turan, David Krueger, and David Duvenaud (January 2025)

‘Gradual Disempowerment’ argues that even if we avoid the risks of power-seeking and scheming AIs, there may be other ways AI systems could disempower humanity. Our political, economic, and cultural systems might slowly drift away from serving human interests in a world with advanced AI.

10. Taking AI Welfare Seriously by Robert Long, Jeff Sebo, et al. (November 2024)

“Taking AI Welfare Seriously” argues that there’s a realistic possibility some AIs will be conscious in the near future. As the authors explain, this means we shouldn’t only be worrying about the risks AI poses to humanity — we potentially need to consider the interests and welfare of future AI systems as well.

11. Machines of Loving Grace — How AI Could Transform the World for the Better by Anthropic CEO Dario Amodei (October 2024)

It’s important to understand why there’s enthusiasm for building powerful AI systems, despite the risks. In ‘Machines of Loving Grace,’ the CEO of Anthropic — the AI company behind Claude — attempts to paint a positive vision for powerful AI.

Other reading lists by topic

Here are additional lists of resources we’ve put together for specific AI-related topics.

For more useful resources, check out the overview of what’s happening with AGI on our site.