Want to upskill in technical AI safety? Here are 67 useful resources
Are you enthusiastic about technical AI safety but looking for concrete ideas on how to enter the field?
Below are our top picks for upskilling in technical AI safety research, the field focused on ensuring powerful AI systems behave safely and as intended. In practice, upskilling involves developing the machine learning and research skills needed to work on challenges such as alignment and interpretability.
We developed this list in consultation with our advisors to highlight the resources they most commonly recommend, including articles, courses, organisations, and fellowships. While we recommend applying to speak to an advisor for one-on-one tailored guidance, this page gives a practical, non-comprehensive snapshot of how you might move from being interested in technical AI safety to starting to work on it.
Overviews
These resources outline the technical AI safety landscape, highlighting current research efforts and some practical ways to begin contributing to the field.
- AISafety.com
- Shallow Review of Technical AI Safety by technicalities, Tomáš Gavenčiak, Stephen McAleese, et al.
- AI Safety Technical Research Career Guide - How to Enter by 80,000 Hours
- Levelling Up in AI Safety Research Engineering by Gabriel Mukobi
- Recommendations for Technical AI Safety Research Directions by Anthropic
- Technical AI Safety Research Areas by Open Philanthropy
- An overview of areas of control work by Ryan Greenblatt, Redwood Research
- AI Safety Needs Great Engineers by Andy Jones
AI safety courses
These courses can help you gain technical knowledge and practical research experience in AI safety.
- ARENA’s curriculum
- BlueDot Impact’s AI Alignment course
- Andrej Karpathy’s Zero to Hero course
- His YouTube videos are also great beginner-friendly resources, as are 3Blue1Brown’s deep learning videos.
- Deep Learning Curriculum by Jacob Hilton
- Google’s Machine Learning Crash Course
Ideas for projects and upskilling
If you’re looking for concrete ways to contribute to technical AI safety research, check out these resources:
- “What are some projects I can try?” (AISafety.Info)
- 100+ concrete projects and open problems in evals by Marius Hobbhahn
- A list of 45+ Mech Interp Projects by Apollo Research
- Open Problems in Mechanistic Interpretability by Sharkey et al.
- Consider joining an alignment hackathon such as an Apart Research Sprint.
- Consider joining EleutherAI’s community of researchers on their Discord.
- Consider writing a task using the METR framework.
- Consider writing your research theory of change (workshop slides, Michael Aird).
Advice from technical AI safety researchers
Many experts in the field have practical tips for getting involved in technical AI safety work. Here is some of our favourite advice:
- Andrej Karpathy on PhDs, research agendas, and career advice
- Ethan Perez’s Tips for Empirical Alignment Research and Workflows
- Neel Nanda’s Highly Opinionated Advice on How to Write ML Papers
- Neel Nanda’s How to Become a Mechanistic Interpretability Researcher
- Gabriel Mukobi’s ML Safety Research Advice
- Richard Ngo’s AGI Safety Career Advice
- Marius Hobbhahn’s Advice for Independent Research
- Adam Gleave on whether to do a PhD
- Lewis Hammond’s Advice (for doing a PhD (in AI (Safety)))
Fellowships in alignment and technical safety
If you’re looking to break into technical AI safety, these programmes offer structured support with mentorship, funding, and access to active researchers.
- Anthropic AI Safety Fellowship
- MATS
- Constellation Astra Fellowship and other programmes
- LASR Labs
- SPAR
- Pivotal
- ARENA
- Cambridge ERA AI
- CHAI Research Fellowship + Internship
- Global AI Safety Fellowship
Technical AI safety organisations
A growing ecosystem of organisations is tackling technical AI safety from multiple angles. The list below highlights some key players; for a larger list, see the Overview of the AI safety ecosystem.
- Alignment Research Center (ARC)
- Apollo Research
- Center for Human-Compatible AI
- Conjecture
- FAR AI
- Goodfire
- METR
- Palisade Research
- Redwood Research
Key descriptions of the alignment problem
These articles offer perspectives on the alignment problem and help explain why many researchers see it as a pressing technical issue.
- Ajeya Cotra – Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover
- Paul Christiano – What failure looks like
- Richard Ngo – The alignment problem from a deep learning perspective
- Joe Carlsmith – Is Power-Seeking AI an Existential Risk?
Staying up to date with AI (podcasts, newsletters, etc.)
Here are our top recommendations for keeping up with the latest developments and debates in the field.
- Follow recent papers and discussion on the Alignment Forum, Zvi Mowshowitz’s Substack, the AI Safety Newsletter (CAIS), or PapersWithCode.
- Alignment Workshop videos from FAR.AI
- The 80,000 Hours Podcast
- Dwarkesh Podcast
- The Cognitive Revolution Podcast
- AXRP (the AI X-risk Research Podcast)
Want one-on-one advice on pursuing technical AI safety careers?
We think the risks from AI could be the most pressing problems the world currently faces. If you think you might be a good fit for a career path that contributes to solving this problem, we’d be especially excited to advise you on next steps, one-on-one.
We can help you consider your options, make connections with others working on reducing risks from AI, and possibly even help you find jobs or funding opportunities — all for free.