AI safety technical research
In a nutshell: To mitigate the risks posed by the development of artificial intelligence, it’s imperative to research how to solve the technical challenges and design problems involved in ensuring that powerful AI systems do what we want, and are beneficial, without catastrophic unintended consequences.
If you are well suited to this career, it may be the best way for you to have a social impact.
Based on a medium-depth investigation
Why might working on AI safety research be high impact?
As we’ve argued, in the next few decades, we might see the development of powerful machine learning algorithms with the potential to transform society. This could have major upsides and downsides, including the possibility of catastrophic risks.
Besides the strategy and policy work discussed in a separate career review, another key way to limit these risks is research into the technical challenges raised by powerful AI systems, such as the alignment problem. In short, how do we design powerful AI systems so they’ll do what we want, and not have unintended consequences?
This field of research has started to take off, and there are now major academic centres and AI labs where you can work on these issues, such as Mila in Montreal, the Future of Humanity Institute at Oxford, the Center for Human-Compatible Artificial Intelligence at Berkeley, DeepMind in London, and OpenAI in San Francisco. We’ve advised over 100 people on this path, with several already working at the above institutions. The Machine Intelligence Research Institute in Berkeley has been working in this area since 2005 and has an unconventional perspective and research agenda relative to the other labs.
There is plenty of funding available for talented researchers — including academic grants and philanthropic donations from major grantmakers like Open Philanthropy. It’s also possible to get funding for your PhD programme. The main need of the field is more people capable of using this funding to carry out the research.
What does this path involve?
In this path, the aim is to get a position at one of the top AI safety research centres (in industry, nonprofits, or academia) and then work on the most pressing questions, with the eventual goal of becoming a research lead overseeing safety research.
Broadly, AI safety technical positions can be divided into (i) research and (ii) engineering. Researchers direct the research programme. Engineers create the systems and do the analysis needed to carry out the research.
Although engineers have less influence over the high-level research goals, it is still important that engineers are concerned about safety, as they’ll better understand the ultimate goals of the research (and so prioritise better), be more motivated, shift the culture towards safety, and use the career capital they gain to benefit other safety projects in the future. This means that engineering can be a good alternative for those who don’t want to be research scientists.
It can also be useful to have people who understand the challenges of AI safety working in AI research teams that aren’t directly focused on AI safety. Working on these teams can put you in a position to help promote concern for safety in general, especially if you end up in a management position with influence over the organisation’s priorities.
We’d also be excited to see more people build expertise to do AI safety work in or related to China — read more in our career review on China-related AI safety and governance paths, some of which take the form of technical research.
How to assess your fit
The most impactful AI technical safety research will probably be done by people in the top jobs listed earlier. So to decide if this path is a good fit for you, it’s important to consider whether you have a reasonable chance of getting those jobs.
- Do you have a chance of getting into a top-five graduate school in machine learning? This can be a good test for whether you could get a job at a top AI research centre, though it’s not a requirement.
- Are you convinced of the importance of long-term AI safety?
- Are you a software or machine learning engineer who’s worked at FAANG or other competitive companies? You may be able to train to enter a research position, or otherwise take an engineering position.
- Do you have a chance of making a contribution to a relevant research question? For instance, are you highly interested in the topic, do you have ideas for questions to look into, and do you find you can’t resist pursuing them? Read more about how to tell if you’re a good fit for working in research.
How to enter this field
The first step on this path is usually to pursue a PhD in machine learning at a good school. It’s possible to enter this field without a PhD, but it’s likely to be required in research roles at academic centres and DeepMind, which make up a large fraction of the best positions. A PhD in machine learning also opens up options in AI policy, applied AI, and earning to give, so this path has good backup options if you later decide AI technical safety isn’t for you.
However, if you want to pursue engineering over research, then the PhD is not necessary. Instead, you can do a master’s programme or train up in industry.
It’s also possible to enter this path from neuroscience (especially computational neuroscience), so if you already have a background in that area, you may not have to return to study.
If you have a lot of familiarity already with AI safety as a problem area, our top recommendation is to look at this step-by-step guide to pursuing a career in technical AI safety, by Charlie Rogers-Smith.
Recently, opportunities have also opened up for social scientists to contribute to AI safety.
You can find much more detail in the resources listed below.
Recommended organisations
- AI Safety Support works to reduce existential and catastrophic risk from AI by supporting everyone who wants to work on this problem, with a focus on helping new and aspiring AI safety researchers through career advice and community building.
- Alignment Research Center is a nonprofit research organisation working to align future machine learning systems with human interests. Its current work focuses on developing an ‘end-to-end’ alignment strategy that could be adopted in industry today while scaling gracefully to future machine learning systems. See current vacancies.
- Anthropic is an AI safety and research company that’s working to build reliable, interpretable, and steerable AI systems. Their multidisciplinary team’s research interests include natural language, human feedback, scaling laws, reinforcement learning, code generation, and interpretability. See current vacancies.
- The Center for Human-Compatible Artificial Intelligence aims to develop the conceptual and technical wherewithal to reorient the general thrust of AI research towards provably beneficial systems. See current vacancies.
- The Center on Long-term Risk addresses worst-case risks from the development and deployment of advanced AI systems. It is currently focused on conflict scenarios as well as technical and philosophical aspects of cooperation. Its work includes conducting interdisciplinary research, making and recommending grants, and building a community of professionals and other researchers around these priorities. See current vacancies.
- DeepMind is probably the largest research group developing general machine intelligence in the Western world. We’re only confident about recommending DeepMind roles working specifically on safety, ethics, policy, and security issues. See current vacancies.
- The Future of Humanity Institute is a multidisciplinary research institute at the University of Oxford. Academics at FHI bring the tools of mathematics, philosophy, and social sciences to bear on big-picture questions about humanity and its prospects.
- The Machine Intelligence Research Institute was, in the early 2000s, one of the first groups to become concerned about the risks from machine intelligence, and has published a number of papers on safety issues and how to resolve them. See current vacancies.
- OpenAI was founded in 2015 with the goal of conducting research into how to make AI safe. It has received over $1 billion in funding commitments from the technology community. We’re only confident in recommending opportunities in their policy, safety, and security teams. See current vacancies.
- Redwood Research conducts applied research to help align future AI systems with human interests. See current vacancies.
Want one-on-one advice on pursuing this path?
Because this is one of our priority paths, if you think this path might be a great option for you, we’d be especially excited to advise you on next steps, one-on-one. We can help you consider your options, make connections with others working in the same field, and possibly even help you find jobs or funding opportunities.
Find jobs as an AI safety researcher
If you think you might be a good fit for this path and you’re ready to start looking at job opportunities, see our curated list of jobs open in this path.
Key further reading:
- To help you get oriented in the field, we recommend the AI safety starter pack.
- Charlie Rogers-Smith’s step-by-step guide to AI safety careers
- Our problem profile on AI risk
- This curriculum on AI safety (or, for something shorter, this sequence of posts by Richard Ngo)
- Our guide to becoming an ML engineer focused on AI safety
- Machine learning PhD career review
- Our AI technical safety career review from 2015
- Reading list from the Center for Human-Compatible AI
- A collection of reading lists about AI safety
- See all of our articles on AI safety careers
- Dr Paul Christiano on how OpenAI is developing real solutions to the ‘AI alignment problem’, and his vision of how humanity will progressively hand over decision-making to AI systems
- Machine learning engineering for AI safety and robustness: a Google Brain engineer’s guide to entering the field
- The world needs AI researchers. Here’s how to become one
- Chris Olah on working at top AI labs without an undergrad degree, and on what the hell is going on inside neural networks
- A machine learning alignment researcher on how to become a machine learning alignment researcher
- Richard Ngo on large language models, OpenAI, and striving to make the future go well
Read next: Learn about other high-impact careers
Want to consider more paths? See our list of the highest-impact career paths according to our research.