AI safety syllabus
Update April 2024: This syllabus was written in August 2016. The field of AI safety has progressed substantially since then. If you’re looking for up-to-date resources, we recommend:
- The curriculum from the alignment course or the governance-focused course, both from AI Safety Fundamentals
- Our list of resources to learn more in our full problem profile on preventing AI-related catastrophe
- These resources collected by the team at AI Safety Support
This page was written by Jan Leike, with contributions and comments by David Krueger, Jelena Luketina, Victoria Krakovna, Daniel Dewey, Laurent Orseau, and others. It is intended as a guide to working on technical aspects of AI safety. See our guide to working in AI policy and strategy for another approach.
This is a syllabus of relevant background reading material and courses related to AI safety. It is intended as a guide for undergraduates in mathematics and computer science planning their degree, as well as people from other disciplines who are thinking about moving into AI safety. It includes tips on how to design your degree and how to transition into research, and it lists the relevant conferences. It is not intended as a general guide to becoming a researcher.
Want to work on AI safety? We want to help.
We’ve helped dozens of people formulate their plans, and put them in touch with academic mentors. If you want to work on AI safety:
Reading List
We now recommend using the bibliography from the Center for Human-Compatible AI at UC Berkeley. Their list is more comprehensive and up-to-date than the one below.
This is a list of the most relevant reading topics and the appropriate material. The chapter recommendations are indicative of what you should know. If you find the topic interesting, read more! As an undergraduate student, you can plan these courses into your degree. As a graduate student, you can use the provided material to extend your knowledge into areas that you do not have much background in. Focus on the textbooks and lecture notes and use the video lectures as supplementary materials. Doing plenty of exercises is usually a good idea to make sure that you actually understand a topic instead of just thinking you understand it.
Some of the relevant areas might not be offered as courses by your university. You can always read the listed books in your free time, or try to find a MOOC on Coursera.
- Machine Learning is the modern probabilistic approach to artificial intelligence. It studies algorithms that learn to predict from (usually independent and identically distributed) data.
- Textbooks:
- Bishop: Pattern Recognition and Machine Learning
- Murphy: Machine Learning (alternative to Bishop)
- Gelman et al.: Bayesian Data Analysis, Chapters 2, 4, 5
- MacKay: Information Theory, Inference, and Learning Algorithms, Parts I-III
- Statistics is the related mathematical discipline. Having a solid understanding of the underlying mathematics is very useful when doing theoretical work.
- Textbooks:
- Wasserman: All of Statistics, Chapters 1-12 or more (an easy-to-read overview of the field). Editor’s note: Others have noted that Chapters 1-4, 9 and 13 are the most relevant.
- Durrett: Probability: Theory and Examples, Chapters 1-6 (for the more mathematically inclined reader)
- Video Lectures:
- Introduction to Probability (MITx)
- Reinforcement Learning is a subdiscipline of machine learning that studies algorithms that learn to act in an unknown environment through trial and error: the algorithm receives only a reward signal, a number indicating how well it is currently doing, and tries to maximize this signal. Reinforcement learning is currently the most promising approach to general intelligence.
- Textbooks:
- Sutton and Barto: Reinforcement Learning, Chapters 1-6 & 8
- Hutter: Universal Artificial Intelligence, Chapters 2-5 (high-level theoretical perspective)
- Deep Learning is a recently successful approach to machine learning based on neural networks with many layers. Convolutional and recurrent neural networks have enabled recent breakthroughs in computer vision, speech recognition, and other domains. At the moment the field is moving extremely fast and textbooks are likely to become obsolete quickly.
- Textbooks:
- Goodfellow, Bengio, and Courville: Deep Learning, Chapters 1-12
- Nielsen: Neural Networks and Deep Learning, Chapters 1-6 (more entry-level)
- Artificial Intelligence encompasses the classical approaches: logic, planning, knowledge representation and reasoning. Probabilistic approaches to AI are covered in the other topics above.
- Textbooks:
- Russell and Norvig: Artificial Intelligence: A Modern Approach, Chapters 1-17 (very comprehensive, has been the standard textbook for many years)
- Game Theory provides the foundations for reasoning about environments with multiple agents.
- Textbooks:
- Osborne: An Introduction to Game Theory, Chapters 1-7,14,15
- Video lectures:
- Jackson, Leyton-Brown, Shoham: Game theory I and II
- Philosophy is the source of some of the mental tools that are useful when thinking about superintelligent agents, and it is where the discussion around AI safety began.
More remotely related are the following areas.
- Automata and Complexity Theory sits at the core of theoretical computer science.
- Textbooks:
- Ullman and Hopcroft: Introduction to Automata Theory, Languages, and Computation, Chapters 1-10
- Sipser: Introduction to the Theory of Computation, Chapters 1-5,7 (alternative to Ullman and Hopcroft)
- Formal Logic is a branch of mathematics that studies formal proofs and the limits of mathematics.
- Textbooks:
- Boolos and Burgess: Computability and Logic, Chapters 1-4,8-20,23,25,27
- Enderton: A Mathematical Introduction to Logic, Chapters 0-3 (alternative to Boolos and Burgess)
- Formal Methods are mathematical techniques for the specification and verification of hardware and software systems.
- Textbooks:
- Alur: Principles of Cyber-Physical Systems, Chapters 1-7,9
- Monin: Understanding Formal Methods, Chapters 1-10
- Baier, Katoen, Larsen: Principles of Model Checking, Chapters 1-7,10
- Surveys:
- Clarke, Henzinger, Veith: Handbook of Model Checking (to appear soon)
Degree
Undergraduate Degree
Ideally, your undergraduate degree would be in mathematics and computer science (for example, a bachelor’s degree in math and a master’s in computer science). But this does not mean that an undergraduate degree in a different related discipline like neuroscience or physics would be wasted. Make sure you have a solid handle on the relevant mathematics (linear algebra, calculus, statistics, …)!
For your undergraduate thesis, find someone who supervises well and who has time for you (not necessarily the most famous or coolest professor). Work on a topic that your supervisor finds interesting, so that you get lots of feedback. Pursuing your own ideas at this point is risky and usually means that you don’t get much supervision. Do something theoretical, preferably in computer science. Find an interesting research group and start doing research early in your degree (it helps a lot if you have clever things to say about their research). Ideally, you should finish your master’s degree with at least one publication at an international conference. It’s not a big deal if this delays your degree.
Other tips:
- If you find a topic interesting, take more classes even if they don’t seem related
- Choose harder classes over easier ones (favor math courses and theoretical computer science courses over applied computer science courses)
- Choose your thesis by supervisor, and not necessarily by topic
- Publications are great: they are considered a good predictor of your academic potential (even if you are not the first author). As such, they are very helpful when applying for PhD programs
- Read general advice on whether a PhD is for you and how to approach it
- Attend MIRIx workshops if they exist in your area
PhD
Getting a PhD is generally an excellent idea and usually a prerequisite for someone to hire you as a researcher. A PhD will not only put you at the cutting edge of research, but also teach you the relevant soft skills (how to write papers, communicate complex ideas, etc.).
Your PhD should be in machine learning, reinforcement learning, statistics, or another discipline related to artificial intelligence. Focus on getting the required expertise first. Once you feel comfortable in your area, shift your focus to AI safety (e.g., in your final 1-2 years). Read our profile on machine learning PhDs for more information.
For relevant problems, see:
- Amodei et al: Concrete Problems in AI Safety
- Soares and Fallenstein: Aligning Superintelligence with Human Interests
- Russell et al: Research Priorities for Robust and Beneficial Artificial Intelligence
- Taylor et al: Alignment for Advanced Machine Learning Systems
Google AI residency program
The Google AI residency program is a year-long role, similar to spending a year in a master’s or PhD program in deep learning.
It’s designed to quickly get you up to speed with deep learning research and is open to people with degrees in a STEM field (bachelor’s, master’s, or PhD). It’s more prestigious than a master’s degree and gives you access to Google’s computational resources and experts in deep learning. That said, it’s extremely competitive: you’re more likely to get accepted into a top graduate school program.
It’s worth applying to it after both undergraduate and master’s. If you’re choosing between the residency and a master’s, the residency will usually be better because of the advantages mentioned above, as well as the fact that you’ll be spending all your time on research.
When choosing between the residency and a PhD you’ll need to consider how good your PhD offers are – if you’ve got offers from top places then it may not be worth postponing, especially if you can’t defer your PhD.
Research Groups
The following is a non-exhaustive list of research groups where you could apply for internships and PhD candidacy. Make sure you look at their research and see how it relates to your interests. Needless to say, it is not a good idea to mass-email everyone on this list.
- Berkeley (Stuart Russell)
- Cambridge (Zoubin Ghahramani)
- University of Montreal (MILA)
- University of Alberta (RLAI group)
- Imperial College London (Murray Shanahan)
- University of Oxford (Michael Osborne)
- McGill (RLLAB)
- Australian National University (Marcus Hutter)
- University of Amsterdam (Max Welling)
- Stanford (Percy Liang, Emma Brunskill)
- CMU (Zico Kolter)
- University of Toronto
- IDSIA (Jürgen Schmidhuber)
- Princeton (HIPS)
- MIT (Joshua Tenenbaum)
- Google DeepMind
- Google Brain
- FAIR
- OpenAI
Conferences
For your publications, always aim for the best conferences, even if you think your work will be rejected. Even if it is rejected, you will likely get more valuable feedback than in other places.
Attend major conferences even if you don’t have a paper there. You will get a sense of what researchers are interested in, and you can connect to potential supervisors and collaborators related to your interests. Read some of their papers beforehand so that you have a good conversation starter.
Major: ICML, NIPS, COLT, AAAI, UAI, IJCAI, AAMAS, ICLR
Minor: AISTATS, ECAI, ECML, ALT
Applications: ICCV, CVPR (Computer vision), ICASSP (Speech), ICRA (Robotics), EMNLP, ACL (NLP)