Machine Learning PhDs

By Richard Batty · Published June 2017

AI is improving rapidly. Here’s how to learn the technology that’s driving it.

Go, an ancient Chinese game that’s notorious for its strategic difficulty, is a critical milestone for artificial intelligence. Computers have been beating people at chess since 1997 by efficiently searching through possible moves. But Go is a whole different beast; to win, a computer has to think more like a person using their intuition.

So when the AlphaGo program designed by DeepMind beat Lee Sedol, one of the world’s best Go players, in 2016, it was astonishing. Even more astonishing, it played in a way that was surprising and original¹ to experienced Go players.

What was behind this breakthrough? It was a machine learning technique called deep learning, which is loosely inspired by the network structure of our brains. Machine learning’s successes are not limited to game playing – it’s been succeeding at many tasks including driving, language translation, and speech recognition.

Now, everyone wants to get in on the machine learning action; the field has become wildly popular in recent years.

Does it live up to the hype? In part, yes – its successes demonstrate its usefulness.

So, if you have a quantitative background (not necessarily in computer science), and want to have a positive impact on the world, we think machine learning is one of the best PhD programmes. It gives you the skills to use and shape this powerful technology for the benefit of humanity. As a back-up, it opens up lots of high-paying positions in industry, allowing you to earn to give.

In the rest of this profile, we explain why it’s a high-impact area, how to work out if it’s for you, and exactly how and where to apply.

Summary

A machine learning PhD catapults you into a field of critical importance for humanity’s future. You can use the skills you gain to help positively shape the development of artificial intelligence, apply machine learning techniques to other pressing global problems, or, as a fall-back, earn money and donate it to highly effective charities. It’s open to people who have studied a quantitative subject, even if they haven’t done computer science before.

Pros

• Potential for a large impact from your research
• Build skills in what's plausibly one of the most important technologies of the coming decade
• High earning potential after graduation
• Intellectually stimulating work with capable colleagues

Cons

• Takes 4-6 years, with relatively low pay
• Requires a lot of work alone without much feedback, which makes it demotivating for many
• Some risk that the area is overhyped and that it will become harder to get jobs in the future

Key facts on fit

Strong maths skills (equivalent to having an undergraduate degree in a quantitative subject) and a desire to do high-level research.

Next steps

If you haven’t done machine learning before, take this online course or a course at your university. Then try out machine learning research during your undergraduate or master’s degree, or by taking a summer research internship.

If you’re interested in using a machine learning PhD to work on AI safety, apply for our free coaching service.

1 What this profile is based on
2 What is this career path?
3 What does the work involve? What is it like day-to-day?
4 If you want to have a positive impact, why do a machine learning PhD?
5 Downsides
6 How do you get in?
- 6.1 How to show you can do good research
7 How to work out whether the PhD would be right for you
8 How to choose your programme
- 8.1 Which research group and institution?
- 8.2 Which topic?
9 Get industry experience during your PhD
10 How to get machine learning experience while doing a non-machine learning PhD
11 Learn more
- 11.1 Top recommendations
12 Want to use an ML PhD to make the world a better place? We want to help.

What this profile is based on

Most of the information in this profile comes from talking to several people who did PhDs in machine learning, including research scientists at DeepMind, the co-founder of a robotics start-up with a computer vision PhD, and PhD students at top university departments. We also drew heavily on this online guide to computer science PhDs by a professor at one of the top computer science departments.

What is this career path?

Normally we have to tell computers exactly what to do, step-by-step. But this leaves them unable to perform tasks where it’s difficult to specify the steps accurately in advance.

By contrast, in the field of machine learning, the programmer chooses the rules which govern how the software learns, rather than directly programming its behaviour. This means we can build systems that automatically improve with experience.²
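To make the contrast concrete, here is a minimal sketch in Python. It is an illustration only, not taken from any of the systems discussed in this profile. The toy data, the `train` function, and the learning rate are all hypothetical choices. The programmer writes the learning rule (gradient descent on squared error), and the relationship between input and output is then learned from examples rather than typed in.

```python
# Hypothetical toy data: the "hidden" relationship is y = 2x + 1.
# The learner below is never told this; it only sees the examples.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]


def train(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # The learning rule: nudge each parameter against its gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = train(xs, ys)
print(f"learned w={w:.3f}, b={b:.3f}")
```

On this toy data the learned values settle very close to the slope and intercept used to generate the examples, even though neither number appears anywhere in `train`.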

In a machine learning PhD you learn how to design and implement these kinds of algorithms. Your PhD research could cover topics like creating a program that can label what’s going on in a video;³ improving techniques to understand why machine learning systems make the predictions they do;⁴ or analysing online text to understand social processes such as how online slang spreads.⁵

You can find more examples of research projects by looking at department websites (e.g. Stanford’s), lists of previous dissertations (e.g. from the University of Toronto and Carnegie Mellon University), and 80,000 Hours’ list of promising example research ideas.

Machine learning is a subfield of computer science and is closely related to statistics. Both statistics and machine learning have the aim of learning from data and they share many concepts and mathematical tools.

But, unlike statistics, machine learning tends to emphasise building software to make predictions, is often applied to larger datasets, and the techniques used require fewer assumptions about the data or how it was collected. There’s more detail on the differences here.

What does the work involve? What is it like day-to-day?

In the US, a PhD usually lasts 5-6 years. In your first two years you will take classes and in the remaining years you’ll do research. PhDs in the UK are shorter – usually 4 years – and you only do research. Unlike US PhDs, they often require you to have done a master’s, although this depends on the university.

For your research, you write a dissertation, which is a long and in-depth exploration of a particular topic or (more often in the US) a collection of papers on related topics. Your research will go through several stages, starting with refining the topic you’ll explore, then working on research projects related to it, and finally writing up your dissertation.

You’ll spend most of your time programming, doing maths, reading papers, and thinking about and discussing ideas with collaborators.

Check out these day-in-the-life profiles and this description of what it’s like to do research in machine learning.

If you want to have a positive impact, why do a machine learning PhD?

1. Learn about perhaps the most important technology of the next decade

Machine learning has made rapid progress in the past decade, enabled by theoretical breakthroughs, increasing availability of data, increased investment, and increased processing power. It’s been applied to many previously unsolved tasks with success, including autonomous driving,⁶ image annotation,⁷ game-playing,⁸ helicopter flying,⁹ speech synthesis,¹⁰ and movie recommendations.¹¹

Commercial interest has exploded, driving up demand for skilled employees and interest from big companies in acquiring machine learning startups.¹² In AI acquisitions since 2014 that were made mostly to acquire the team, the median price paid was $2.5m per employee, with one deal reaching $10m per employee.¹³

If the technology continues to improve, we will be able to automate more and more human labour, and solve previously intractable problems. Eventually we could create software that is more capable than humans at most tasks.

This progress could radically transform society, for good or ill. We could see a dramatic reduction in traffic accident deaths due to autonomous cars, cheaper and more accurate medical diagnosis, and the automation of work that is dangerous or unpleasant.

But it could also lead to autonomous weapons, widespread unemployment and concentration of political and economic power in the hands of a few people.

More worryingly, if we develop software that is both highly intelligent and misaligned with our interests, then this could have catastrophic consequences. We discuss this more in our profile on positively shaping the development of artificial intelligence.

For these reasons, we think machine learning is one of the most important fields to understand for the coming decades. And despite a recent growth in interest, it’s still a skill set possessed by relatively few people.

There are several ways you could use a machine learning PhD for good, as we sketch below.

2. Work on positively shaping the development of artificial intelligence

Positively shaping the development of artificial intelligence is our highest-scoring problem area and needs more people with machine learning expertise, so we think that working on it will be the best option for people with machine learning PhDs.

(That doesn’t apply if you already have significant experience in another high-priority problem area or if you disagree with our assessment of the importance of working on this.)

Within this area, there are two main paths.

Technical safety research

There is a shortage of people with the skills to do technical research into reducing the risks to society posed by artificial intelligence. A machine learning PhD can be excellent preparation for this. Read more about why and how to enter this path, and whether a PhD is needed, in our full profile. Or check out this guide to working on AI safety.

Policy and strategy research

AI policy helps decision-makers in institutions such as governments, companies, and nonprofits design and implement policies that will help shape the future of AI. Policy roles include researchers who develop policy options and practitioners who advocate for and implement policies.

A machine learning PhD is a good preparation for this because it gives you the technical background and credibility with other policy people. We have more detail on this path here.

Want to work on positively shaping the development of AI? We want to help.

We’ve helped dozens of people formulate their plans, and put them in touch with academic mentors. If you want to work on AI safety, apply for our free coaching service.

Apply for coaching

3. Apply machine learning to other socially important problems

In the US, sepsis and septic shock account for 10% of all intensive care admissions and 20-30% of all hospital deaths.¹⁴ Scientists at Johns Hopkins University have developed a machine learning system called TREWScore to help tackle this problem. It can identify patients at high risk of developing septic shock hours before standard screening methods, enabling faster treatment.¹⁵

There are many urgent problems in which machine learning can be applied to help others, including:

To do this kind of work you’ll need an understanding of socially important problems that you could apply your skills to. You can gain this understanding by working in companies or on research projects that are trying to solve one of these problems and by talking to lots of other people working on them. You might work in an established company, start your own company, or do academic research.

4. High-earning back-up options

Demand for machine learning expertise has led to high salaries:

Percentile	Machine Learning Engineer – Total compensation¹⁶
90%	$240,000
75%	$192,000
50%	$150,000
25%	$117,000

Given that these figures include roles that only require a master’s or bachelor’s, we expect the earnings with a PhD will be on the higher end of this range.

This sort of money would enable you to earn to give. However, we think the other ways of doing good with machine learning are likely to be higher impact than the impact of the donations from an industry job. So we only suggest earning-to-give as a temporary or backup option if other paths don’t work out.

Earning potential is high (and we expect it to remain high in the next decade) because of the rapid progress in machine learning and its usefulness in solving a wide variety of problems. However, there is a risk that salaries could fall because many people are interested in entering the field.

Machine learning skills are useful in tech startups, and there has been a recent proliferation of machine learning startups. Y Combinator has for the first time ever added a specialist track for AI startups. Large companies have been acquiring AI startups¹² in recent years, and the value of these acquisitions is often based on the team acquired rather than more usual indicators of value such as revenue.

In AI acquisitions since 2014 that were made mostly to acquire the team, the median price paid was $2.5m per employee, with one deal reaching $10m per employee.¹³ See our profile on tech entrepreneurship for more on this path.

According to someone we spoke to in the industry, a machine learning PhD is good preparation for getting a high-earning job in a quantitative hedge fund. You can read more about jobs in quantitative hedge funds in our profile here.

5. Intellectually interesting work with a lot of autonomy

You get to do intellectually demanding work with some of the most able people on earth and you’ll develop a satisfyingly deep understanding of your field. And you have the freedom to choose what to work on and when.

Downsides

1. Tough for mental health

Although the work you do in a PhD can be satisfying, PhDs are notorious for being a psychological struggle. This is often due to feelings of isolation and difficulty adapting to highly autonomous work: “Research can be very rewarding and very frustrating. Most students describe graduate school as a roller-coaster with tremendous highs and tremendous lows.”¹⁷ Check out the section on mental health in our article on how to be more successful for how to deal with this.

2. Takes a long time

PhDs take a long time (4-6 years), during which you have relatively low pay. If you drop out, you lose most of the value of the PhD, which makes it riskier than options that pay off sooner.

It may be possible to skip a PhD and get a less senior position in the industry within a year; we interviewed two people who dropped out of their PhDs but nonetheless managed to quickly get valuable jobs in ML engineering.

3. Machine learning may become a lot more competitive

Machine learning is a hot area that a lot of people want to get into, so there is a risk that it could become more difficult to get jobs as lots of people crowd into the area. For example, MIT’s introduction to machine learning course recently had 700 people sign up – they had to use an overflow lecture room and deliberately weed people out of the course early.¹⁸ If machine learning turns out to progress more slowly than is expected, and doesn’t live up to the hype, then the number of jobs might shrink as well.

How do you get in?

To be accepted, you need to have strong quantitative skills, which you’d usually gain through doing an undergraduate degree such as computer science, maths, engineering, quantitative economics, or physics. At a minimum, you should have covered probability and statistics, multivariable calculus, and linear algebra.

We know of people who have been accepted into machine learning master’s without a quantitative background, although this is rare. This would require doing self-study or taking courses (such as from the Open University) in the mathematical prerequisites.

In the UK and the rest of Europe you’ll often need a master’s degree in machine learning or a related subject such as computer science or maths, although this depends on the university. In the US, Canada, and Australia it’s often not necessary, although it can improve your application and helps you to test your interest in and aptitude for research before committing to a PhD.

Programmes that don’t require a master’s are often 1-2 years longer than those that do. If you’re planning to do a master’s, then two-year degrees are better because you have more time to do research, which is key for getting into top PhD programmes.

According to online guides and people we spoke to currently doing a machine learning PhD, getting admitted is almost entirely dependent on showing that you can do good research.

How to show you can do good research

First, you need to have done research before. If you’re still an undergraduate, do research with an academic at your university or get a summer research position. Many research groups have summer research positions for undergraduates, some of which are paid (such as REUs).

If you do a master’s, choose one that has a strong research component and dive into research as soon as you can. Ideally, you should get 1-2 publications before you finish your degree. At the least, you should have completed a research paper even if it’s not published – a workshop paper, paper under review, or a paper on arXiv (which anyone can submit to) is still useful. A completed research paper is important enough that you should delay finishing your degree until you’ve done this.

Second, you need good letters of recommendation, ideally from people who are active in your field and known by the academics assessing your application. You want recommendations that highlight your research potential, rather than just your ability to do well in class.

Third, you need a personal statement that emphasises your research experience and what you’d like to work on.

Fourth, unless you’re planning to do purely theoretical research, you need demonstrated programming ability. The best way to demonstrate this is by taking programming classes, having a portfolio of programming projects, or having commercial programming experience through jobs or internships.

Although not essential, you could also contribute towards widely used open source machine learning packages, write blog posts about implementing machine learning techniques, or take part in competitions such as Kaggle.

Grades (and GRE results if applying to US universities) are much less important than your research experience and letters of recommendation. The following are roughly what grades you should aim at, but are not hard cut-offs:

According to this guide and experts we spoke to, your grade-point average (if you went to a US university) should be about 3.6 or higher. Above 3.6 doesn’t help you much in comparison to having more research experience.
If you did your undergraduate degree in the UK, you should have got a first or high upper second.
Your overall GRE score should usually be in the 90th percentile or higher. Scores on the quantitative section matter more than verbal and essay scores: aim for around 165 or above (the 95th percentile or higher) on the quantitative section.

There has been increased interest in machine learning PhDs and many departments have had a record-high number of applications this year. This is likely to increase the level of entry requirements in the next few years, although more places may become available because of industry funding.

For more tips, read this guide on applying to PhD programs in computer science, as it is packed with detailed, practical advice.

How to work out whether the PhD would be right for you

Research is different from taking classes, and excelling in research takes different skills: you need to be comfortable doing autonomous work with little feedback, and concentrating on one narrow topic for several years. There’s more detail on what PhD research is like in part two of this guide.

How to test your fit

It’s hard to know whether you’d suit machine learning research without trying it. To test your suitability, here are some steps you can take, in order of how much time they take:

Talk to a few people who are doing machine learning PhDs to find out more about what it’s like and whether you’d suit it.
Take a machine learning online course such as this one from Coursera or a class at your university.
Take part in online competitions.
Go through additional courses and textbooks. We recommend some especially good resources here.
Read research papers and try replicating their results (Andrew Ng recommends this to become an excellent machine learning researcher). Our interview with Dario Amodei from OpenAI has more detail on how to do this. Some papers you could do this with:
Do a summer research internship (suggestions below).
Do a master’s degree that includes research projects.

Should I do something else instead?

People with the quantitative and computational skills required to enter a machine learning master’s or PhD could go into a variety of other careers.

Here are some questions to help you judge whether you might be better suited to an alternative.

Do I want to be a researcher and do I have a good chance of making it in academia or a top company?

If yes, doing a PhD could be a good option.

If no, then you might want to consider data science, software engineering, quantitative trading, or starting your own tech company.

Is machine learning relevant to problems that I think could be the highest priority?

If you think a PhD is a good option for you and you agree with us that positively shaping the development of artificial intelligence is one of the highest priorities, then doing a machine learning PhD is one of your best options. If you have a greater interest in AI policy, you have a background in economics, or you dislike programming then an economics PhD may be a better option. Similarly, if you think global poverty or global priorities research is a higher priority problem area, then an economics PhD might be a better option.

Would I prefer a path within machine learning that doesn’t require a PhD?

If you’re going for machine learning research positions in academia or industry, you’ll need a PhD. It can also be helpful to have a PhD (or even further academic experience) to set up a startup that develops cutting-edge techniques. But for most non-research positions in industry, including at top firms such as Google, a master’s is sufficient. For industry jobs that aren’t pure research, going on to get a PhD isn’t an advantage and probably isn’t worth the time cost.

Also bear in mind that many data-analysis problems that companies and nonprofits have don’t require machine learning to solve – often good solutions can be built using simple data science methods instead.¹⁹

Options similar to a machine learning PhD

Computer science PhD outside of machine learning. We think this will usually be a worse option than a machine learning PhD because of the importance of machine learning for the future of AI and its wide applicability.
Statistics PhD. There is a lot of overlap between machine learning and statistics, so it’s worth applying to statistics departments that do machine learning work, although a machine learning PhD is still preferable if you want to go into that field.

How to choose your programme

Which research group and institution?

The most important criteria according to online guides and people we spoke to in the field are:

Research group prestige. You want to be working in a group that consistently gets papers into the top machine learning conferences. You can find a list of top conferences here.
What your potential adviser is like: Don’t only focus on the prestige of the university or department – your adviser is critical to your success because they’re your main mentor.
- Choose an adviser who you can work well with and who supports you. Check with current students to see if they’re happy with advisers you’re thinking of working with.
- Make sure their research interests are aligned with yours.
- If the first two criteria are met, go for successful, prestigious advisers. Famous researchers such as those in charge of a large lab often have better funding and connections but can be difficult to work with as they’re often too busy to give feedback on your work. It can be better to have an adviser who is less famous but can support you more such as an up-and-coming junior professor. Alternatively, you can have both in universities that allow you to have more than one adviser. It’s also helpful if there is more than one adviser available in your area of interest.
Adviser and lab stability and resources. Look for a lab and adviser that has guaranteed funding for the period you’ll be there. Alternatively, fellowships (such as those from the NSF and NSERC) can allow you to work with advisers who have less funding. Also check to see if your potential adviser is thinking of moving labs during your PhD. Having your adviser leave can be catastrophic for your PhD as you may find it difficult to find another adviser with the right expertise for your research.
Teaching load: some departments require you to do a lot of teaching work that leaves you less time for research.
Lab atmosphere and compatibility: You’ll be collaborating with and learning from other students and postdocs so check that they tend to be available to help you out and that you get on. Check that there is a friendly atmosphere in the lab by sitting in on lab meetings if possible, or by talking to current students there.
A location where you’ll be happy: It might be important for you to be close to family, and living somewhere you don’t speak the local language well could become lonely.
Presence of researchers working on topics you think are particularly high impact: For example, if you want to work on AI safety it’s useful to be at a university with an AI safety research group so that you can work on it during your PhD and build connections with other researchers.

Nice-to-haves:

Local industry: If you have some idea of what you’d like to do post-PhD, you can upweight locations with compatible industry. So if you’re interested in startups or big tech companies, go to universities in the San Francisco Bay Area such as Stanford and Berkeley; for getting better connections with DeepMind, go to universities in the UK.
Overall institutional prestige: If you’re going to be staying in machine learning work, research group prestige within machine learning is much more important than the prestige of the university as a whole.

This Quora question on the best machine learning graduate schools will give you a place to start.

There are more details on choosing a PhD programme in this guide.

Which topic?

Some of the most promising topics include the following:

Deep learning

Although different machine learning methods are useful for different applications, deep learning is the one which has seen a lot of success in the past decade. It’s an approach loosely inspired by the network structure of our brains, although there are many differences. There’s a more detailed explanation of how it works here.

Deep learning has helped us solve problems that previous artificial intelligence techniques could not. These include successes in image recognition, “predicting the activity of potential drug molecules, analysing particle accelerator data, reconstructing brain circuits, and predicting the effects of mutations in non-coding DNA on gene expression and disease.”²⁰

Because of its success, deep learning expertise is valuable both to companies doing work on the cutting edge of machine learning and to people working on positively shaping the development of AI.

Reinforcement learning

In reinforcement learning, the software takes actions in its environment to maximise a reward defined by the programmers. For example, this video shows DeepMind’s Atari game-playing software learning to play Breakout. It aims to maximise the score on the screen and takes action by moving the paddle back and forth.

Reinforcement learning is important because it’s a promising approach to creating software that, like humans, has long-term goals and uses trial and error to learn what works best in its environment.²¹ ²²
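The idea of learning from reward by trial and error can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not DeepMind’s method: the five-cell corridor environment, the `step` and `q_learning` functions, and all parameter values are hypothetical. The technique used is tabular Q-learning, one of the simplest reinforcement learning algorithms. The agent starts in cell 0 and receives a reward of 1 only when it reaches cell 4.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # action index 0 = step left, 1 = step right


def step(state, action):
    """Hypothetical environment: move along the corridor, reward 1 at the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            # Trial and error: sometimes explore at random, otherwise act
            # greedily. Ties are broken at random so an untrained agent
            # still wanders around the corridor.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward = step(state, ACTIONS[a])
            # Move the estimate towards reward + discounted best future value.
            target = reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


q = q_learning()
# Greedy policy for the non-goal cells after training.
policy = ["left" if left > right else "right" for left, right in q[:-1]]
print(policy)
```

Nobody tells the agent that moving right is the way to the goal. It discovers this because actions that lead towards the reward end up with higher estimated values.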

Getting into a PhD programme in reinforcement learning may be less difficult than programmes in deep learning, given that deep learning is currently receiving a lot of attention. You can also do both in a deep reinforcement learning PhD.

Applications

If you’re going to work on a particular application of machine learning such as vision or speech recognition, there are a few things to bear in mind. It doesn’t matter too much which application you choose as long as you’re building skills in the underlying machine learning methods. But it’s still worth putting some thought into which application to work on. The ideal is an area where progress is being made but hasn’t levelled off yet. You can find these areas by looking at this article, which has graphs showing the progress of machine learning in many specific applications.

For more detail on choosing the right programme, see here.

Get industry experience during your PhD

Industry internships can be a useful complement to a PhD as they expose you to how industry works, train you in the tools available (such as Google’s infrastructure), can get you job offers, and can stimulate new research directions. Judging by reputation in the machine learning community, institutions fall into several tiers:

Top tier: DeepMind, OpenAI, FAIR
Second tier: Baidu, Microsoft, Amazon, Twitter, Apple, IBM
Third tier: deep learning startups (list1, list2), machine learning bootcamps, which train PhD graduates in applying their skills to industry (UK, US)

There is more detail on the reputations of the top institutions here.

Many of the organisations that work on AI safety also take interns.

How to get machine learning experience while doing a non-machine learning PhD

It’s possible to do machine learning research while doing another quantitative PhD, such as in applied maths, statistics, or physics. If you’d like to explore this possibility, start by going to talks and research group meetings at your university’s machine learning lab. Then see if you can do some work with a researcher there (you can usually do this without having to officially switch departments) or arrange a research visit to the group for a few months.

Learn more

Top recommendations

Further recommendations

An introduction to AI and machine learning with lots of links to other resources
Applying to PhD programs in computer science
Get into graduate school for science, engineering, math, and computer science
Our AI safety syllabus, which has a list of resources for learning machine learning
Our profile on positively shaping artificial intelligence
Podcast: Our interview with Paul Christiano on how OpenAI is developing real solutions to the ‘AI alignment problem’, and his vision of how humanity will progressively hand over decision-making to AI systems
Podcast: Our interview with Dario Amodei from OpenAI which has advice on getting into machine learning and describes specific ways in which machine learning algorithms could act dangerously.
Our guide to AI policy careers
Podcasts: Chris Olah on working at top AI labs without an undergrad degree and what the hell is going on inside neural networks.

Want to use an ML PhD to make the world a better place? We want to help.

We’ve coached dozens of people considering PhDs, and can often put you in touch with relevant experts for more guidance. Apply for our free coaching service, particularly if you want to work on AI safety: