Update September 2019: Having been written in 2017, this article no longer reflects our most recent understanding of the risks and opportunities posed by artificial intelligence. While it’s still useful, until we get to update it we suggest supplementing it with our other more recent articles and interviews about technical AI safety and AI policy.
“There is no doubting the force of [the] arguments … the problem is a research challenge worthy of the next generation’s best mathematical talent. Human civilisation is at stake.” – Clive Cookson, Science Editor at the Financial Times
Around 1800, civilization underwent one of the most profound shifts in human history: the industrial revolution.17
This wasn’t the first such event – the agricultural revolution had upended human lives 12,000 years earlier.
A growing number of experts believe that a third revolution will occur during the 21st century, through the invention of machines with intelligence which far surpasses our own. These range from Stephen Hawking to Stuart Russell, co-author of the best-selling AI textbook, Artificial Intelligence: A Modern Approach.3
Rapid progress in machine learning has raised the prospect that algorithms will one day be able to do most or all of the mental tasks currently performed by humans. This could ultimately lead to machines that are much better at these tasks than humans.
These advances could lead to extremely positive developments, presenting solutions to now-intractable global problems, but they also pose severe risks. Humanity’s superior intelligence is pretty much the sole reason that it is the dominant species on the planet. If machines surpass humans in intelligence, then just as the fate of gorillas currently depends on the actions of humans, the fate of humanity may come to depend more on the actions of machines than our own. For a technical explanation of the risks from the perspective of computer scientists, see these papers.1218
This might be the most important transition of the next century – either ushering in an unprecedented era of wealth and progress, or heralding disaster. But it’s also an area that’s highly neglected: while billions are spent making AI more powerful,9 we estimate fewer than 100 people in the world are working on how to make AI safe.8
This problem is an unusual one, and it took us a long time to really understand it. Does it sound weird? Definitely. When we first encountered these ideas in 2009 we were sceptical. But like many others, the more we read the more concerned we became.19 We’ve also come to believe the technical challenge can probably be overcome if humanity puts in the effort.
Working on a newly recognized problem means that you risk throwing yourself at an issue that never materializes or is solved easily – but it also means that you may have a bigger impact by pioneering an area others have yet to properly appreciate, just like many of the highest impact people in history have done. In what follows, we will cover the arguments for working on this area, and look at the best ways you can contribute.
Many experts believe that there is a significant chance that humanity will develop machines more intelligent than ourselves during the 21st century. This could lead to large, rapid improvements in human welfare, but there are good reasons to think that it could also lead to disastrous outcomes. The problem of how one might design a highly intelligent machine to pursue realistic human goals safely is very poorly understood. If AI research continues to advance without enough work going into the research problem of controlling such machines, catastrophic accidents are much more likely to occur. Despite growing recognition of this challenge, fewer than 100 people worldwide are directly working on the problem.
Our overall view
This is among the most pressing problems to work on.
We think work on positively shaping AI has the potential for a very large positive impact, because the risks AI poses are so serious. We estimate that the risk of a severe, even existential catastrophe caused by machine intelligence within the next 100 years is something like 10%.
The problem of potential damage from AI is somewhat neglected, though it is getting more attention with time. Funding seems to be on the order of $100 million per year. This includes work on both technical and policy approaches to shaping the long-run influence of AI by dedicated organisations and teams.
Making progress on positively shaping the development of artificial intelligence seems moderately tractable, though we’re highly uncertain. We expect that doubling the effort on this issue would reduce the most serious risks by around 1%.
This profile is based on interviews with: Professor Nick Bostrom at the University of Oxford, the author of Superintelligence; an anonymous leading professor of computer science; Jaan Tallinn, one of the largest donors in the space and the co-founder of Skype; Jan Leike, a machine learning researcher now at DeepMind; Miles Brundage, an AI policy researcher at the Future of Humanity Institute at Oxford University; Nate Soares, the Executive Director of the Machine Intelligence Research Institute; Daniel Dewey, who works full-time finding researchers and funding opportunities in the field; and several other researchers in the area. We also read advice from David Krueger, a Machine Learning PhD student.
The arguments for working on this problem area are complex, and what follows is only a brief summary. If you prefer video, then see this TED talk:
Those who are already familiar with computer science may prefer to watch this talk by University of California Berkeley Professor of Computer Science Stuart Russell instead, as it goes further into potential research agendas.
Recent progress in machine learning suggests that AI’s impact may be large and sudden
When Tim Urban started researching his article on this topic, he expected to finish it in a few days. Instead he spent weeks reading everything he could, because, he says, “it hit me pretty quickly that what’s happening in the world of AI is not just an important topic, but by far the most important topic for our future.”
In October 2015 an AI system named AlphaGo shocked the world by defeating a professional at the ancient Chinese board game of Go for the first time. A mere five months later, a second shock followed: AlphaGo had bested one of the world’s top Go professionals, winning 4 games out of 5. Seven months later, the same program had further improved, crushing the world’s top players in a 60-win streak. In the span of a year, AI had advanced from being too weak to win a single match against the worst human professionals, to being impossible for even the best players in the world to defeat.
This was shocking because Go is considered far harder for a machine to play than Chess. The number of possible moves in Go is vast, so it’s not possible to work out the best move through “brute force”. Rather, the game requires strategic intuition. Some experts thought it would take at least a decade for Go to be conquered.1
Since then, AlphaGo has discovered that certain ways of playing Go that humans had dismissed as foolish for thousands of years were actually superior. Ke Jie, the top-ranked Go player in the world, has been astonished: “after humanity spent thousands of years improving our tactics,” he said, “computers tell us that humans are completely wrong. I would go as far as to say not a single human has touched the edge of the truth of Go.”2
The advances above became possible due to progress in an AI technique called “deep learning”. In the past, we had to give computers detailed instructions for every task. Today, we have programs that teach themselves how to achieve a goal – for example, a program was able to learn how to play Atari games based only on reward feedback from the score. This has been made possible by improved algorithms, faster processors, bigger data sets, and huge investments by companies like Google. It has led to amazing advances far faster than expected.
But those are just games. Is general machine intelligence still far away? Maybe, but maybe not. It is really hard to predict the future of technology, and lots of past attempts have been completely off the mark. However, the best available surveys of experts assign a significant probability to the development of powerful AI within our lifetimes.
One survey of the 100 most-cited living computer science researchers, of whom 29 responded, found that more than half thought there was a greater than 50% chance of “high-level machine intelligence” – one that can carry out most human professions at least as well as a typical human – being created by 2050, and a greater than 10% chance of it happening by 2024 (see figure below).34
When superintelligent AI arrives, it could have huge positive and negative impacts
If the experts are right, an AI system that reaches and then exceeds human capabilities could have very large impacts, both positive and negative. If AI matures in fields such as mathematical or scientific research, these systems could make rapid progress in curing diseases or engineering robots to serve human needs.
On the other hand, many people worry about the disruptive social effects of this kind of machine intelligence, and in particular its capacity to take over jobs previously done by less skilled workers. If the economy is unable to create new jobs for these people quickly enough, there will be widespread unemployment and falling wages.5 These outcomes could be avoided through government policy, but doing so would likely require significant planning.
However, those aren’t the only impacts highly intelligent machines could have.
Professor Stuart Russell, who wrote the leading textbook on artificial intelligence, has written:6
Success brings huge risks. … the combination of [goal] misalignment with increasingly capable decision-making systems can lead to problems – perhaps even species-ending problems if the machines are more capable than humans.
Here is a highly simplified example of the concern:
The owners of a pharmaceutical company use machine learning algorithms to rapidly generate and evaluate new organic compounds.
As the algorithms improve in capability, it becomes increasingly impractical to keep humans involved in the algorithms’ work – and the humans’ ideas are usually worse anyway. As a result, the system is granted more and more autonomy in designing and running experiments on new compounds.
Eventually the algorithms are assigned the goal of “reducing the incidence of cancer,” and offer up a compound that initial tests show is highly effective at preventing cancer. Several years pass, and the drug comes into universal usage as a cancer preventative…
…until one day, years down the line, a molecular clock embedded in the compound causes it to produce a potent toxin that suddenly kills anyone with trace amounts of the substance in their bodies.
It turns out the algorithm had found that the compound that was most effective at driving cancer rates to 0 was one that killed humans before they could grow old enough to develop cancer. The system also predicted that its drug would only achieve this goal if it were widely used, so it combined the toxin with a helpful drug that would incentivize the drug’s widespread adoption.
Of course, the concern isn’t about this example specifically, but rather similar unintended consequences. These reemerge for almost any goal researchers have yet come up with to offer a superintelligent machine.7 And all it takes is for a single superintelligent machine in the world to receive a poor instruction, and it could pose a large risk.
The smarter a system, the harder it becomes for humans to exercise meaningful oversight. And, as in the scenario above, an intelligent machine will often want to keep humans in the dark, if obscuring its actions reduces the risk that humans will interfere with it achieving its assigned goal.
You might think ‘why can’t we just turn it off?’, but of course an intelligent system will give every indication of doing exactly what we want, until it is certain we won’t be able to turn it off.
An intelligent machine may ‘know’ that what it is doing is not what humans intended it to do, but that is simply not relevant. Just as a heat-seeking missile follows hot objects, by design a machine intelligence will do exactly, and literally, what we initially program it to do. Unfortunately, intelligence doesn’t necessarily mean it shares our goals. As a result it can easily become monomaniacal in pursuit of a supremely stupid goal.
The solution is to figure out how to ensure that the instructions we give to a machine intelligence really capture what we want it to do, without any such unintended outcomes. This is called a solution to the ‘control’ or ‘value alignment’ problem.
It’s hard to imagine a more important research question. Solving the control problem could mean the difference between enormous wealth, happiness and health — and the destruction of the very conditions which allow humanity to thrive.
Few people are working on the problem
While the stakes seem huge, the effort being put into avoiding these hazards is small. Global spending on research and action to ensure that machine intelligence is developed safely will come to only $9 million in 2017.8 By comparison, over 100 times as much is spent trying to speed up the development of machine intelligence,9 and 26,000 times as much is spent on biomedical research.10
That said, the field of AI safety research is growing quickly – in 2015, total spending was just $3 million.
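As a rough check on the spending comparison above, the quoted multiples can be turned back into dollar figures. The sketch below uses only numbers quoted in this article ($9 million in 2017, $3 million in 2015, the 100x and 26,000x multiples, and the 100-person ceiling). The function and variable names are illustrative choices of ours. It assumes that the multiples apply to the 2017 figure, that growth compounded annually, and that the whole budget is available to pay staff.

```python
# Figures quoted in the text (US dollars).
SAFETY_SPENDING_2017 = 9_000_000
SAFETY_SPENDING_2015 = 3_000_000

# Multiples quoted in the text. "Over 100 times" is treated as a lower bound.
CAPABILITIES_MULTIPLE = 100
BIOMEDICAL_MULTIPLE = 26_000

# Headcount ceiling from the "fewer than 100 people" estimate.
STAFF_CEILING = 100


def implied_spending(base: float, multiple: float) -> float:
    """Spending implied by applying a quoted multiple to a base figure."""
    return base * multiple


def annualised_growth(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two spending figures."""
    return (end / start) ** (1 / years) - 1


def budget_per_person(total: float, headcount: int) -> float:
    """Budget per person, assuming (unrealistically) it is all spent on staff."""
    return total / headcount


capabilities_floor = implied_spending(SAFETY_SPENDING_2017, CAPABILITIES_MULTIPLE)
biomedical_implied = implied_spending(SAFETY_SPENDING_2017, BIOMEDICAL_MULTIPLE)
growth = annualised_growth(SAFETY_SPENDING_2015, SAFETY_SPENDING_2017, years=2)
per_person = budget_per_person(SAFETY_SPENDING_2017, STAFF_CEILING)

print(f"Implied AI capabilities spending: over ${capabilities_floor / 1e9:.1f} billion")
print(f"Implied biomedical research spending: about ${biomedical_implied / 1e9:.0f} billion")
print(f"Annualised growth in safety spending, 2015 to 2017: {growth:.0%}")
print(f"Budget per person at {STAFF_CEILING} staff: ${per_person:,.0f}")
```

Under these assumptions the multiples imply at least $0.9 billion spent on advancing AI and roughly $234 billion on biomedical research, while safety spending grew by about 73% a year. A budget of $90,000 per person leaves little room for overheads on top of salaries, which fits the claim that $9 million could not sustain more than 100 staff.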
Technical research refers to work in mathematics and AI to solve the control problem. Strategy research is focused on broader questions about how to safely develop AI, such as how it should be regulated.
As we’d expect from the above, recent investment into technical research on the control problem has already yielded significant results. We’ve detailed some of these findings in this footnote.11 While few of the technical issues have been resolved, we have a much clearer picture today of how intelligent systems can go wrong than a few years ago, which is the first step towards a solution.12
There has also been recent progress on better understanding the broader ‘strategic’ issues around AI. For instance, there has been research into how the government should respond to AI, covering arms races,13 the implications of sharing research openly,14 and the criteria on which AI policy should be judged.15 That said, there is still very little written on these topics, so single papers can be a huge contribution to the literature.
Even if – as some have argued – meaningful research were not possible right now, it would still be possible to build a community dedicated to mitigating these risks at a future time when progress is easier. Work by non-technical people has helped to expand funding and interest in the field a great deal, contributing to the recent rapid growth in efforts to tackle the problem.
Example: Paul Christiano used his math skills to tackle technical challenges
What are the major arguments against this problem being pressing?
Not everyone agrees that there is a problem or, if there is, that it’s a top priority
Here are some examples:
Benjamin Garfinkel at Oxford University’s Future of Humanity Institute has scrutinised the arguments above, and concluded that while he’s sympathetic to work aimed at positively shaping AI, “…not enough work has gone into analyzing the case for prioritizing AI. Existing published arguments are not decisive.” We encourage you to read his analysis.
Some believe that artificial intelligence, even if much more intelligent than humans in some ways, will never have the opportunity to cause destruction on a global scale. For an example of this, see economist Robin Hanson, who believes that machines will eventually become better than humans at all tasks and supersede us, but that the process will be gradual and distributed enough to ensure that no one actor is ever in a position to become particularly influential. His views are detailed in his book The Age of Em.16
Some believe that it will be straightforward to get an intelligent system to act in our interests. For an example of this, see Holden Karnofsky arguing in 2012 that we could design AIs to work as passive tools rather than active agents (though he has since changed his view significantly and now represents one of the field’s major funders).
Neil Lawrence, an academic in machine learning, takes issue with many predictions in Bostrom’s book Superintelligence, including our ability to make meaningful predictions far into the future.
The fact that there isn’t a consensus that smarter-than-human AI is coming soon and will be dangerous is a relief. However, given that a significant and growing fraction of relevant experts are worried, it’s a matter of prudent risk management to put effort into the problem in case they are right. You don’t need to be 100% sure your house is going to burn down to buy fire insurance.
We aren’t the most qualified to judge, but we have looked into the substantive issues and mostly found ourselves agreeing with those who are more worried rather than less.
It may be too early to work on it
If the development of human-level machine intelligence is hundreds of years away, then it may be premature to research how to align it with human values. For example, the methods used to build machine intelligence may end up being completely different from those we use to develop AI now, rendering today’s research obsolete.
However, the surveys of computer scientists show that there’s a significant chance – perhaps around 10% – that human-level AI will arrive in 10-20 years. It’s worth starting now just in case this fast scenario proves to be accurate.
Furthermore, even if we knew that human-level AI was at least 50 years away, we don’t know how hard it will be to solve the ‘control problem’. The solution may require a series of insights that naturally come one after another. The more of those insights we build up ahead of time, the more likely it is that we’ll be able to complete the solution in a rush once the nature of AI becomes clear.
Additionally, acting today could set up the infrastructure necessary to take action later, even if research today is not directly helpful.
It could be very hard to solve
As with many research projects in their early stages, we don’t know how hard this problem is to solve. Someone could believe there are major risks from machine intelligence, but be pessimistic about what additional research will accomplish, and so decide not to focus on it.
It may not fit your skills
Many individuals are concerned about this problem, but think that their skills are not a natural fit for working on it, so spend their time working on something else. This is likely true for math-heavy technical research roles, though below we also describe operational and support roles that are a good fit for a wider range of people.
It is probably possible to design a machine that is as good at accomplishing its goals as humans, including ‘social’ tasks that machines are currently hopeless at. Experts in artificial intelligence assign a greater than 50% chance of this happening in the 21st century.
Without careful design for reliability and robustness, machine intelligence may do things very differently than what humans intended – including pursuing policies that have a catastrophic impact on the world.
Even if advanced machine intelligence does not get ‘out of control’, it is likely to be very socially disruptive and could be used as a destabilizing weapon of war.
It is unknown how fast progress on this problem can be made – it may be fast, or slow.
Want to work on AI safety? We want to help
We’ve helped dozens of people formulate their plans, and put them in touch with academic mentors.
What can you do to help?
We’ve broken this section into five parts to cover the main paths to making a difference in this area.
1. Technical research
Ultimately the problem will require a technical solution – humans will need to find a way to ensure that machines always understand and comply with what we really want them to do. But few people are able to do this research, and there’s currently a surplus of funding and a shortage of researchers.
So, if you might be a good fit for this kind of research, it could well be one of the highest-impact things you can do with your life.
Researchers in this field mostly work in academia and technology companies such as Google DeepMind or OpenAI. You might be a good fit if you would be capable of completing a PhD at a top 10 program in computer science or a similar quantitative course (though it’s not necessary to have such a background). We discuss this path in detail here:
2. Strategy and policy research
If improvements in artificial intelligence come to represent the most important changes in the 21st century, governments are sure to take a keen interest. For this reason, there is a lot of interest in strategic and policy research – attempts to forecast how a transition to smarter-than-human machine intelligence could occur, and what the response by governments and other major institutions should be.
This is a huge field, but some key issues include:
How should we respond to technological unemployment if intelligent systems rapidly displace human workers?
How do we avoid an ‘arms race’ in which countries or organisations race to develop strong machine intelligences, for strategic advantage, as occurred with nuclear weapons?
When, if ever, should we expect AI to achieve particular capabilities or reach human-level intelligence?
If we handle these issues badly, it could lead to disaster, even if we can solve the technical challenges associated with controlling a machine intelligence. So there’s a real need for more people to work on them.
3. Complementary roles
Even in a research organisation, around half of the staff will be doing other tasks essential for the organisation to continue functioning and have an impact. Having high-performing people in these roles is essential. Better staff allow an organisation to grow more quickly, avoid major mistakes, and have a larger impact by communicating its ideas to more people.
Our impression is that the importance of these roles is underrated because the work is less visible. Some of the people who have made the largest contributions to solving this problem have done so as communicators and project managers. In addition, these roles are a good fit for a large number of people.
Organisations working on AI safety need a wide range of complementary skills:
This path is open to many people who can perform at a high level in these skills.
To get into these roles you’ll want to get similar jobs in organisations known for requiring high-quality work and investing in training their staff. We have more about how to skill up in our article on career capital.
Example: Seán Ó hÉigeartaigh helped grow the Future of Humanity Institute
Seán Ó hÉigeartaigh was the Academic Project Manager (and later, Senior Academic Manager) at the Future of Humanity Institute during 2011-15, while its Director was focused on writing a book. He played a crucial role in ensuring the Institute ran smoothly, more than doubled its funding, co-wrote further successful grants, including several AI strategy-relevant grants, and communicated its research effectively to the media, policymakers and industry partners. During his time he helped grow FHI from 5 to 16 staff, and put in place and oversaw a team of project managers and administrators, including a Director of Research and Director of Operations to whom he transferred his responsibilities upon moving on. His experience doing this allowed him to be a key player in the founding of the Cambridge Centre for the Study of Existential Risk, and later, the Centre for the Future of Intelligence.
4. Advocacy and capacity building
People who are relatively strong on social skills might be able to have a larger impact by persuading others to work on or fund the problem. This is usually done by working at one of the research organisations already mentioned.
Beyond that, the group we know of that is doing the most to raise awareness of the issue is the effective altruism community, of which we are a part. Joining and growing that movement is a promising way to increase efforts to solve the AI control problem, among other pressing problems.
Once you are familiar with the issue, you could also spend some of your time spreading the word in any of the careers that typically provide you with a platform for advocacy, such as:
You could also rise up the ranks of an organisation doing some relevant work, such as Google or the US military, and promote concern for AI safety there.
However, unless you are doing the technical, strategic or policy research described above, you will probably only be able to spend a fraction of your time on this work.
We would also caution that it is easy to do harm while engaging in advocacy about AI. If portrayed in a sensationalist manner, or by someone without necessary technical understanding, ‘advocacy’ can in fact simply be confusing and make the issue appear less credible. Much coverage of this topic in the media misrepresents the concerns actually held by experts. To avoid contributing to this we strongly recommend informing yourself thoroughly and presenting any information in a sober, accurate manner.
Example: Niel switched from physics to research management
Niel Bowerman studied Physics at Oxford University and planned to enter climate policy. But as a result of encountering the ideas above, he changed his career path, and became the Assistant Director at the Future of Humanity Institute, working on the Institute’s operations, fundraising and research communication. Through this work, Niel was involved in raising £3 million for the Institute, contributing to doubling its size. As a result, they have been able to hire a number of outstanding additional technical and strategic researchers.
5. Earning to give
There is an increasing amount of funding available for research in this area, and we expect more large funders to enter the field in future. That means the problem is primarily talent constrained – especially by a need for innovative researchers.
However, there are still some funding gaps, especially among the less conventional groups that can’t get academic grants, such as the Machine Intelligence Research Institute.
As a result earning to give to support others working on the problem directly is still a reasonable option if you don’t feel the other roles described here are a good fit for you.
If you want to donate, our first suggestion is giving to the Long Term Future Fund. The manager of the fund is an expert in catastrophic risk funding, and makes grants to the organisations that are most in need of funding at the time. It’s run by the Centre for Effective Altruism, of which we’re part.
Alternatively you can choose for yourself among the top nonprofit organisations in the area, such as the Machine Intelligence Research Institute in Berkeley and the Future of Humanity Institute at Oxford. These were the most popular options among experts in our review in December 2016. See more organisations below.
What are the key organisations you could work for?
The most significant organisations, all of which would be good places to work, are probably the following:
Allan Dafoe’s research group at Yale University is conducting research on the ‘global politics of AI’, including its effects on international conflict. PhD or research assistant positions may be available.
AI Impacts is a nonprofit which works on forecasting progress in machine intelligence and predicting its likely impacts.
The world’s most intellectual foundation is hiring. Holden Karnofsky, founder of GiveWell, on how philanthropy can have maximum impact by taking big risks (includes a discussion of their work in positively shaping the development of AI).
Müller, Vincent C. and Bostrom, Nick (2016), ‘Future progress in artificial intelligence: A survey of expert opinion’, in Vincent C. Müller (ed.), Fundamental Issues of Artificial Intelligence (Synthese Library; Berlin: Springer), 553-571. Web archive.↩
Armellini, Mauricio, and Tim Pike. “Should economists be more concerned about Artificial Intelligence?” Bank Underground. February 24, 2017. Accessed March 06, 2017. Web archive.↩
Our estimate of ‘under 100’ is based on an informal count of people doing directly relevant work at the organisations in this article, which is significantly below 100. We cross-check this by estimating the cost per full time staff member: the forecast spending of $9m in 2017 would not be enough to sustain more than 100 staff members given the high cost of hiring machine learning researchers.
Note that this figure could be inaccurate if there is a large and non-public AI safety project, but we think this is probably not the case.↩
“Spending on AI technologies by companies is expected to grow to $47 billion in 2020 from a projected $8 billion in 2016, according to IDC.”
Norton, Steven. “Artificial Intelligence Looms Larger in the Corporate World.” The Wall Street Journal. Dow Jones & Company, 11 Jan. 2017. Web. 03 Mar. 2017. Web archive.↩
Chakma, Justin, Gordon H. Sun, Jeffrey D. Steinberg, Stephen M. Sammut, and Reshma Jagsi. “Asia’s Ascent — Global Trends in Biomedical R&D Expenditures.” New England Journal of Medicine 370.1 (2014): 3-6. Web.↩
For a fairly easy-to-understand example of this, see recent research by OpenAI describing how ‘altered photos can be used to manipulate machine learning systems in dangerous and unexpected ways’. The paper Concrete Problems in AI Safety, authored by machine learning researchers at Google, OpenAI, and Stanford, surveys a number of technical problems “that are ready for experimentation today and relevant to the cutting edge of AI systems” but are “likely to be robustly useful across a broad variety of potential risks, both short- and long-term.”
Amodei, Dario, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, and Dan Mané. “Concrete Problems in AI Safety.” [1606.06565] Concrete Problems in AI Safety. Arxiv, 25 July 2016. Web archive.↩
Stuart Armstrong, Nick Bostrom, and Carl Shulman. “Racing to the Precipice: A Model of Artificial Intelligence Development.” AI & Society 31.2 (2015): 201-06. Web archive.↩
Bostrom, N. (2016): “Strategic Implications of Openness in AI Development”, Technical Report #2016-1, Future of Humanity Institute, Oxford University: pp. 1-26. Web archive.↩
Nick Bostrom, Allan Dafoe and Carrick Flynn. “Policy Desiderata in the Development of Machine Superintelligence.” Working paper. Web archive.↩
Hanson, Robin. The Age of Em: Work, Love, and Life When Robots Rule the Earth. Oxford: Oxford University Press, 2016. Web.↩
Graph produced from Maddison, Angus (2007): “Contours of the World Economy, 1–2030 AD. Essays in Macro-Economic History”, Oxford University Press, ISBN 978-0-19-922721-1, p. 379, table A.4.↩
Long-Term and Short-Term Challenges to Ensuring the Safety of AI Systems. Jacob Steinhardt’s blog. Web archive.↩
Dafoe, Allan, and Stuart Russell. “Yes, the experts are worried about the existential risk of artificial intelligence.” MIT Technology Review. November 02, 2016. Accessed 01 Mar. 2017. Web.↩