#54 – Askell, Brundage & Clark on whether policy has a hope of keeping up with AI advances

Dactyl is an AI system that can manipulate objects with a human-like robot hand. OpenAI Five is an AI system that can defeat humans at the video game Dota 2. The strange thing is they were both developed using the same general-purpose reinforcement learning algorithm.

How is this possible and what does it show?

In today’s interview Jack Clark, Policy Director at OpenAI, explains that from a computational perspective using a hand and playing Dota 2 are remarkably similar problems.

A robot hand needs to hold an object, move its fingers, and rotate it to the desired position. In Dota 2 you control a team of several different characters, moving them around a map to attack an enemy.

Your hand has 20 or 30 different joints to move. In Dota 2 there are roughly 10 to 20 main actions available as you move your characters around the map.

When you’re rotating an object in your hand, you sense its friction, but you don’t directly perceive its entire shape. In Dota 2, you can’t see the entire map; you perceive what’s there by moving around, metaphorically ‘touching’ the space.

Read our new in-depth article on becoming an AI policy specialist: The case for building expertise to work on US AI policy, and how to do it

This is true of many apparently distinct problems in life: with the right general-purpose software, quite different sensory inputs can be compressed down to a fundamental computational problem that we already know how to solve.

OpenAI used an algorithm called Proximal Policy Optimization (PPO), which is fairly robust — in the sense that you can throw it at many different problems, not worry too much about tuning it, and it will do okay.
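To make that concrete, here’s a minimal sketch of what ‘throwing PPO at different problems without much tuning’ can look like in practice. It uses the open-source Stable-Baselines3 implementation of PPO and two standard Gymnasium environments, which are illustrative choices rather than anything from the episode; OpenAI’s Dactyl and Dota 2 systems used their own environments and vastly more compute.

```python
# Illustrative sketch only: off-the-shelf PPO applied, unchanged, to two quite
# different control problems (one discrete, one continuous action space).
# Requires: pip install stable-baselines3 gymnasium
import gymnasium as gym
from stable_baselines3 import PPO

for env_id in ["CartPole-v1", "Pendulum-v1"]:
    env = gym.make(env_id)
    # Same algorithm, same default hyperparameters, different problem.
    model = PPO("MlpPolicy", env, verbose=0)
    model.learn(total_timesteps=50_000)
    env.close()
    print(f"Finished a short PPO training run on {env_id}")
```

The point isn’t the performance of these toy runs, just that nothing about the algorithm’s interface or settings has to change when the problem does.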

Jack emphasises that this algorithm wasn’t easy to create, and they were incredibly excited about it working on both tasks. But he also says that the creation of such increasingly ‘broad-spectrum’ algorithms has been the story of the last few years, and that the invention of software like PPO will have unpredictable consequences, heightening the huge challenges that already exist in AI policy.

Today’s interview is a mega-AI-policy-quad episode; Jack is joined by his colleagues Amanda Askell and Miles Brundage, on the day they released their fascinating and controversial large general language model GPT-2.

We discuss:

  • What are the most significant changes in the AI policy world over the last year or two?
  • How much is the field of AI policy still in the phase of just doing research and figuring out what should be done, versus actually trying to change things in the real world?
  • What capabilities are likely to develop over the next 5, 10, 15, or 20 years?
  • How much should we focus on the next couple of years, versus the next couple of decades?
  • How should we approach possible malicious uses of AI?
  • What are some of the potential ways OpenAI could make things worse, and how can they be avoided?
  • Publication norms for AI research
  • Where do we stand in terms of arms races between countries or different AI labs?
  • The case for creating a newsletter
  • Should the AI community have a closer relationship to the military?
  • Working at OpenAI vs. working in the US government
  • How valuable is Twitter in the AI policy world?

Rob is then joined by two of his colleagues — Niel Bowerman and Michelle Hutchinson — to quickly discuss:

  • The reaction to OpenAI’s release of GPT-2
  • Jack’s critique of our US AI policy article
  • How valuable are roles in government?
  • Where do you start if you want to write content for a specific audience?

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type 80,000 Hours into your podcasting app. Or read the transcript below.

The 80,000 Hours Podcast is produced by Keiran Harris.

Highlights

Jack Clark: I’d say from my perspective that the politicization of AI, the realization among people taking part in AI, that it is a political technology that has political effects, has been very significant. We’ve seen that in work by employees at AI organizations like Google, Amazon, and Microsoft to push back on things like AI being used in drones in the case of Google or AI and facial recognition in the case of Microsoft and Amazon. And that’s happened alongside politicians realizing AI is important and is something they should legislate about. So you have Ben Sasse, who’s a Republican senator here in America, who has submitted a bill called the Deepfakes Prohibition Act, which is about stopping people using synthetic images for bad purposes.

I think the fact that AI has arrived as a cause of legislative concern at the same time that AI employees and practitioners are realizing that they are political agents in this regard and have the ability to condition the legislative conversation is quite significant. And I expect that next year and the year after, we’re going to see very significant changes, especially among western countries, as they realize that there’s a unique political dynamic at play here that means that it’s not just like a normal industry in your country, it’s something different.

Amanda Askell: I think some of the biggest changes I’ve seen have mainly been in a move from a pure problem framing to something more like a focus on a greater number of potential solutions and mechanisms for solving those problems, which I think is a very good change to see. Previously, I think there’s been a lot of pessimism around AI development among some people, and now we’re seeing really important ideas get out there like the idea of greater collaboration and cooperation, ways in which we can just ensure that the right amount of resources go into things like AI safety work and ensuring that systems are safe and beneficial. I think that one good thing is that there’s perhaps a little bit more optimism as a result of the fact that we’re now focusing more on mechanisms and solutions than just on trying to identify the key problems.

Jack Clark: I’d say that there’s huge room for translators, and I describe myself as that. Miles and Amanda are producing a lot of fundamental ideas that will be inherent to AI policy, and they’re also from time to time going and talking to policymakers or other parties about their research. I spend maybe half my time just going and talking to people and trying to translate, not just our ideas, but general ideas about technical trends in AI or impacts of AI to policymakers. And what I’ve discovered is that the traditional playbook for policy is to have someone who speaks policy, who’s kind of like a lobbyist or an ex-politician, and they talk to someone who speaks tech, who may be at the company’s home office or home base. And as a consequence, neither side is as informed as they could be.

The tech person that speaks tech doesn’t have a great idea of how the sausage gets made in Washington or Brussels or whatever, and the policy person who speaks policy doesn’t really have a deep appreciation for the technology and specifically for technology’s trajectory and likely areas of impact. And I found that just by being able to go into the room and say, “I’m here to talk about this tech, and I’m here to talk about the areas it may go over the next four to five years,” has been very helpful for a lot of policy people, because they think over that timeline, but they rarely get people giving them a coherent technical description of what’s going to happen.

Miles Brundage: And just to add one point: it’s important to distinguish the things that we can be reasonably confident about, or could be more confident about, like the structural properties of AI as a technology to be governed. For example, once you’ve trained a system, it’s easy to produce copies of it, and that has social implications and implications for how you release things, like our decision today. Or the fact that the upper limit on the speed of these systems is much higher than for humans. You can see that with GPT-2 in our announcement today: what’s impressive is both that it’s producing these coherent samples, and that it can do so at a superhuman rate and scale. So I think we have to think not just about what the waterline of capabilities is, but also about the scale-up from those capabilities to social impact, in terms of speed, quantity, et cetera.

Miles Brundage: I think the [short-term vs. long-term] distinction is super overblown, and I’m guilty of having propagated it, among other places, in an 80,000 Hours article a while back. But I think there are a bunch of good arguments that have been made for why there are at least some points of agreement between these different communities’ concerns, around things like the right publication norms, what role governments should play, how we avoid collective action problems, and so forth.

So first of all they’re structurally similar things, and secondly they plausibly involve the same exact actors and possibly the same exact sort of policy steps. Like setting up connections between AI labs and managing compute, possibly. So there are these levers that I think are like pretty generic, and I think a lot of this distinction between short and long-term is sort of antiquated based on overly confident views of AI sort of not being a big deal until it suddenly is a huge deal. And I think we’re getting increasing amounts of evidence that we’re in a more sort of gradual world.

Miles Brundage: This is not the first case in which people haven’t published all of their results and model and code and so forth. What’s different is that the decision was A) made explicitly on the basis of these misuse considerations, and B) communicated in a transparent way that was aimed at fostering debate. So it wasn’t that no one’s ever worried about the social consequences of publishing before, but we took the additional step of trying to establish it as an explicit norm at AI labs.

Jack Clark: Yeah. The way I think of it is: we built a lawn mower. We know that some percentage of the time this lawn mower drops a bit of oil on the lawn, which you don’t want to happen. Now, most lawn mower manufacturers would not lead their press strategy with: we’ve made a slightly leaky mower. That’s sort of what we did here, and I think the idea is to see what happens if you talk to the press about that aspect, ’cause we know that they think that aspect is undercovered. So if we can go over to them and say we know that you think this aspect is undercovered, here’s a way for you to talk about it and here’s a way for you to talk to a character that’s now animating that issue for you. Maybe we can get a better discussion to come out the other side. That’s the hypothesis.

Amanda Askell: Yeah. And I think that one thing that’s worth noting is it’s important to be as honest as you can be in this domain and just in life in general. I think here, honesty is what we’ve kind of aimed for in that we’re saying like we don’t feel comfortable releasing the model, but we’re telling you that. I think that’s also important. It’s not something where we’re trying to actively deceive people or we’re trying to be more closed. I think one way in which you can be honest, is just telling people what your intentions are, why you’re thinking about it and how you’re thinking about it, so that even if they disagree, they can see your decision process roughly and why you’re doing what you’re doing and I think that’s important.

Jack Clark: I think that we don’t talk enough about militaries and AI. We talk about militaries now in terms of value judgments about what we do or don’t want militaries to do with AI. I’m interested in organizations like OpenAI and others talking about how when militaries eventually decide to do something, the AI community is in a position to make whatever it is those militaries do safe, in a way that actually makes sense to the militaries.

I think we are currently potentially underinvesting in that, out of a reasonable hypothesis: if we are going to talk to militaries, there’s a lot of information that could leak over to them, and there’s a lot of information hazard there.

I recognize those concerns, but I think if we basically never talk to the militaries, and essentially treat the militaries as adversaries in their own right, then when the time comes that the U.S. and China are creating highly autonomous drone swarms in the South China Sea, and we have certain ideas around safety that we really think they should embed into those platforms, they may not listen to us. That would actually be uniquely destabilizing, and maybe one of the places where these arms race dynamics could rapidly crystallize. So, I think it’s important for people to bear in mind that there are already kind of stigmas emerging in the AI community about what is and isn’t acceptable speech with regard to AI policy and AI actors.

We should just be cognizant of that and try to talk about all of the different actors inclusively.

Jack Clark: I’m going to make a very specific plug, which is incredibly biased. So, I’ll lay out the bias and I’ll describe the plug. So, the bias is, in my spare time I write an AI newsletter called Import AI. So, I am biased towards newsletters being useful.

So, there’s my bias. The plug is, we’ve seen a number of people start to write policy-specific AI newsletters. There’s a newsletter on European AI from Charlotte Stix. There’s a newsletter on Indian AI from someone at the Berkman Institute at Harvard, which has just started.

My colleague, Matthew van der Merwe, who I believe is with the Future of Humanity Institute, writes a policy section within my newsletter, Import AI.

I know that any congressional staffer, any staffer for any politician in any country I’ve been to, has actually made mention of needing more materials to read to get them into AI and AI policy. And I think this is a very high-leverage area, where if you are interested in AI policy, just trying to produce something that’s useful to those people, which summarizes AI and its relevance to policy within a specific, tightly scoped domain, will not only give you the ability to calibrate your own thinking and generate evidence, but will also allow you to make friends with the very people you may want to work with.

I think it’s unbelievably useful and high leverage and everyone should do this more.

Articles, books, and other media discussed in the show

The case for building expertise to work on US AI policy, and how to do it by Niel Bowerman for 80,000 Hours


About the show

The 80,000 Hours Podcast features unusually in-depth conversations about the world's most pressing problems and how you can use your career to solve them. We invite guests pursuing a wide range of career paths — from academics and activists to entrepreneurs and policymakers — to analyse the case for and against working on different issues and which approaches are best for solving them.

Get in touch with feedback or guest suggestions by emailing [email protected].
