Clay Graubard and Robert de Neufville on forecasting the war in Ukraine

In this episode of 80k After Hours, Rob Wiblin interviews Clay Graubard and Robert de Neufville about forecasting the war between Russia and Ukraine.

They cover:

  • Their early predictions for the war
  • The performance of the Russian military
  • The risk of use of nuclear weapons
  • The most interesting remaining topics on Russia and Ukraine
  • General lessons we can take from the war
  • The evolution of the forecasting space
  • What Robert and Clay were reading back in February
  • Forecasters vs. subject matter experts
  • Ways to get involved with the forecasting community
  • Impressive past predictions
  • And more

Who this episode is for:

  • People interested in forecasting
  • People interested in the war in Ukraine
  • People who prefer to know how likely they are to die in a nuclear war

Who this episode isn’t for:

  • People who’d hate it if a friend said they were 65% likely to come out for drinks
  • People who’d prefer if their death from nuclear war was a total surprise

Get this episode by subscribing to our more experimental podcast on the world’s most pressing problems and how to solve them: type ‘80k After Hours’ into your podcasting app. Or read the transcript below.

Producer: Keiran Harris
Audio mastering: Ben Cordell
Transcriptions: Katy Moore

“Gershwin – Rhapsody in Blue, original 1924 version” by Jason Weinberger is licensed under Creative Commons

Highlights

Early predictions for the Russian invasion of Ukraine

Robert de Neufville: Back in January, I didn’t make good predictions about this. I wasn’t following it very closely. And I think Atief asked me on our podcast — this is one of the guys I podcast with — if I thought there was going to be an invasion. And my answer was basically, “I don’t think it makes sense. So probably not.” And I think that’s a lot of reasons why people were wrong about it, because they looked at it, even good forecasters I know, and said, “It doesn’t make a lot of sense to do this. So probably not.” I think that’s kind of a forecasting error. Sometimes people in intelligence talk about mirror imaging, which is to assume that the other people whose behavior you are trying to forecast think and see things the way you do. I think a lot of people did that with Putin and with Russia in this case.

Robert de Neufville: I also want to push back a little bit, because I think that kind of thinking can be useful, right? You do go around making basic assumptions that people are trying to do things that are smart, and playing a certain game, and so on. And it’d be really hard to forecast anything if you imagine people were just capable of doing random things at any given time.

Robert de Neufville: But I think a lot of good forecasters looked at it and said, “This is a strategic blunder. This is counterproductive. It’s not that likely.” And that was my initial reaction. Later, Russia kind of started to invade even before they crossed the lines. They were doing all the things you would want to do if you were really seriously going to invade. I guess that’s how you want to do a bluff too, but it was a really good bluff if that’s what it was. They were getting blood supplies and changing conscript rules and all sorts of little minor details.

Robert de Neufville: At some point I think all that stuff happening made me recognize this is really probably going to happen, even though I still had some doubts, which I’ve talked to Clay about. There was also an argument that Putin has been indicating, tipping his hands and talking about his ideology for years, and I think people haven’t been taking him very seriously. I didn’t follow that very closely initially, but in February I started to look at that and see this is actually something that he may really do. But I wish I could say I had done a better job earlier. Clay was the one who was on that.

Rob Wiblin: Yeah. Clay, how did you do?

Clay Graubard: By around January 1, I was 62% likelihood of an invasion. At the beginning of the month it rose quite precipitously as the invasion and the steps that Russia was taking were escalating. And then I decreased over the month of January, as you saw substantive Russia-US talks about European security, as well as an effort to revive the Normandy Format and have France and Germany play a role in getting Minsk II across the line with Zelenskyy and Putin.

Clay Graubard: By February those prospects had died. I was actually quite surprised at the degree to which the US actually took Russia’s concerns initially seriously, but nothing really ever came out of that. And actually the offer that Russia made, I don’t think it was actually seriously enough considered. But that’s probably another topic for another day. Suffice to say that at the start of February, I was at 88% chance that Russia was going to invade Ukraine. There were slight downs, but by February 12 I was 90-plus. And then from there, I think I hit 99-100% on February 20 or 21.

Rob Wiblin: And they invaded on the 24th, right?

Clay Graubard: Yes. Correct.

Rob Wiblin: So you were at 99% a few days before they did it?

Clay Graubard: Yeah. And I was struggling for quite a few days. We’re talking about getting rid of one world. Or, if we can say it’s 1,000 worlds, we’re getting rid of 10, but you’re making that very fine judgment. And that was really difficult. I don’t even know if it was worth the time, right? Is it worth 40 hours to go from 96 to 99? I don’t know. I did it. So yeah. But that was sort of the path.

The performance of the Russian military

Robert de Neufville: I think virtually everyone has been surprised that Russia hasn’t done better, but I didn’t necessarily expect Russia to be world-beaters. I guess I would’ve said that I thought their military was a little bit overrated, but also I think people underestimated how capable the Ukrainian military was. This is a large army, they’ve built it up. They’ve been modernizing it since the last Russian invasion, if not before. And they are a veteran group that has been fighting peer forces on their own territory for a long time.

Robert de Neufville: Probably in hindsight, I would’ve also thought more about morale. You have Russian soldiers who don’t even know they’re going to war initially, much less really have a conviction that it’s a really important thing. And on the other side, you have Ukrainians who are fighting to defend their home. And I think that is very important.

Robert de Neufville: So I thought Ukraine would hold out longer than some people did, but I didn’t see this coming. The US government supposedly estimated that Kyiv was going to fall within four days, and that was one of the longer amounts of times that people had estimated. I don’t know if I thought that was likely, because urban warfare is difficult and Ukraine would’ve had to really collapse, but I did not think that Kyiv would hold out indefinitely. And at this point it doesn’t seem like it’s about to fall at all. So that surprised me. I think we could have figured that out a little bit better for sure. But the experts whose information I was looking at for this mostly had the consensus that Russia would perform better, and I didn’t have a lot of outside knowledge to evaluate that claim, so I mostly believed it.

Clay Graubard: I think for me, I’m less surprised about how the Russian military has performed and rather the way in which Ukraine has been supported by the West. I think even the initial sanctions were more than what people were expecting, and they’ve only increased. Or whether it comes to the amount of military aid — whether that’s bullets, whether that’s anti-rockets, whether that’s now increasingly heavy equipment — or I think also very central is intelligence, right? We knew on day one that they were sharing signals intelligence and satellite intelligence with Ukraine. But there’s also been reports that there are US special forces on the ground giving intelligence, with the reports now that the US has been sharing intelligence to get high-value targets. That’s not something that you can just do if you don’t have anyone on the ground.

Clay Graubard: So you have that. You also have the US military doing wargames on how to defend against attack on Kyiv and really sort of working with the Ukrainian military. And I don’t think I had properly appreciated, first of all, to what extent the US military and government is going to be involved in the war in Ukraine; and then also, what is that impact going to be? Because the US is very good at figuring out logistics, is very good at getting intelligence and carrying out strikes. And I don’t think I factored that in.

Clay Graubard: On the other hand, I think the question of Russia has performed poorly: that’s relative to what? It’s been bad, but how bad has it been? It took NATO one month to get Baghdad, but Iraq’s half the size of Ukraine. It has a much smaller population and it’s much more concentrated. I think like 20-something percent of the population in Iraq lives in Baghdad, and 9% live in Kyiv. So it’s much more spread out, much more targets, much better military. In one day in Iraq, NATO launched 1,600 aircraft and fired 500 cruise missiles. It took Russia 10, 12 days to fire 500 cruise missiles. And if my memory of open source intelligence is correct, Russia only brought 300 aircraft to the border of Ukraine.

Clay Graubard: So it’s a very different military operation. And they’ve made a lot of gains in the south and the east. Yes, they’ve pulled back from the north, but they weren’t routed in the sense that on the way out, the convoys were just taken away. They mined a lot of the areas. So in the middle of a war, it’s very difficult to really get a sense on how these militaries are performing. And when I think the information is like, we only really see the Ukrainian POV of the war, which is very different than the lead-up, right? Before the war we only saw Russian military equipment moving to the border, training exercises that they were doing with Belarus. We saw nothing about how Ukraine was mobilizing or what they were doing. And now that the war has started, we see all of the Ukrainian victories, but we get very little about how the Russian military is performing.

Clay's take on the risk of use of nuclear weapons

Clay Graubard: I think the first way I try to think about it is like, what is the interesting forecasting question here? On a lot of platforms it’s like, “Will a nuclear weapon device be detonated by August 27?” or “Will it be done by 2023 or 2024?” I think that’s a very interesting tactical forecasting question — though I think that relies on a lot of inside information about how processes are working, under what timescale do things of escalation matter.

Clay Graubard: And when I try to think of what is the best way to use my time forecasting, especially something that’s dependent on so many future steps… Famous last words, but sub-1% we’re just going to go from today to Russia launching a nuclear weapon at Kyiv or London or anything like that. There’s a lot of steps along that way, and when those could possibly happen. So we get into this realm of system effects, where I just don’t think it’s really within the reasonable realm of the forecasting to really figure out that time component. It makes it really difficult.

Clay Graubard: So for me, I view it as, “What is the likelihood that Russia uses a nuclear weapon before either a peace deal gets signed or Kyiv falls?” — so this conflict turns into an insurgency or it comes to an end. That was the original way I looked at it. Given what’s happened on the battlefield, maybe it should be thought of a little bit differently. And sort of approaching that is, “What are the escalation ladders that lead to nuclear weapons being used?” One way that I thought of it happening is it starts off with Russia conducting an atmospheric nuclear test that doesn’t have any sort of direct impact.

Clay Graubard: So thinking through those various scenarios. I was talking with Michał Dubrawski about how long is this conflict going to last, barring a sudden collapse of Russia or a significant threat to the sovereignty of the Ukrainian government — whether that’s civil fighting between the different factions within society, which were very fractured before the invasion, or some sort of saboteurs, or Zelenskyy gets assassinated. Barring a rapid collapse, I could see this conflict easily lasting a full year or longer.

Clay Graubard: Under that timeframe, I would probably put my number at 12.5% right now. I could be persuaded down to 8%. I could see myself going higher as well. It’s a very uncomfortable question, but there are a lot of avenues that lead to it. And in our initial set of forecasts on Russia-Ukraine, I sort of ended it really panicked about how we were just nonchalantly escalating the conflict. I do think the pace at which the West has escalated has cooled off a bit, but the general direction is still of greater Western involvement.

Clay Graubard: There’s been more talk now about Russia making moves in Transnistria, which would get Moldova involved. I am, for instance, worried about China and Taiwan and a whole other set of nuclear flashpoints that exist. Given the war that’s going on right now, if something were to happen between Iran and Israel, the battle lines that would be drawn out would be in part driven by Russia-Ukraine. So is it part of that conflict then too?

Rob Wiblin: So just to clarify, when you’re saying 8% to 12%, most of that is the use of a tactical nuclear weapon in the war in Ukraine, I’m imagining. Right?

Clay Graubard: At least happening first.

Rob Wiblin: Right. So that’s how it starts.

Clay Graubard: That starts using it there. It’s more likely to have the nuclear weapon be used in Ukraine, keep the conflict in Ukraine, and sort of force the West to escalate and bring it into a larger conflict than having their first nuclear strike be in Poland or something.

Rob Wiblin: Yeah.

Clay Graubard: Or using a strategic nuclear weapon on London. Because you have to realize none of these leaders want a full-out nuclear war. They live in this world, their kids live in this world. We can’t rule out that they want to do that. They very much don’t want to get to that place. Whereas tactical nuclear weapons is different. This is something that the US has spent a lot of money on in the ’50s and ’60s, something that Russia has kept part of their military doctrine since. So yeah, more on the tactical side. But of course, Russia uses a tactical nuclear weapon and then that’s used as a basis for a no-fly zone, and then you could easily see how that could lead to larger nuclear conflict as well.

Robert's take on the risk of use of nuclear weapons

Robert de Neufville: So before I was writing my forecasting, I worked for years at the Global Catastrophic Risk Institute, and one of the most plausible paths to a global catastrophe is a nuclear exchange. So we did some research on this. I looked particularly at close call incidents, just a variety of nuclear incidents. Some of them were pretty harrowing, but I actually came out of it thinking a lot of them were not really that close calls. But there’s a lot of potentially escalatory moments that you wouldn’t necessarily think of. For example, at one point during a war, Israel mistook a US research ship in the Mediterranean for an Egyptian destroyer and attacked it. They attacked a US ship, and the United States scrambled its available fighters without realizing initially that they were nuclear-armed fighters. So it scrambled nuclear-armed fighters to defend a US ship against Israel. I mean, this is a crazy kind of story. And they figured it out pretty quickly. They recalled the fighters. They talked to Israel and everything. But there are a bunch of these things where there were nuclear weapons in play in various places that you wouldn’t even think was a likely risk scenario.

Robert de Neufville: So a lot of my concern when the war started was that there are moments for these kinds of weird escalation mistakes, nuclear weapons in the wrong place. And this just creates so many more opportunities for things to go wrong — even though, as Clay says, nobody wants this. Russians love their children too. We all live in the same world. Nobody wants it destroyed. Well, maybe not nobody, but there aren’t a lot of death cults that have nuclear weapons. So just in general, I think this kind of friction with the US and Russia in close proximity on opposite sides raises the chances, to me, uncomfortably high.

Robert de Neufville: I also agree that the most likely scenario is some kind of tactical nuclear use in Ukraine. I initially thought that was reasonably likely. Now, having seen the way the war is going and having learned a little bit more about Russian nuclear policy, I think it would be pretty difficult even for Putin to actually make this happen. The danger, I suppose, is that if Russia is losing in a certain way, they might want to change the terms of the conflict or make a demonstration, change the scope. It doesn’t seem like a very good idea, but if you’re already losing, maybe you try a different way of losing, even though it might not be that rational.

Robert de Neufville: So that was my concern. As I say, I now think that is less likely than I initially thought. But there’s also a lot of pressure to escalate. We have already escalated in a number of ways — the way we’re supplying weapons and materiel to Ukraine — and NATO governments appear to be under a lot of public pressure to escalate more.

Robert de Neufville: I think that the Biden administration and most NATO leaders are pretty clear on not wanting to do a no-fly zone and other things that would potentially risk escalations, and the militaries are very aware of this kind of thing. But I worry if Russia commits some kind of atrocity or appears to have committed some kind of atrocity — I mean, they’ve already committed some atrocities, I think — but if there’s a chemical weapons attack, a clear chemical weapons attack, and there are a lot of bodies on TV, can NATO politicians resist this pressure? The Democrats in the US may be just creamed in the upcoming midterms. At some point, is there some pressure to look tough in the US Congress? These are kinds of things that potentially are riskier. And nobody wants to get to a nuclear war, but I think the small escalatory steps are a little bit scary and they increase the chances.

Robert de Neufville: I don’t know what timeframe we’re talking about 8% to 12% chance of nuclear use, but I do think there is some chance of a tactical nuclear use, maybe 1% or something. I don’t want to rule out too much because of the unknown unknowns we talked about. And the chance of an escalation today between Russia and US and NATO is a lot higher than it was last year because of this conflict. I don’t know exactly how high — there are some estimates and we could talk about that — but uncomfortably high. You really would prefer it to be lower.

Forecasters vs. subject matter experts

Clay Graubard: So I’ll just start off, first of all, and say that the attack on hedgehogs I think has been oversold. I think that hedgehogs have a lot of value, especially when you put in as much time as I might have done in a forecast, that actually having that theory background was helpful. I would say my background in international relations was helpful in doing the forecast. And definitely having an expert. It’s either I can go ahead and read journal articles and go do hours’ worth of research on a topic, or I can ask someone who’s done that for a job — or they’re the author of the paper, and instead of figuring out what they said, I could hop onto a 30-minute call and get probably way more information than I could just from reading their words on a sheet of paper.

Clay Graubard: So I think domain experts could make good forecasting. A lot of it is the process you take to making your forecast, right? Obviously having a forecast that is heavily rooted just in theory and in a very rigid worldview will do poorly. That is what Expert Political Judgment said. But there was also a recent post on the EA Forum about how you can train domain experts to become good forecasters and they can reach Superforecaster levels of Brier score. It’s like, I want this piece of information. Now either I can go research it, or I can find someone that has that context. Then they can say, “Well, you’ve thought about this. Because of that you also want to consider this, because in this historical analogue that was relevant, or you’re just missing this piece of the puzzle.”

Robert de Neufville: I agree. Domain expertise doesn’t make you a bad forecaster. There are domain experts that are good at forecasting. There’s some risk that if you really are attached to a certain perspective, it’s going to be hard, but we have political scientists who are great at forecasting politics. I think someone who really has a single framework is the kind of hedgehog that maybe doesn’t do as well. I would like to have more discussion with people with expertise that I don’t have. I think there’s a project looking at existential risks, trying to forecast that, in which they’re going to be pairing domain experts with forecasters. I think that’s a good idea.

Robert de Neufville: I often wish I could ask specific questions of experts that I don’t know the answer to. I might be able to do the research, but maybe I wouldn’t get the right answer, and it would probably take me a long time. For the nuclear weapons use thing, I wanted to know, how does it work? That’s not a thing I knew; I wanted to talk to someone. That’s something we can kind of do. We can just play journalist or call a colleague and ask them that question, but there are specific questions. The other value of people that have expertise, that you may not have as a generalist forecaster, is that I want them to look at my rationale and say, “It looks like you missed something. Here’s a thing that you didn’t think of that everyone in my field knows, and you missed it and that might inform you.”

Robert de Neufville: I’m often not very interested in the probability estimates of domain experts. Not because they’re necessarily bad forecasters, just that most of them are. There are some that are good, but just because you know something doesn’t mean you’re good at estimating the likelihood that things will happen — it’s kind of a different skill. So I usually want answers to specific questions, and sort of sanity checks of my work. I think there was a response to the Samotsvety forecast by Peter Scoblic that was useful. I don’t know if I thought all of his probability ideas were great, but the information in there was a productive dialogue, I thought. So I do think that there are conversations to be had.

Rob Wiblin: He’s an expert on nuclear weapons in particular, and he was looking at this forecasting effort and saying, “Here’s what I would say, as someone who knows a bunch about this specific area?”

Robert de Neufville: Right. And I thought that not all of his insights were useful necessarily, but I thought the dialogue was productive. In general, I think that the more forecasters can connect with expert knowledge, the better the forecasts are going to be.

About the show

80k After Hours is a podcast by the team that brings you The 80,000 Hours Podcast. Like that show, it mostly still explores the best ways to do good — and some episodes are even more laser-focused on careers than most original episodes. But we also widen our scope, including things like how to solve pressing problems while also living a happy and fulfilling life, as well as releases that are just fun, entertaining, or experimental. Get in touch with feedback or suggestions by emailing [email protected].
