By around January 1, I was 62% likelihood of an invasion. It rose quite precipitously as the steps that Russia was taking were escalating. And then I decreased over the month of January, as you saw substantive Russia-US talks about European security…

At the start of February, I was at 88%. There were slight downs, but by February 12 I was 90-plus. And then from there, I think I hit 99-100% on February 20 or 21.

Clay Graubard

In this episode of 80k After Hours, Rob Wiblin interviews Clay Graubard and Robert de Neufville about forecasting the war between Russia and Ukraine.

They cover:

  • Their early predictions for the war
  • The performance of the Russian military
  • The risk of use of nuclear weapons
  • The most interesting remaining topics on Russia and Ukraine
  • General lessons we can take from the war
  • The evolution of the forecasting space
  • What Robert and Clay were reading back in February
  • Forecasters vs. subject matter experts
  • Ways to get involved with the forecasting community
  • Impressive past predictions
  • And more

Who this episode is for:

  • People interested in forecasting
  • People interested in the war in Ukraine
  • People who prefer to know how likely they are to die in a nuclear war

Who this episode isn’t for:

  • People who’d hate it if a friend said they were 65% likely to come out for drinks
  • People who’d prefer if their death from nuclear war was a total surprise

Get this episode by subscribing to our more experimental podcast on the world’s most pressing problems and how to solve them: type ‘80k After Hours’ into your podcasting app. Or read the transcript below.

Producer: Keiran Harris
Audio mastering: Ben Cordell
Transcriptions: Katy Moore

“Gershwin – Rhapsody in Blue, original 1924 version” by Jason Weinberger is licensed under Creative Commons

Highlights

Early predictions for the Russian invasion of Ukraine

Robert de Neufville: Back in January, I didn’t make good predictions about this. I wasn’t following it very closely. And I think Atief asked me on our podcast — this is one of the guys I podcast with — if I thought there was going to be an invasion. And my answer was basically, “I don’t think it makes sense. So probably not.” And I think that’s a lot of reasons why people were wrong about it, because they looked at it, even good forecasters I know, and said, “It doesn’t make a lot of sense to do this. So probably not.” I think that’s kind of a forecasting error. Sometimes people in intelligence talk about mirror imaging, which is to assume that the other people whose behavior you are trying to forecast think and see things the way you do. I think a lot of people did that with Putin and with Russia in this case.

Robert de Neufville: I also want to push back a little bit, because I think that kind of thinking can be useful, right? You do go around making basic assumptions that people are trying to do things that are smart, and playing a certain game, and so on. And it’d be really hard to forecast anything if you imagine people were just capable of doing random things at any given time.

Robert de Neufville: But I think a lot of good forecasters looked at it and said, “This is a strategic blunder. This is counterproductive. It’s not that likely.” And that was my initial reaction. Later, Russia kind of started to invade even before they crossed the lines. They were doing all the things you would want to do if you were really seriously going to invade. I guess that’s how you want to do a bluff too, but it was a really good bluff if that’s what it was. They were getting blood supplies and changing conscript rules and all sorts of little minor details.

Robert de Neufville: At some point I think all that stuff happening made me recognize this is really probably going to happen, even though I still had some doubts, which I’ve talked to Clay about. There was also an argument that Putin has been indicating, tipping his hands and talking about his ideology for years, and I think people haven’t been taking him very seriously. I didn’t follow that very closely initially, but in February I started to look at that and see this is actually something that he may really do. But I wish I could say I had done a better job earlier. Clay was the one who was on that.

Rob Wiblin: Yeah. Clay, how did you do?

Clay Graubard: By around January 1, I was 62% likelihood of an invasion. At the beginning of the month it rose quite precipitously as the invasion and the steps that Russia was taking were escalating. And then I decreased over the month of January, as you saw substantive Russia-US talks about European security, as well as an effort to revive the Normandy Format and have France and Germany play a role in getting Minsk II across the line with Zelenskyy and Putin.

Clay Graubard: By February those prospects had died. I was actually quite surprised at the degree to which the US actually took Russia’s concerns initially seriously, but nothing really ever came out of that. And actually the offer that Russia made, I don’t think it was actually seriously enough considered. But that’s probably another topic for another day. Suffice to say that at the start of February, I was at 88% chance that Russia was going to invade Ukraine. There were slight downs, but by February 12 I was 90-plus. And then from there, I think I hit 99-100% on February 20 or 21.

Rob Wiblin: And they invaded on the 24th, right?

Clay Graubard: Yes. Correct.

Rob Wiblin: So you were at 99% a few days before they did it?

Clay Graubard: Yeah. And I was struggling for quite a few days. We’re talking about getting rid of one world. Or, if we can say it’s 1,000 worlds, we’re getting rid of 10, but you’re making that very fine judgment. And that was really difficult. I don’t even know if it was worth the time, right? Is it worth 40 hours to go from 96 to 99? I don’t know. I did it. So yeah. But that was sort of the path.
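
Clay's question about whether 40 hours is worth the move from 96% to 99% can be put in numbers. The sketch below is an illustration only: it assumes the forecast is scored with the Brier score or the log score, neither of which is named in this exchange, and the function names are ours. It compares what each forecast earns if the invasion happens and what it costs if it does not.

```python
import math


def brier(p: float, outcome: int) -> float:
    """Squared error between the forecast probability p and the 0/1 outcome."""
    return (p - outcome) ** 2


def log_loss(p: float, outcome: int) -> float:
    """Negative log of the probability assigned to what actually happened."""
    return -math.log(p if outcome == 1 else 1.0 - p)


# Compare the two forecasts Clay mentions under both possible outcomes.
for p in (0.96, 0.99):
    print(
        f"p={p:.2f} | invasion: Brier={brier(p, 1):.4f}, log={log_loss(p, 1):.4f}"
        f" | no invasion: Brier={brier(p, 0):.4f}, log={log_loss(p, 0):.4f}"
    )
```

In Clay's terms, 99% leaves 10 of 1,000 worlds without an invasion, where 96% leaves 40. If the invasion happens, the Brier penalty falls from 0.0016 to 0.0001, a small absolute gain. If it does not, the 99% forecast is penalized more heavily than the 96% one, which is why the last few points call for such a fine judgment.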

The performance of the Russian military

Robert de Neufville: I think virtually everyone has been surprised that Russia hasn’t done better, but I didn’t necessarily expect Russia to be world-beaters. I guess I would’ve said that I thought their military was a little bit overrated, but also I think people underestimated how capable the Ukrainian military was. This is a large army, they’ve built it up. They’ve been modernizing it since the last Russian invasion, if not before. And they are a veteran group that has been fighting peer forces on their own territory for a long time.

Robert de Neufville: Probably in hindsight, I would’ve also thought more about morale. You have Russian soldiers who don’t even know they’re going to war initially, much less really have a conviction that it’s a really important thing. And on the other side, you have Ukrainians who are fighting to defend their home. And I think that is very important.

Robert de Neufville: So I thought Ukraine would hold out longer than some people did, but I didn’t see this coming. The US government supposedly estimated that Kyiv was going to fall within four days, and that was one of the longer amounts of times that people had estimated. I don’t know if I thought that was likely, because urban warfare is difficult and Ukraine would’ve had to really collapse, but I did not think that Kyiv would hold out indefinitely. And at this point it doesn’t seem like it’s about to fall at all. So that surprised me. I think we could have figured that out a little bit better for sure. But the experts whose information I was looking at for this mostly had the consensus that Russia would perform better, and I didn’t have a lot of outside knowledge to evaluate that claim, so I mostly believed it.

Clay Graubard: I think for me, I’m less surprised about how the Russian military has performed and rather the way in which Ukraine has been supported by the West. I think even the initial sanctions were more than what people were expecting, and they’ve only increased. Or whether it comes to the amount of military aid — whether that’s bullets, whether that’s anti-rockets, whether that’s now increasingly heavy equipment — or I think also very central is intelligence, right? We knew on day one that they were sharing signals intelligence and satellite intelligence with Ukraine. But there’s also been reports that there are US special forces on the ground giving intelligence, with the reports now that the US has been sharing intelligence to get high-value targets. That’s not something that you can just do if you don’t have anyone on the ground.

Clay Graubard: So you have that. You also have the US military doing wargames on how to defend against attack on Kyiv and really sort of working with the Ukrainian military. And I don’t think I had properly appreciated, first of all, to what extent the US military and government is going to be involved in the war in Ukraine; and then also, what is that impact going to be? Because the US is very good at figuring out logistics, is very good at getting intelligence and carrying out strikes. And I don’t think I factored that in.

Clay Graubard: On the other hand, I think the question of Russia has performed poorly: that’s relative to what? It’s been bad, but how bad has it been? It took NATO one month to get Baghdad, but Iraq’s half the size of Ukraine. It has a much smaller population and it’s much more concentrated. I think like 20-something percent of the population in Iraq lives in Baghdad, and 9% live in Kyiv. So it’s much more spread out, much more targets, much better military. In one day in Iraq, NATO launched 1,600 aircraft and fired 500 cruise missiles. It took Russia 10, 12 days to fire 500 cruise missiles. And if my memory of open source intelligence is correct, Russia only brought 300 aircraft to the border of Ukraine.

Clay Graubard: So it’s a very different military operation. And they’ve made a lot of gains in the south and the east. Yes, they’ve pulled back from the north, but they weren’t routed in the sense that on the way out, the convoys were just taken away. They mined a lot of the areas. So in the middle of a war, it’s very difficult to really get a sense on how these militaries are performing. And when I think the information is like, we only really see the Ukrainian POV of the war, which is very different than the lead-up, right? Before the war we only saw Russian military equipment moving to the border, training exercises that they were doing with Belarus. We saw nothing about how Ukraine was mobilizing or what they were doing. And now that the war has started, we see all of the Ukrainian victories, but we get very little about how the Russian military is performing.

Clay’s take on the risk of use of nuclear weapons

Clay Graubard: I think the first way I try to think about it is like, what is the interesting forecasting question here? On a lot of platforms it’s like, “Will a nuclear weapon device be detonated by August 27?” or “Will it be done by 2023 or 2024?” I think that’s a very interesting tactical forecasting question — though I think that relies on a lot of inside information about how processes are working, under what timescale do things of escalation matter.

Clay Graubard: And when I try to think of what is the best way to use my time forecasting, especially something that’s dependent on so many future steps… Famous last words, but sub-1% we’re just going to go from today to Russia launching a nuclear weapon at Kyiv or London or anything like that. There’s a lot of steps along that way, and when those could possibly happen. So we get into this realm of system effects, where I just don’t think it’s really within the reasonable realm of the forecasting to really figure out that time component. It makes it really difficult.

Clay Graubard: So for me, I view it as, “What is the likelihood that Russia uses a nuclear weapon before either a peace deal gets signed or Kyiv falls?” — so this conflict turns into an insurgency or it comes to an end. That was the original way I looked at it. Given what’s happened on the battlefield, maybe it should be thought of a little bit differently. And sort of approaching that is, “What are the escalation ladders that lead to nuclear weapons being used?” One way that I thought of it happening is it starts off with Russia conducting an atmospheric nuclear test that doesn’t have any sort of direct impact.

Clay Graubard: So thinking through those various scenarios. I was talking with Michał Dubrawski about how long is this conflict going to last, barring a sudden collapse of Russia or a significant threat to the sovereignty of the Ukrainian government — whether that’s civil fighting between the different factions within society, which were very fractured before the invasion, or some sort of saboteurs, or Zelenskyy gets assassinated. Barring a rapid collapse, I could see this conflict easily lasting a full year or longer.

Clay Graubard: Under that timeframe, I would probably put my number at 12.5% right now. I could be persuaded down to 8%. I could see myself going higher as well. It’s a very uncomfortable question, but there are a lot of avenues that lead to it. And in our initial set of forecasts on Russia-Ukraine, I sort of ended it really panicked about how we were just nonchalantly escalating the conflict. I do think the pace at which the West has escalated has cooled off a bit, but the general direction is still of greater Western involvement.

Clay Graubard: There’s been more talk now about Russia making moves in Transnistria, which would get Moldova involved. I am, for instance, worried about China and Taiwan and a whole other set of nuclear flashpoints that exist. Given the war that’s going on right now, if something were to happen between Iran and Israel, the battle lines that would be drawn out would be in part driven by Russia-Ukraine. So is it part of that conflict then too?

Rob Wiblin: So just to clarify, when you’re saying 8% to 12%, most of that is the use of a tactical nuclear weapon in the war in Ukraine, I’m imagining. Right?

Clay Graubard: At least happening first.

Rob Wiblin: Right. So that’s how it starts.

Clay Graubard: That starts using it there. It’s more likely to have the nuclear weapon be used in Ukraine, keep the conflict in Ukraine, and sort of force the West to escalate and bring it into a larger conflict than having their first nuclear strike be in Poland or something.

Rob Wiblin: Yeah.

Clay Graubard: Or using a strategic nuclear weapon on London. Because you have to realize none of these leaders want a full-out nuclear war. They live in this world, their kids live in this world. We can’t rule out that they want to do that. They very much don’t want to get to that place. Whereas tactical nuclear weapons is different. This is something that the US has spent a lot of money on in the ’50s and ’60s, something that Russia has kept part of their military doctrine since. So yeah, more on the tactical side. But of course, Russia uses a tactical nuclear weapon and then that’s used as a basis for a no-fly zone, and then you could easily see how that could lead to larger nuclear conflict as well.

Robert’s take on the risk of use of nuclear weapons

Robert de Neufville: So before I was writing my forecasting, I worked for years at the Global Catastrophic Risk Institute, and one of the most plausible paths to a global catastrophe is a nuclear exchange. So we did some research on this. I looked particularly at close call incidents, just a variety of nuclear incidents. Some of them were pretty harrowing, but I actually came out of it thinking a lot of them were not really that close calls. But there’s a lot of potentially escalatory moments that you wouldn’t necessarily think of. For example, at one point during a war, Israel mistook a US research ship in the Mediterranean for an Egyptian destroyer and attacked it. They attacked a US ship, and the United States scrambled its available fighters without realizing initially that they were nuclear-armed fighters. So it scrambled nuclear-armed fighters to defend a US ship against Israel. I mean, this is a crazy kind of story. And they figured it out pretty quickly. They recalled the fighters. They talked to Israel and everything. But there are a bunch of these things where there were nuclear weapons in play in various places that you wouldn’t even think was a likely risk scenario.

Robert de Neufville: So a lot of my concern when the war started was that there are moments for these kinds of weird escalation mistakes, nuclear weapons in the wrong place. And this just creates so many more opportunities for things to go wrong — even though, as Clay says, nobody wants this. Russians love their children too. We all live in the same world. Nobody wants it destroyed. Well, maybe not nobody, but there aren’t a lot of death cults that have nuclear weapons. So just in general, I think this kind of friction with the US and Russia in close proximity on opposite sides raises the chances, to me, uncomfortably high.

Robert de Neufville: I also agree that the most likely scenario is some kind of tactical nuclear use in Ukraine. I initially thought that was reasonably likely. Now, having seen the way the war is going and having learned a little bit more about Russian nuclear policy, I think it would be pretty difficult even for Putin to actually make this happen. The danger, I suppose, is that if Russia is losing in a certain way, they might want to change the terms of the conflict or make a demonstration, change the scope. It doesn’t seem like a very good idea, but if you’re already losing, maybe you try a different way of losing, even though it might not be that rational.

Robert de Neufville: So that was my concern. As I say, I now think that is less likely than I initially thought. But there’s also a lot of pressure to escalate. We have already escalated in a number of ways — the way we’re supplying weapons and materiel to Ukraine — and NATO governments appear to be under a lot of public pressure to escalate more.

Robert de Neufville: I think that the Biden administration and most NATO leaders are pretty clear on not wanting to do a no-fly zone and other things that would potentially risk escalations, and the militaries are very aware of this kind of thing. But I worry if Russia commits some kind of atrocity or appears to have committed some kind of atrocity — I mean, they’ve already committed some atrocities, I think — but if there’s a chemical weapons attack, a clear chemical weapons attack, and there are a lot of bodies on TV, can NATO politicians resist this pressure? The Democrats in the US may be just creamed in the upcoming midterms. At some point, is there some pressure to look tough in the US Congress? These are kinds of things that potentially are riskier. And nobody wants to get to a nuclear war, but I think the small escalatory steps are a little bit scary and they increase the chances.

Robert de Neufville: I don’t know what timeframe we’re talking about 8% to 12% chance of nuclear use, but I do think there is some chance of a tactical nuclear use, maybe 1% or something. I don’t want to rule out too much because of the unknown unknowns we talked about. And the chance of an escalation today between Russia and US and NATO is a lot higher than it was last year because of this conflict. I don’t know exactly how high — there are some estimates and we could talk about that — but uncomfortably high. You really would prefer it to be lower.

Forecasters vs. subject matter experts

Clay Graubard: So I’ll just start off, first of all, and say that the attack on hedgehogs I think has been oversold. I think that hedgehogs have a lot of value, especially when you put in as much time as I might have done in a forecast, that actually having that theory background was helpful. I would say my background in international relations was helpful in doing the forecast. And definitely having an expert. It’s either I can go ahead and read journal articles and go do hours’ worth of research on a topic, or I can ask someone who’s done that for a job — or they’re the author of the paper, and instead of figuring out what they said, I could hop onto a 30-minute call and get probably way more information than I could just from reading their words on a sheet of paper.

Clay Graubard: So I think domain experts could make good forecasting. A lot of it is the process you take to making your forecast, right? Obviously having a forecast that is heavily rooted just in theory and in a very rigid worldview will do poorly. That is what Expert Political Judgment said. But there was also a recent post on the EA Forum about how you can train domain experts to become good forecasters and they can reach Superforecaster levels of Brier score. It’s like, I want this piece of information. Now either I can go research it, or I can find someone that has that context. Then they can say, “Well, you’ve thought about this. Because of that you also want to consider this, because in this historical analogue that was relevant, or you’re just missing this piece of the puzzle.”

Robert de Neufville: I agree. Domain expertise doesn’t make you a bad forecaster. There are domain experts that are good at forecasting. There’s some risk that if you really are attached to a certain perspective, it’s going to be hard, but we have political scientists who are great at forecasting politics. I think someone who really has a single framework is the kind of hedgehog that maybe doesn’t do as well. I would like to have more discussion with people with expertise that I don’t have. I think there’s a project looking at existential risks, trying to forecast that, in which they’re going to be pairing domain experts with forecasters. I think that’s a good idea.

Robert de Neufville: I often wish I could ask specific questions of experts that I don’t know the answer to. I might be able to do the research, but maybe I wouldn’t get the right answer, and it would probably take me a long time. For the nuclear weapons use thing, I wanted to know, how does it work? That’s not a thing I knew; I wanted to talk to someone. That’s something we can kind of do. We can just play journalist or call a colleague and ask them that question, but there are specific questions. The other value of people that have expertise, that you may not have as a generalist forecaster, is that I want them to look at my rationale and say, “It looks like you missed something. Here’s a thing that you didn’t think of that everyone in my field knows, and you missed it and that might inform you.”

Robert de Neufville: I’m often not very interested in the probability estimates of domain experts. Not because they’re necessarily bad forecasters, just that most of them are. There are some that are good, but just because you know something doesn’t mean you’re good at estimating the likelihood that things will happen — it’s kind of a different skill. So I usually want answers to specific questions, and sort of sanity checks of my work. I think there was a response to the Samotsvety forecast by Peter Scoblic that was useful. I don’t know if I thought all of his probability ideas were great, but the information in there was a productive dialogue, I thought. So I do think that there are conversations to be had.

Rob Wiblin: He’s an expert on nuclear weapons in particular, and he was looking at this forecasting effort and saying, “Here’s what I would say, as someone who knows a bunch about this specific area”?

Robert de Neufville: Right. And I thought that not all of his insights were useful necessarily, but I thought the dialogue was productive. In general, I think that the more forecasters can connect with expert knowledge, the better the forecasts are going to be.
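
For readers unfamiliar with the Brier score Clay mentions, here is a minimal Python sketch that applies it to the invasion probabilities he quotes earlier in these highlights. It rests on several assumptions of ours: "90-plus" is entered as 0.90 (a lower bound), the four snapshots are weighted equally, and the coin-flip forecaster is a hypothetical baseline. Forecasting platforms typically average over every day a question is open, so this is an illustration rather than Clay's actual score.

```python
# Clay's stated probabilities of a Russian invasion, as quoted in the episode.
# "90-plus" is entered as 0.90, a lower bound (our assumption).
clay_forecasts = {
    "Jan 1": 0.62,
    "Feb 1": 0.88,
    "Feb 12": 0.90,
    "Feb 20-21": 0.99,
}
OUTCOME = 1  # Russia invaded on February 24, so the event resolved "yes"


def brier(p, outcome):
    """Squared error between a probability and the 0/1 outcome (0 is perfect)."""
    return (p - outcome) ** 2


def mean_brier(probs, outcome):
    """Average Brier score over a series of forecasts on the same question."""
    probs = list(probs)
    return sum(brier(p, outcome) for p in probs) / len(probs)


clay_score = mean_brier(clay_forecasts.values(), OUTCOME)
# Hypothetical baseline: someone who answered 50% at every snapshot.
coin_flip_score = mean_brier([0.5] * len(clay_forecasts), OUTCOME)

print(f"Clay (four snapshots): {clay_score:.4f}")
print(f"Always 50%:            {coin_flip_score:.4f}")
```

Lower is better: under these assumptions the four snapshots average roughly 0.042, against 0.25 for the always-50% baseline.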

Transcript

Keiran’s intro [00:00:00]

Keiran Harris: Welcome to 80k After Hours. I’m Keiran Harris — producer of the show, and owner of the biggest beanie-baby collection in the world.

Today’s episode is the result of an experiment, where we wanted to test the question: “Can Rob Wiblin record a decent episode without prepping much at all?”

Turns out he can — though most of that credit should go to our two guests, Robert and Clay, who didn’t really need Rob to steer the conversation in interesting directions.

It’s all about forecasting, and specifically forecasting the war between Russia and Ukraine.

They get into:

  • Their early predictions for the war
  • The performance of the Russian military
  • The risk of use of nuclear weapons
  • The evolution of the forecasting space
  • And much more

They don’t pause to define common forecasting terms — one of the benefits of this feed is that we can be more comfortable with assuming prior knowledge — but as long as you have some interest in the war in Ukraine, or in the risk of nuclear war, I think you’ll probably find it engaging.

And this episode was recorded on the 27th of April, 2022.

Alright, here’s Rob, Robert, and Clay.

Current work for Robert and Clay [00:01:11]

Rob Wiblin: Today, I’m joined by Clay Graubard and Robert de Neufville, two people who are heavily involved in the Superforecasting scene and who have been tracking and writing a lot about the Russian invasion of Ukraine lately. We wanted to have a chat to learn about how they are thinking about the invasion of Ukraine, how they make predictions in general, and also learn a bit more about what is happening in the forecasting scene — which, from what I’m reading on Twitter, seems to be having a bit of a renaissance at the moment. So thanks for joining, both of you.

Robert de Neufville: Thank you.

Clay Graubard: Thanks. Great to be here.

Rob Wiblin: Maybe let’s just first do some quick introductions. Rob, what are you working on at the moment? What hats do you wear?

Robert de Neufville: I am mostly wearing one single hat right now. I recently got a grant from the EA Infrastructure Fund to write a Substack making forecasts about forecasting research. That’s pretty exciting. I’ve just started that. I was going to launch a little bit later than I did, but it looked like there was going to be a war in Ukraine, so I was like, “If I’m going to forecast this question, I’d better get a forecast out soon.” So I’ve been doing that, still trying to figure it out, but that takes most of my time. I also do still forecast for Good Judgment Inc., which is the company that spun off of the original Good Judgment Project in the IARPA tournament.

Rob Wiblin: Yeah. How did you get into this whole line of work?

Robert de Neufville: Well, it’s funny because it wasn’t really a line of work until recently. There has been a kind of renaissance in forecasting — I think partly because there’s a lot more funding interest in it. But it was never really a job, at least the kind of judgmental forecasting that I do. As far as how I got into it, I’ve always estimated the probability of things. I think I do it kind of as a neurotic response. If I’m anxious about something, I want a realistic estimate of how likely it is to happen. I would do it when my team was playing a basketball game in the playoffs or something — I’m trying to estimate the chances they have of winning or losing. It makes me feel calmer.

Robert de Neufville: I started doing that. I won some money on some early fantasy sports site. So then when I saw they were doing this tournament — I think it was the third year I heard about it — I just joined the tournament and did quite well in my first year, and was qualified as a Superforecaster after that. And I’ve just kept doing it because it’s fun. I guess it’s fun (for me anyway), but also I find it really intellectually interesting. So now that I have an opportunity to actually do this professionally, to write about it, that’s really exciting for me.

Rob Wiblin: Yeah. Fantastic. It’s really nice that you’ve gotten a grant from the Effective Altruism Infrastructure Fund too. So your Substack is called Telling the Future, right?

Robert de Neufville: Yeah.

Rob Wiblin: tellingthefuture.substack.com. I’ve had a chance to read a bit of that today, and it’s good stuff. Listeners should go check it out.

Robert de Neufville: Thank you.

Rob Wiblin: The Good Judgment Project, that’s Tetlock’s research group or competition group?

Robert de Neufville: That’s right.

Rob Wiblin: That’s back from their IARPA research, right? When they were doing work for the intelligence community to figure out how they could predict things better?

Robert de Neufville: Yeah, that’s right. IARPA. Essentially they set out to figure out if anyone could produce good forecasts that were substantially better than random, than throwing darts. Originally it was a bunch of different teams, but Tetlock’s team kind of dominated. His approach was just to find good forecasters, and there are some techniques you can use for figuring out who is a good forecaster in advance. But basically, they would find who consistently did well, and that created this whole idea of Superforecasters. Some people are capable of consistently outperforming even subject matter experts — who as a group aren’t particularly good, even intelligence analysts with access to classified information. Tetlock has done a lot of stuff, but that’s the thing he’s known for, and it came out of that research. So I participated essentially as a lab rat in that, and then have been doing it ever since.

Rob Wiblin: So Tetlock and his group came up with this term of “Superforecasters” — I think that’s a title or a credential that they give to folks who consistently perform in the top 2% of their forecasting pool. Is that right?

Robert de Neufville: Yeah. I believe that’s right. I don’t actually know how they do the standards of the credentials. They originally came out of the government-sponsored tournament. They still are qualifying people as Superforecasters through their Good Judgment Open, which is an open forecasting platform. I don’t know exactly how they do it. It is a trademark, Good Judgment. The business does quite well because they’re the only ones who are allowed to call people that. But in fact, there are a lot of good forecasters who have simply not been identified or given the credential, at Metaculus and elsewhere. So I have the formal credential, but you know, as a theoretical term. Some people are better at forecasting and have developed a skill for it.

Rob Wiblin: All right. Clay, your turn. What are you working on these days?

Clay Graubard: So unlike Robert, I’m currently wearing two hats right now. On one hand, I’m a master’s student at Oxford studying international relations, and I’m in my last year. So a lot of my time right now is going into my thesis, which looks at geopolitical forecasting and the use of international relations theory.

Clay Graubard: And then on the other side, I work with my cofounder Andrew Eaddy on baserate.io, where we publish Global Guessing, which is a website where we do geopolitical forecasting on Russia’s invasion of Ukraine, on the JCPOA — doing our own forecasts, talking to forecasters, and aggregating crowd forecasts. Which we’ve been doing a lot for Russia-Ukraine, whether it’s predictions coming from websites like Metaculus, Good Judgment, and Hypermind or aggregating signals that we find on Twitter. Then we also work on Crowd Money, which is our publication on prediction markets, which sort of takes this aspect of quantified forecasting and brings market mechanics and real money into the equation as well.

Rob Wiblin: How do you fund it? Is it part of your academic work or is it a passion project?

Clay Graubard: It started off very much as a passion project. I got into this idea of Superforecasting because I took a course by Allan Dafoe, where he said to read the introduction to Superforecasting. And I was like, “This book is fantastic; this idea is incredible.” As someone who’s interested in scientific method and theory in the social sciences, it seemed like this really great technology and method that wasn’t being used.

Clay Graubard: So I read the entire book, then I shared it with Andrew Eaddy, because this was all during lockdown. And we just started off like, “Let’s just start making our own forecasts to get better at it.” Over time, the practice of it is really fun. It’s really engaging in terms of mental capacity and the different skills that you have to use. And then over time, it has sort of morphed into fitting into my academic work and waging a semi-long campaign to convince my advisor that forecasting research is something that actually I can do at Oxford, even if the student handbook maybe says it’s not entirely allowed.

Rob Wiblin: Very cool. How much time do you get to actually spend researching specific questions that might be on one of these forecasting platforms, to try to come up with a number and then sticking it in?

Clay Graubard: So for me, I think I take a different approach to forecasting than I think probably Robert does. Because if you forecast in the Good Judgment Project, you are being asked hundreds of questions every single year, over multiple years. And so the amount of time you could spend on each question is quite limited. Personally, the way in which I like to approach questions is really spending a lot of time on the forecast.

Clay Graubard: Like for Russia-Ukraine, my initial forecast took hours. But towards the end I was spending eight, nine hours a day forecasting the question: trying to find signals, trying to think of all the different possible worlds and the different factors that would make them open up or make them close and sort of change the probability. So spending a lot of time on-task. That was at the peak.

Clay Graubard: Right now, as I’m working on my thesis, I probably spend two hours a day working on forecasting. Then I also spend a lot of time reading the academic literature. I think there’s a lot of things as a forecaster you can get from reading it, but then also just thinking about it as this academic field and this intellectual pursuit, I find very interesting. So I try to read all of that literature as well.

Rob Wiblin: I know there’s a whole bunch of different platforms where people can contribute their forecasts of the probability of different events. Metaculus is the one that I see mentioned most often, but are there other ones that people in the audience should have in mind?

Clay Graubard: If you’re interested in geopolitical questions primarily, I think Good Judgment Open does a very good job when it comes to creating a very high ratio of good questions. Metaculus has thousands of questions active at every single time and they span a very wide range. So if you’re interested in EA, you’re probably going to find a lot of other questions on Metaculus that are interesting — about meat alternatives, about AGI — about tons of different topics that you won’t get on Good Judgment Open. I also think Hypermind has very good forecasts on it. Again, this is a website focused on geopolitical questions, and I’ve noticed that the quality of information there is very good.

Clay Graubard: If you’re in the DoD defense space, INFER has a lot of very interesting questions there. It is very niche. It’s a website that’s largely funded by the US government. And then if you’re into culture stuff — crypto, more short forecasts — I think the prediction markets can be really fun to watch. And when they do have overlap with Good Judgment Open or Metaculus, it can be really interesting to see how does PredictIt view this versus Metaculus, and tracking those two movements as well. So there is a very wide range of platforms out there right now.

NonProphets and Global Guessing [00:11:30]

Rob Wiblin: Both of you also work from time to time together on this podcast called NonProphets — that’s profits with a ph. I like the title. Robert, can you tell us a little bit about what people might find on there if they subscribed and what the vision for that show was?

Robert de Neufville: Sure. First of all I should say that we do separate podcasts. We’ve been talking about doing some kind of a cross-podcast event for the two of us. So NonProphets, I think it was originally described as “a super casual podcast,” which is to say that we sometimes prepare a lot and sometimes just show up and talk about things. Sometimes that’s really good, and sometimes you could tell we’re probably just showing up and talking about things.

Robert de Neufville: But it started about five years ago. Another couple of Supers that I knew through the Good Judgment Project tournament just proposed doing it. It’s kind of like a forecasting variety hour. We talk a lot about politics. We also talk about forecasting research sometimes. We do interviews. We have been forecasting the Pantone Color of the Year for years. I don’t know if anyone cares about fashion. I don’t know if all of our listeners are that into it. But we really enjoy trying to figure out what this year’s color is going to be. And I’ve been right several times, although last year was a complete disaster.

Rob Wiblin: It’s discrediting to get that wrong.

Robert de Neufville: Well to be fair, I don’t know how many different colors you want to say there are, so that’s a big pool to choose from. And last year they went off the reservation and chose a color they hadn’t indicated, they’d given no hint of before. But we should have known they might do that. But basically it’s sort of a forecasting variety thing, and we have a lot of fun doing it.

Rob Wiblin: Yeah. Clay?

Clay Graubard: So then at Global Guessing we have two podcasts. One that used to be weekly, but due to various commitments it has for the time being been moved to intermittent. In the near future, we’ll come back to weekly. And that’s interviewing people in the forecasting and geopolitical space about their projects, their view on forecasting.

Clay Graubard: But I think the more interesting podcast that we do that we’re going to be bringing back shortly is The Right Side of Maybe, where we talk to really good forecasters about three or four past predictions that they’ve done, what they learned from it, how they approach their forecasts, what they got right, what they got wrong, as well as what their general approach is to forecasting. I was talking about how I’ll spend eight hours a day on a single forecasting question, and some people will forecast 15 questions in 30 minutes. There is a very wide range of approaches that people take to doing forecasting well, and you can learn a lot by talking to people that do it differently. It’s always great to talk about the forecasts that people got right. But it’s also the ones that they got wrong, where you can learn a lot of really interesting things.

Rob Wiblin: Absolutely.

Clay Graubard: You know, when they weren’t able to break away from the community — even though in their hearts, in their forecasts, they probably should be doing that. Which can be very difficult, right? If Metaculus says it’s 10% and you’re thinking it’s 80%, it can be difficult to break from the crowd, because the crowd generally gets it correct. So talking to forecasters also about the lessons that they got from their false forecasts as well.

Rob Wiblin: Yeah. Something I really like about the forecasting scene is people love to say “I got X wrong. I was way off base about that, and here’s what I think went wrong. And here’s what I’m going to try to correct for next time.” It’s one of the main ways you can actually get good at this over time.

Clay Graubard: Yeah. When Robert was talking about how [Pantone] picked a color that wasn’t even in the set. You think, “Well that’s not a very applicable forecasting error” — I think that happens quite a lot of the time. One of the forecasts that we got very wrong on Global Guessing was the spread of the alpha variant of COVID, because we didn’t factor in the possibility that maybe there was another variant of concern in the United States that would out-compete it and slow its growth relative to how it grew in the UK or in France. And so sometimes you have to realize that you don’t know.

Rob Wiblin: You might not even have the right answer in your —

Clay Graubard: You might just be forgetting about a piece of the puzzle that is integral. And it’s always good to keep that in mind because that is actually quite a common error.

The evolution of the forecasting space [00:15:56]

Rob Wiblin: Absolutely. Robert, it sounded like you agreed with me that there’s a lot of movement, or a lot of activity, a lot of buzz, around forecasting at the moment. What are some of the signs of that? What’s going on?

Robert de Neufville: Well, one of the big things was the FTX Foundation announced they were going to donate a lot of money. And a number of the projects they wanted to do involve forecasting, or specifically Superforecasting, kind of judgmental stuff. So there was a big scramble to apply for this. I feel like almost everyone I knew who was in forecasting was interested in some of this bonanza. I think it’s a really good thing.

Robert de Neufville: Clay, I’m more like you: I like to spend a lot of time forecasting questions. There are some things I’ve been forecasting, like whether the Fed is going to raise rates — that doesn’t take me very long because I’ve done it a bunch of times and you know the sources to look at and everything. But typically I feel like I could do a lot better forecasting if I spent more time on single questions. Sometimes you can get maybe 80% of the way to a forecast with relatively little effort. But really, something like the war in Ukraine, you could spend eight or nine hours. You could be an analyst and that could be your entire job. And maybe it wouldn’t help your forecast, but you might discover something you missed if you spent that time.

Robert de Neufville: So I have often felt, doing this in my spare time or in a small amount of time, that I’m not doing a great job. My scores have been pretty good, but I could be doing a lot better. I’ve screwed something up because I failed to update it. I didn’t watch the news closely enough or I didn’t read something. And so I’m conscious of wishing that I had more time. And this money I think will make it possible for some people to make it a job, to actually devote the time they should to it.

Robert de Neufville: I think one of the things that organizations like Good Judgment have tried to do — and maybe this isn’t fair — is do forecasting a little bit on the cheap: we’ll get a lot of volunteers and give them Amazon gift cards or small honoraria, and we’ll get the wisdom of the crowds effect that catches a lot of mistakes. And that works to a certain extent, but I feel like sometimes our forecasts are slow to update because people have jobs, they work 50 hours a week doing something else, they have three kids, whatever it is. So I think it’s a good thing to actually give people the time to spend on it by paying a living wage.

Rob Wiblin: Definitely, yeah.

Clay Graubard: And just on that point, Andrew and I have calculated how many startups there are in the forecasting and prediction market space. And there was a lot of activity, once the IARPA tournament started in 2011-2012, that went up until 2016-2017. I think there were 26 new startups in one year. Then for the years afterwards, it was extraordinarily low. And we’ve seen a lot of new companies go into the forecasting space: a ton in the crypto prediction market space, but then a lot of just traditional forecasting companies as well. And again, a lot of money has been coming into the space — not only through new companies, but existing companies seemingly are reinvesting in the space and trying new things that they haven’t been doing before, like redoing their front end or trying out new content that they weren’t doing before.

Clay Graubard: I think part of it is that geopolitics were kind of written off, and the importance of having good insight into the future of geopolitics was kind of written off in the past. And as the world changes, I think people are realizing that actually having a good sense on where the world is going is incredibly valuable. If you knew Russia’s invasion was going to happen, there was a lot as a business, as an investor, as an activist —

Rob Wiblin: As an individual Ukrainian…

Clay Graubard: — yeah, as an individual Ukrainian, that you could have done with that. So I think there’s also part of that as well.

Early predictions for the Russian invasion of Ukraine [00:19:49]

Rob Wiblin: All right. Well, let’s move on to the meat of today’s conversation, which is all of the forecasts that we’ve made, our attempt to get insight into the Russian invasion of Ukraine, the various different aspects of it, and I suppose the remaining questions and uncertainties that we have now.

Rob Wiblin: Winding back to January and February, I know both of you were thinking about this issue back then, when we knew a whole lot less than we do now. What kinds of predictions did you make, and how did they pan out? Maybe Robert first.

Robert de Neufville: Back in January, I didn’t make good predictions about this. I wasn’t following it very closely. And I think Atief asked me on our podcast — this is one of the guys I podcast with — if I thought there was going to be an invasion. And my answer was basically, “I don’t think it makes sense. So probably not.” And I think that’s a lot of the reason why people were wrong about it, because they looked at it, even good forecasters I know, and said, “It doesn’t make a lot of sense to do this. So probably not.” I think that’s kind of a forecasting error. Sometimes people in intelligence talk about mirror imaging, which is to assume that the other people whose behavior you are trying to forecast think and see things the way you do. I think a lot of people did that with Putin and with Russia in this case.

Robert de Neufville: I also want to push back a little bit, because I think that kind of thinking can be useful, right? You do go around making basic assumptions that people are trying to do things that are smart, and playing a certain game, and so on. And it’d be really hard to forecast anything if you imagine people were just capable of doing random things at any given time.

Robert de Neufville: But I think a lot of good forecasters looked at it and said, “This is a strategic blunder. This is counterproductive. It’s not that likely.” And that was my initial reaction. Later, Russia kind of started to invade even before they crossed the lines. They were doing all the things you would want to do if you were really seriously going to invade. I guess that’s how you want to do a bluff too, but it was a really good bluff if that’s what it was. They were getting blood supplies and changing conscript rules and all sorts of little minor details.

Robert de Neufville: At some point I think all that stuff happening made me recognize this is really probably going to happen, even though I still had some doubts, which I’ve talked to Clay about. There was also an argument that Putin has been tipping his hand and talking about his ideology for years, and I think people haven’t been taking him very seriously. I didn’t follow that very closely initially, but in February I started to look at that and see this is actually something that he may really do. But I wish I could say I had done a better job earlier. Clay was the one who was on that.

Rob Wiblin: Yeah. Clay, how did you do?

Clay Graubard: By around January 1, I was 62% likelihood of an invasion. At the beginning of the month it rose quite precipitously as the invasion and the steps that Russia was taking were escalating. And then I decreased over the month of January, as you saw substantive Russia-US talks about European security, as well as an effort to revive the Normandy Format and have France and Germany play a role in getting Minsk II across the line with Zelenskyy and Putin.

Clay Graubard: By February those prospects had died. I was actually quite surprised at the degree to which the US actually took Russia’s concerns initially seriously, but nothing really ever came out of that. And actually the offer that Russia made, I don’t think it was actually seriously enough considered. But that’s probably another topic for another day. Suffice to say that at the start of February, I was at 88% chance that Russia was going to invade Ukraine. There were slight downs, but by February 12 I was 90-plus. And then from there, I think I hit 99-100% on February 20 or 21.

Rob Wiblin: And they invaded on the 24th, right?

Clay Graubard: Yes. Correct.

Rob Wiblin: So you were at 99% a few days before they did it?

Clay Graubard: Yeah. And I was struggling for quite a few days. We’re talking about getting rid of one world. Or, if we can say it’s 1,000 worlds, we’re getting rid of 10, but you’re making that very fine judgment. And that was really difficult. I don’t even know if it was worth the time, right? Is it worth 40 hours to go from 96 to 99? I don’t know. I did it. So yeah. But that was sort of the path.
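[Editorial sketch, not part of the conversation: the "worlds" arithmetic Clay describes can be made concrete with a few lines of Python. The function names are hypothetical, and the Brier score is used only as one common way to score a probability forecast; the speakers do not say which scoring rule, if any, they had in mind.]

```python
def worlds_without_event(p_event: float, total_worlds: int = 1000) -> int:
    """Imagined worlds (out of total_worlds) in which the event does NOT happen."""
    return round((1 - p_event) * total_worlds)


def brier_score(p_event: float, occurred: bool) -> float:
    """Squared error between the forecast and the outcome (0 is perfect, 1 is worst)."""
    outcome = 1.0 if occurred else 0.0
    return (p_event - outcome) ** 2


if __name__ == "__main__":
    # Compare the two forecasts Clay mentions: 96% and 99%.
    for p in (0.96, 0.99):
        print(
            f"forecast={p:.2f}",
            f"non-invasion worlds per 1,000={worlds_without_event(p)}",
            f"Brier if it happens={brier_score(p, True):.4f}",
            f"Brier if it does not={brier_score(p, False):.4f}",
        )
```

[Under these assumptions, moving from 96% to 99% means ruling out 30 of the 40 remaining non-invasion worlds per 1,000. The score improves only slightly if the event happens and gets worse if it does not, which is one way to read Clay's doubt about whether the extra hours were worth it.]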

Rob Wiblin: That’s super impressive. I’m curious to know, what do you think that you were seeing that a lot of other people were not? My impression is that in general, people were pretty divided on this, and many people were just agnostic because it was hard to tell. My view was that we weren’t going to know whether it was a bluff until they basically crossed the border, because a bluff and a real threat necessarily have to look the same.

Clay Graubard: I think a big thing that a lot of people had wrong — whether or not they were right or wrong, and we don’t know that without a very large track record, but I think you can make a case if you were 33% chance two weeks before it happened, that that was probably an incorrect forecast — is that people didn’t really have a theory of the Ukrainian conflict, of the Russia-NATO conflict. What the parties wanted and what they were each willing to spend in order to achieve their objectives. And we’ll talk about this later, but one thing that I’ve been surprised about is the degree to which the US has involved itself in this conflict, particularly given the actions that the country took before the invasion of Ukraine. That has surprised me.

Clay Graubard: So I think not having a good model of what was happening, and therefore what are the signals that you were looking out for. I was looking for any sign that this is Putin bluffing, versus here is the situation that’s going on between Russia and Ukraine, how it’s been evolving since 2014-2015. Whether it’s in regards to Ukrainian domestic sentiment on having this sort of new Warsaw Pact versus joining EU/NATO. Whether that’s the shipment and training of Western arms, whether it’s sanctions on Russia, et cetera, et cetera.

Clay Graubard: Is Putin still bluffing, or is he going to take advantage of the post-COVID high-inflation West? There were all those reports about the gas prices being really high already. So he has that very good leverage point right there. So was that a bluff? Was Zelenskyy going to agree to Minsk II? Which would’ve granted political autonomy within Ukraine to the LPR and the DPR, which would’ve given those Republics de facto veto over Ukraine joining NATO and the EU. And would the US have made some sort of security concessions within Europe? I don’t think that would’ve been immediately necessary for an invasion happening, but it is part of the larger conflict, which I think Russia-Ukraine sits in.

Clay Graubard: So those were the things that I was looking for. As well as, was the US going to credibly commit itself to involving itself in the conflict? I was just trying to see how are those questions resolving? So I had a model to guide me through a very high-signal environment, which I think is really useful to not get a lot of noise, because there was just so much information. And then I think forecasters just didn’t have that — which, again: time, right?

Rob Wiblin: Yeah. It sounds like you just had a lot more granular domain knowledge or knowledge about the specific situations and the actors and their views and their values and so on. And that allowed you to pick up on stuff that most people were missing when they were looking at it at a very high-level, abstract point of view.

Clay Graubard: And I also think having high-level theory, right? The point that Robert was making earlier about this not being a rational choice for Putin. I told this to friends and I have told this to other forecasters privately before the invasion, but if I was Putin, knowing what I knew then, I don’t think it would’ve been irrational for me to invade Ukraine.

Clay Graubard: Now, I think that that ended up being wrong because Putin made the mistake of thinking that Ukraine and NATO, that there was this separation — whereas actually the US was willing to go to war in a sense, should Russia invade Ukraine, which I don’t think was the message beforehand. But barring that, I think given Putin’s context in Russia, that it actually wasn’t irrational for him to invade.

Rob Wiblin: It wasn’t so obviously contrary to his interests, as he perceives it? Interesting.

The performance of the Russian military [00:28:32]

Rob Wiblin: OK, let’s turn to this next aspect of the question. I think a lot of people, including me, had the wrong idea about how things were going to play out, which is expecting the Russian military to perform far better and to accomplish their goals much better than they actually did. Did you think very much about how likely Russia was to be able to take cities like Kyiv, or be able to politically dominate Ukraine after the invasion? Rob?

Robert de Neufville: Yeah, I thought some about it. I was surprised. I think virtually everyone has been surprised that Russia hasn’t done better, but I didn’t necessarily expect Russia to be world-beaters. I guess I would’ve said that I thought their military was a little bit overrated, but also I think people underestimated how capable the Ukrainian military was. This is a large army, they’ve built it up. They’ve been modernizing it since the last Russian invasion, if not before. And they are a veteran group that has been fighting peer forces on their own territory for a long time.

Robert de Neufville: Probably in hindsight, I would’ve also thought more about morale. You have Russian soldiers who don’t even know they’re going to war initially, much less really have a conviction that it’s a really important thing. And on the other side, you have Ukrainians who are fighting to defend their home. And I think that is very important.

Robert de Neufville: So I thought Ukraine would hold out longer than some people did, but I didn’t see this coming. The US government supposedly estimated that Kyiv was going to fall within four days, and that was one of the longer amounts of times that people had estimated. I don’t know if I thought that was likely, because urban warfare is difficult and Ukraine would’ve had to really collapse, but I did not think that Kyiv would hold out indefinitely. And at this point it doesn’t seem like it’s about to fall at all. So that surprised me. I think we could have figured that out a little bit better for sure. But the experts whose information I was looking at for this mostly had the consensus that Russia would perform better, and I didn’t have a lot of outside knowledge to evaluate that claim, so I mostly believed it.

Rob Wiblin: Yeah. My perception is that during the lead-up in January and February, the question of whether Russia would invade was sucking up so much oxygen that there wasn’t as much thought going into this question of, “How would the invasion play out?” I saw a bit of writing about that in the week or two ahead of the actual invasion, but it seems like many people, like me, just kind of assumed that Russia was going to be very dominant in this fight. So it’s a question that was asked, but a question that wasn’t given that much attention.

Robert de Neufville: Yeah. I think that’s right. Russia’s military has a reputation for being the second-best military in the world, and the assumption was that they would just win this. I also should point out that I think US intelligence has made a difference. We haven’t talked about it that much, but the US has been feeding Ukraine information about when they’re going to be hit, about where targets are. I don’t want to undersell the incredible Ukrainian performance, but I think that has also been a big factor that maybe we didn’t think about — or I didn’t think about enough, certainly — before the war started.

Rob Wiblin: Yeah. Clay, how about you? You were spending tons of time thinking about this. I imagine you might have had a few more hours in there to think about what was going to happen after Russia tried to invade.

Clay Graubard: Yeah. It’s such a great question. And it’s such a difficult one to answer. I think it’s definitely true that Russia has not performed as everyone would expect it to fight, all else equal. I do think that is true. And I also do think that the extent to which Ukraine is fighting is at least on the high end of the predictions that people were making before the war. A lot has been made about the changes undergone in the Ukrainian military — when it did horribly in 2014 and 2015 to the point where they had to agree to Minsk II, which was a terrible deal for Ukrainian sovereignty, and why it hasn’t been implemented. A lot of work has gone into the Ukrainian military. They’ve gotten a lot of training. They’ve gotten a lot of combat experience, fighting in the Donbas. They’ve received a lot of equipment as well. All of that was very influential. And I think all of that is true.

Clay Graubard: I think for me, I’m less surprised by how the Russian military has performed than by the way in which Ukraine has been supported by the West. I think even the initial sanctions were more than what people were expecting, and they’ve only increased. Or whether it comes to the amount of military aid — whether that’s bullets, whether that’s anti-rockets, whether that’s now increasingly heavy equipment — or I think also very central is intelligence, right? We knew on day one that they were sharing signals intelligence and satellite intelligence with Ukraine. But there have also been reports that there are US special forces on the ground giving intelligence, with the reports now that the US has been sharing intelligence to get high-value targets. That’s not something that you can just do if you don’t have anyone on the ground.

Clay Graubard: So you have that. You also have the US military doing wargames on how to defend against attack on Kyiv and really sort of working with the Ukrainian military. And I don’t think I had properly appreciated, first of all, to what extent the US military and government is going to be involved in the war in Ukraine; and then also, what is that impact going to be? Because the US is very good at figuring out logistics, is very good at getting intelligence and carrying out strikes. And I don’t think I factored that in.

Clay Graubard: On the other hand, I think the question of Russia has performed poorly: that’s relative to what? It’s been bad, but how bad has it been? It took NATO one month to get Baghdad, but Iraq’s half the size of Ukraine. It has a much smaller population and it’s much more concentrated. I think like 20-something percent of the population in Iraq lives in Baghdad, and 9% live in Kyiv. So it’s much more spread out, many more targets, much better military. In one day in Iraq, NATO launched 1,600 aircraft and fired 500 cruise missiles. It took Russia 10, 12 days to fire 500 cruise missiles. And if my memory of open source intelligence is correct, Russia only brought 300 aircraft to the border of Ukraine.

Clay Graubard: So it’s a very different military operation. And they’ve made a lot of gains in the south and the east. Yes, they’ve pulled back from the north, but they weren’t routed in the sense that on the way out, the convoys were just taken away. They mined a lot of the areas. So in the middle of a war, it’s very difficult to really get a sense on how these militaries are performing. And when I think the information is like, we only really see the Ukrainian POV of the war, which is very different than the lead-up, right? Before the war we only saw Russian military equipment moving to the border, training exercises that they were doing with Belarus. We saw nothing about how Ukraine was mobilizing or what they were doing. And now that the war has started, we see all of the Ukrainian victories, but we get very little about how the Russian military is performing.

Rob Wiblin: And how they perceive their own performance perhaps, or to what degree they’re accomplishing their own goals in their mind.

Clay Graubard: Yeah. And then also it is just kind of interesting that they don’t care to correct the narrative, or to set out their own narrative. So either they’re doing so bad that they couldn’t even spin it if they tried, or they like having the sort of, “You can set your own narrative and we’re just going to keep fighting the war.”

Clay Graubard: And so, how’s the Russian military faring? Ask in three months and sort of see how things have gone. Have they gotten all the way east to the Dnieper? Have they tried and successfully captured Odessa? Which I think is more valuable than Kyiv if we’re talking about the state power of Ukraine and also fighting a long-term war, because then Ukraine has no coast and Russia has the Sea of Azov and the Black Sea as well.

Robert de Neufville: I agree on one thing, which is that I think that they have performed better in the south and the east. I think there are reasons for that. That’s more the style of war that Russia’s prepared for. And I suspect they will continue to do better in that region than they did trying to take Kyiv. They’re an artillery-heavy army; it’ll be easier for them to handle logistics in that area. So I think we may be surprised that Russia performs a little bit better, or seems to in the future. I don’t know whether or not they’ll get Odessa. Well, I haven’t really thought very carefully about it, but I think that’s an open question.

Robert de Neufville: I also think though that I was not as surprised by the NATO response. I think the NATO response has been more than I expected. I think that some of that has to do with NATO leadership, and the Biden administration has done a pretty good job of rallying NATO. I don’t think the Russians expected that. But if I had been Putin’s advisor, I would’ve told him, “You go far beyond the Donbas, you mess around, you go to Kyiv, you will have a massive response from Europe.” That’s what I anticipated, and that’s part of the reason I thought that he might not do it.

Robert de Neufville: I think that Putin’s advisors were not telling him that, or Putin didn’t think that. And as Clay said, there were reasons why one might think NATO wouldn’t rally. NATO wasn’t very good at signaling some of that, but that was my expectation the whole time. And why I thought this is just a crazy thing to do. I mean, this has revitalized NATO as an organization, and made people in Europe think a lot more about Europe as an institution as well, which is the exact opposite of what he would’ve wanted.

Rob Wiblin: Yeah.

Clay Graubard: What I do think on that is also, to what degree is the successful push in the east and the south due to the fact that the Ukrainian military had to be positioned in Kyiv at the start of the war? Because there was a massive presence there, so they couldn’t shift their forces to the Donbas to reinforce the troops as an objective.

Robert de Neufville: I think that’s right. I mean, I think that you could say that there was some value to the attack on Kyiv. The attempt to take Kyiv served as a feint, to draw Ukrainian forces off, and that bought them some success elsewhere. But it was a costly feint. I don’t think that was the intention. I think that you wouldn’t have done that just to try to gain in the south and east. But yeah, some of their success probably has to do with threatening other areas outside of the southeast.

General lessons [00:39:21]

Rob Wiblin: Are there any general lessons that maybe we should keep in mind from this experience, where most people had made this assumption that Russia was going to be substantially more successful militarily than it was? Maybe Clay, you go first.

Clay Graubard: Yeah. I think important to that is defining what even is success. Like before Russia invades, what is success? And what is the likelihood of that success? For instance, the whole “Kyiv will fall within two days” — is there any chance that that’s possible just through pure shock and awe? Or was the hope that if Putin signaled that he is actually going to invade, that Zelenskyy would immediately sue for peace, because he didn’t want to fight a war? And then once that didn’t happen, that whole potential possibility of the war being over within two days or two weeks should just be sort of thrown out the window.

Clay Graubard: Again, it also comes back to the Color of the Year, thinking about things like the US military actually being surprisingly involved within the war to help support the Ukrainian military. When you’re being very confident in a forecast, you really need to think that you have all aspects of this considered, and that you’ve considered ideas that don’t even make any sense at first — that you’ve at least gone through a process, just because you’re talking about getting rid of all of the worlds out there. And so, always be wary about being very confident about things.

Rob Wiblin: You use this term “worlds” to describe different possible scenarios, different ways that the world could play out. Is that how you think about this all the time? You’ve got a portfolio of a thousand worlds, different possible scenarios, and you want to say each of these is equally plausible and then kind of count them up?

Clay Graubard: I think it depends on the forecast. Especially if I’m at like 60%, I’m not like, “Oh, what do all those 40 worlds look like?” It’s especially when I’m at like 2% or 5% or like 92% where that really starts becoming useful. And then also what is the nature of the forecast? Like there was a question on Metaculus: Will the US default on its debt? The base rate of that happening is so low. Just the reasoning of, even if political squabbles are so high, is Congress going to let the US default, and have anyone be responsible for the calamity that that would bring to the global economy? No. I don’t have to think about worlds. I can just make a forecast.

Clay Graubard: But when it comes to something like Russia-Ukraine and the invasion of that and you’re spending so much time, I do think it is a very useful exercise to really think, “All right, what are the worlds that I can even imagine happening?” And then, which ones are still surviving? Especially when you’re going from five to four to three.

Rob Wiblin: Yeah. Robert, how about you? What will you keep in mind next time a country invades another country?

Robert de Neufville: Well, it’s interesting. The research I kind of wish I had done was look at some of the databases about wars and invasions, because you can establish some kind of a base rate. But it’s difficult, because every situation has its own specific characteristics. And there are a lot of similarities with Russia’s strategy here, going back to Georgia and other places where it’s intervened or invaded, but Ukraine’s scale and other factors make it very different from previous conflicts. So it is always difficult.

Robert de Neufville: I agree with Clay that you need to be very careful if you’re highly confident of something. If you think there’s a 98% chance of something happening, you have to really rule out a lot of possibilities, and 2% — I mean, it depends on the kind of thing you’re forecasting — but that almost just can be accounted for by unknown unknowns: this idea that there’s stuff out there that you haven’t thought of and probably will never be able to think of.
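One way to make the unknown-unknowns point concrete is to blend a confident inside-view number with a fallback for scenarios nobody modelled. This is only a sketch: the function name `hedge_for_unknowns`, the 4% allowance, and the coin-flip fallback are illustrative assumptions, not figures from the conversation. With those made-up inputs, even total certainty on the inside view comes out at roughly 98%.

```python
def hedge_for_unknowns(inside_view_p: float,
                       unknown_weight: float,
                       fallback_p: float = 0.5) -> float:
    """Blend an inside-view probability with a fallback probability.

    unknown_weight is the assumed chance that something you never modelled
    ends up deciding the outcome; in that case we fall back to fallback_p.
    All three inputs are judgment calls, not measured quantities.
    """
    for value in (inside_view_p, unknown_weight, fallback_p):
        if not 0.0 <= value <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return (1.0 - unknown_weight) * inside_view_p + unknown_weight * fallback_p


# Hypothetical numbers: complete certainty on the inside view, plus a 4%
# allowance for unanticipated scenarios that are treated as a coin flip.
print(round(hedge_for_unknowns(1.0, 0.04), 4))
```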

Robert de Neufville: Forecasting something as simple as whether or not the Fed will raise rates in the US, it seems very, very likely. But you have to ask yourself the question: would there be some weird international crisis that will change the economic situation, or make things change, something that you just can’t even anticipate? At some point I decided that the invasion of Ukraine wouldn’t change my Fed forecast, but that’s the kind of thing that I wouldn’t have thought about six months in advance. Something like that can always happen and potentially radically change the board.

Robert de Neufville: So I don’t want to criticize Clay’s forecast of the Ukraine invasion at all, because he was much more right than I was. And that’s evidence that his theory was right. But when I hear that he had 96, 99% chance, I think, “What about that stuff you haven’t thought of?” You know there’s stuff you haven’t thought of. Scenarios you can’t really rule out in advance. So that’s always my concern when you get to that level of certainty.

Robert de Neufville: And honestly, I have made that mistake with COVID a bunch of times, where I was pretty sure I knew what was going to happen. And particularly with new waves, new variants, they have continually surprised me. I tried to look at the base rate of variants and how they happened in the past in other diseases. And I’ve been repeatedly overconfident about what was going to happen, because I just hadn’t thought of the thing that ended up happening in advance. So I think that’s always a danger, that kind of level of confidence.

Rob Wiblin: Yeah. Clay?

Clay Graubard: And building on what you just said, I think base rates were very influential, especially when we forecasted Russia. Russia potentially reinvading the Donbas in 2021 was something that we very much relied on. And then as we spent more time on it, we were sort of shifting more into the inside view.

Clay Graubard: But also just being aware of when there’s a discontinuity with the past. In the Cold War, the world was described as being bipolar. Perhaps once we’re in the ‘80s, it’s more tripolar with China as well. And then with the fall of the Soviet Union, everyone declares that there is this unipolar moment, where the United States is the sole superpower in differing degrees, depending on the scholar that you read. They sort of can establish their own rules and norms and all that sort of stuff. I think it’s very clear that now, post invasion, we’re definitely more so in a multipolar world.

Clay Graubard: Then when you’re trying to forecast an invasion, are you going to factor in the 20 years of the unipolar moment? Or do you sort of blend in that actually, in multipolar worlds, invasions are more likely — and so my base rate should be some percentage higher than it was for the last 20 years, because there’s been this discontinuity in the international structure.
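The blending idea can be written as a weighted average of two base rates. The sketch below assumes you have one rate from the earlier period and one from a reference class you think better matches the new international structure. The function name `blended_base_rate` and every number in it are hypothetical placeholders, not estimates from the episode.

```python
def blended_base_rate(rate_old_regime: float,
                      rate_new_regime: float,
                      weight_new_regime: float) -> float:
    """Weighted average of two base rates.

    weight_new_regime is how much you believe the world now resembles the
    new regime (0 = not at all, 1 = entirely). It is a judgment call.
    """
    for value in (rate_old_regime, rate_new_regime, weight_new_regime):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must lie in [0, 1]")
    return ((1.0 - weight_new_regime) * rate_old_regime
            + weight_new_regime * rate_new_regime)


# Hypothetical annual invasion rates for illustration only.
unipolar_rate = 0.02    # assumed rate over the last 20 years
multipolar_rate = 0.06  # assumed rate in multipolar periods
print(round(blended_base_rate(unipolar_rate, multipolar_rate, 0.5), 4))
```

Setting the weight to zero recovers the old base rate unchanged, which is a quick check on how much the discontinuity is actually moving the forecast.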

Clay Graubard: I would also say it’s very useful to curate sources and information that disagree with each other and have very different points of view, and to really understand what they think. Because first of all, you’ll be aware of the one big thing that is driving their forecast. Even if that’s not the sole driver, it’s still going to be useful to your own forecast.

Clay Graubard: So the importance of Minsk II was something that I was made aware of and really researched following this one journalist, who said that he wouldn’t even talk about the word “invasion” in December, but he was very bearish on it. Not only did I learn something from him, but tracking his forecast, he went from not even talking about invasion, to saying it’s going to be a low-probability event, to then saying there’s a small chance. And following his progression, given his worldview, was almost as important as following Michael Kofman, who in April of 2021 said, “Yeah, maybe there could be an invasion of the Donbas, but probably not,” and then in December 2021 was like, “This buildup that’s happening now is different, and I am very concerned.” And both of those can be very useful for a forecaster, and you can get a lot of valuable information even out of someone who’s not getting it right.

Rob Wiblin: Just returning to this point that it seems like the model that you had of this was better than a lot of people had, but the forecasts of like 98% seem kind of overconfident, just given the inherent uncertainty of the situation. How can you justify having such an extremely high probability of invasion ahead of it?

Clay Graubard: So 98% would’ve been like five, six days beforehand, and that was sort of the shift into like, “Are we in pre-mobilization, or are we actually in active mobilization?” It was just, again, coming down to, “Is there a last-ditch effort for Zelenskyy to unilaterally implement Minsk in Ukraine?” Which was more or less asking, “Is Zelenskyy willing to be kicked out of office?” — because no one in Ukraine would’ve supported that policy.

Robert de Neufville: So I’m curious whether you think that would’ve stopped the war? Because I don’t know that it would have.

Clay Graubard: Right. That would’ve achieved a lot of Russia’s objectives. Because granting political autonomy to the LPR and the DPR would’ve prevented Ukraine from ever joining NATO or from joining the EU, if you understand Minsk as it was in 2015. I know in like 2018, some Western legal scholars have actually said that the political autonomy wouldn’t have granted those republics a veto over Ukrainian foreign policy. But if it were in that sense, that’s a very large objective besides getting a land bridge to Crimea or getting some of the territory in the east. So that would’ve achieved a ton of Russia’s objectives in the conflict, while also getting Zelenskyy out of office because it was really politically unpopular, which would’ve given a great opportunity to get a pro-Russian figure into Ukraine as well.

Clay Graubard: Would it have stopped an invasion? Maybe not. We’re dealing with system effects, right? We’re now relying on information based on the actions that didn’t happen. So I can’t be certain, but would the invasion have happened on February 24, had Zelenskyy implemented Minsk? I would say it very likely would not have happened then.

Clay Graubard: So again, Minsk, the bluffing, what was the US going to do in the situation? By then it was like, “Putin is pulling away his forces. Oh, actually he’s increasing the forces. Now they’re getting into forward camp positions.” So they were moving out of staging grounds into small tactical units on the field. You cannot maintain that force posture for more than like a week. So either Putin is going to reveal that he’s bluffing right here and right now, or he is going to invade. Otherwise he’s escalating, and then he is just going to deescalate, which would hurt his position.

Clay Graubard: So enough factors where the 2% is, “Well, maybe there’s some sort of back deal talks going on between Kyiv and Moscow, and they’ll reach an agreement before it starts.” But other than that, all the sort of gears of war were moving. I mean, I think it was World War I where between the order for Russia to mobilize and when it started was three days. And if you look at when Putin recorded his invasion video and when it went public, I think that was again like two or three days.

What Robert and Clay were reading back in February [00:50:32]

Rob Wiblin: Yeah. OK, let’s push on a little bit. I’m curious to know what sources of information you were reading in February, beyond the ones that would be obvious to people, that many people would’ve been reading. Maybe Robert, could you go first?

Robert de Neufville: Yeah. I don’t know if I have a really great list of esoteric sources on this, but I really valued Michael Kofman — his analysis was very useful. I had a dark period where I read some Eurasianist fascist literature, basically some of the ideological justification for some of this. I like reading Ukrainian and Russian sources on some of this, at least the ones that are in English or that I can get translated.

Robert de Neufville: One thing that I thought pushed back on the idea that there would be an invasion was that a lot of thoughtful Russians and Ukrainians didn’t think the invasion was going to happen. And there may have been reasons why they were wrong, but at the same time, there was a part of me that thought, “Well, they’ve been through some of these cycles of threats with Russia before, and maybe they recognize that they are bluffs in a way that it’s not obvious to me.” So I took that to some extent. So maybe I shouldn’t say that’s a good source, because that turned out to lead me astray a little bit.

Rob Wiblin: Well it sounds very reasonable ex ante. Even a good source will lead you astray from time to time.

Robert de Neufville: I think it’s worth doing.

Rob Wiblin: Yeah. Clay?

Clay Graubard: By February I’d already read Putin’s speech on Ukraine and a lot of the background documents, like what were the treaties that Russia had proposed to the US, NATO and all that sort of stuff. So by February it was mostly Twitter lists, curating a very wide range from commentators to analysts to open source intelligence, to what is the Moscow Times saying, what is RT saying, what is the Kyiv Independent saying? You know, what is the sort of mainstream propaganda perspective of each side. Kind of going overboard on my Twitter lists. It got so large that every single time you would refresh, it would just show more tweets. It got to be so much that I made like three Twitter lists, and then on my phone I would swipe from one to the next, and then go through that again.

Clay Graubard: So it was a lot of just trying to find the best information on Twitter and then leveraging — whether it’s the Twitter app on my phone, or TweetDeck on my computer — to really follow those streams of information, like a wannabe CIA analyst. And just staring into the void of information and basically being like a bad machine learning algorithm, like, what do I take from this one piece of open source intelligence? Then making a judgment on the 20 pieces of information — what’s signal and what’s noise? And now that I have all of this, have I moved from 93 to 94 or down to 92? What’s going on?

Rob Wiblin: Yeah. I have a kind of intuition that that doesn’t seem like the most respectable or the most prestigious way of doing this sort of research. It seems a little bit odd that a really good way of making these predictions would just be having tons of tweets thrown at you, without a lot of quality control necessarily. Would you push back and say actually this is a good way of doing it?

Clay Graubard: I mean, you need a ton of quality control. If you look at the rotating cast of characters who were inside of the lists — and when people got strikes and I kicked them out — it was very difficult to make sure that I was getting good sources. And obviously one bad piece of information in such a high-frequency environment wouldn’t be disqualifying. So just trying to keep a running track record of the value of each of these accounts and knowing what I’m trying to get out of them.
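A running track record with strikes can be kept in a very small data structure. This is a minimal sketch of that bookkeeping, not a description of how the lists were actually maintained: the class `SourceRecord`, the `prune` helper, the three-strike threshold, and the account names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SourceRecord:
    """Tally of how often one account's claims held up."""
    name: str
    accurate: int = 0
    strikes: int = 0

    def log(self, was_accurate: bool) -> None:
        # One bad call adds a strike but does not disqualify on its own.
        if was_accurate:
            self.accurate += 1
        else:
            self.strikes += 1

    def hit_rate(self) -> Optional[float]:
        total = self.accurate + self.strikes
        return self.accurate / total if total else None


def prune(sources: List[SourceRecord], max_strikes: int = 3) -> List[SourceRecord]:
    """Keep only accounts that have not yet reached the strike limit."""
    return [s for s in sources if s.strikes < max_strikes]


# Hypothetical accounts and outcomes, for illustration only.
account_a = SourceRecord("account_a")
account_b = SourceRecord("account_b")
for outcome in (True, True, False, True):
    account_a.log(outcome)
for outcome in (False, False, False):
    account_b.log(outcome)

print([s.name for s in prune([account_a, account_b])])
```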

Clay Graubard: But I would say it’s almost more useful to get that sort of Twitter stream, and you can get more information than reading a single piece in Foreign Affairs or in Foreign Policy, or in Task & Purpose, or somewhere else. And not to say that I didn’t read those, but you read four or five of those every single day. But there was a ton of information that you would get on Twitter that US intel will then talk about it like a week later, or the mainstream media would talk about it hopefully three days later, but sometimes far later. And so if you wanted that edge and you were willing to no-life the question, like I was, it was a worthwhile tradeoff.

Rob Wiblin: That was the way to go.

Robert de Neufville: I’d like to second that. I think Twitter has been amazing. And to some extent it’s a resource that we wouldn’t have had even a few years ago between local reporting and open source intelligence. It’s been really valuable for someone trying to follow this question. It is also kind of insane to get this firehose and doomscroll through it, which is also what I was doing.

Robert de Neufville: But I would like to add that one of the most important skills I think for a good forecaster is being able to tell good sources from bad. And the same source will be sometimes good and sometimes bad, so you have to be pretty skeptical and then filter through it, and develop your own kind of algorithm for what to pay attention to and what not to. That requires a lot of judgment that maybe we could teach, but it’s sort of hard to actually say exactly how to do it. So I think that’s a really important skill. Just filtering that information is a big part of accurate forecasting.

Clay Graubard: I will also say I do think Twitter now, post-invasion, is not as good of an information source as it was pre-invasion. Mostly it’s a lot of complex things here. Again, the open source intelligence community was very good about reporting on Russian troop movements. I think if we were looking at what’s a more important signal that we want to consider — is it whether or not Ukraine is mobilizing, or if Russia is continuing their push? — I think Russia’s actions were more important, and you could still get signals that Ukraine was making actions. There was an article in the FT from like February 17 that talked about some mobilization being done in villages. And you can piece together enough nuggets, but I think the information was primarily about the aggressive actions that Russia was going to take. And Twitter was very good at communicating that information.

Clay Graubard: But then post-invasion, I remember when the invasion started, a lot of the OSINT guys talked about how they weren’t going to post about Ukrainian movements, like what was happening on their side of the military. And clearly Ukraine has won the information war when it comes to at least Western social media. I don’t know what’s going on on Telegram — I don’t really use it, and I imagine it’s a different picture there — but on Twitter, I think the Ukrainian information side is winning. So it makes it difficult now to get a very clear picture on what is going on.

Clay Graubard: And again, I think it matters knowing when an information source is more trustworthy. Like I think all else equal, you should have trusted US intel more on Russia-Ukraine than you would on the pullout of Afghanistan, just given where US interests and incentives were about relaying accurate information. Like, it would’ve been horrible if, while we were pulling out, you had Blinken come up on stage and say, “Yeah, they’re going to collapse in two weeks.” Like, as we do it.

Rob Wiblin: Next Wednesday, yeah.

Clay Graubard: That wouldn’t serve US interests. Whereas being very upfront about Russia invading Ukraine would be. So I do think Twitter is still very valuable. I still rely on my lists, but it’s becoming less valuable.

Rob Wiblin: Take a little bit more of a pinch of salt now.

Clay Graubard: But then that’s just war, right? Information is a very critical aspect of war. And now that we’re in it, for me, I’m a little bit more sticky in my beliefs. That’s not to say that I’m no longer updating and I’m just a very sad hedgehog who’s going to get everything wrong. But I was definitely much more flexible and willing to update beforehand, because I think I had a better trust of the information and the picture that that was giving to me.

Clay Graubard: And just as an interesting exercise, the Russian military doing bad is a very big theme right now. So I went back and I read New York Times articles about the First Chechen War, in which ultimately they captured Grozny, the Russian military. And a lot of what’s being said about the Russian military now was being said about it then. And you know, part of that is narratives. I think also part of that is mixing up tactical developments, like what’s happening in this individual battle versus what is the larger strategic picture? But yeah, I will say that right now information is I think difficult.

Clay Graubard: Robert, I’d be curious, have you found a source now that’s particularly good, that you’ve been relying on?

Robert de Neufville: No. I mean, there are some that I like, but I think I basically agree with your point that in the lead-up to the war, it was this Wild West information firehose. And a lot of the stuff we were interested in was not being protected by open source intelligence guys. And I’m glad they’re not reporting on Ukrainian troop movements, but I would like to know in doing my forecasts. I want to say that a lot of the sources have been somewhat sanitized.

Robert de Neufville: We’re getting a lot of US analysts and reporters who were talking about the defense briefing of the day, and Ukraine is winning the information war in social media. I think if we were seeing what was going on in Russian social media, or Russian media, it would be very different. But I don’t look at that stuff very much. And I don’t think that’s very informative for what we’re doing. So I do feel like we’re not getting the same level of straight, uncut information from the field that we were initially.

Clay Graubard: Which makes it difficult. Because one of the things about forecasting is you really have to just look at the world as it is. Not that you want to. And as we talk about like nuclear risk forecasting, you don’t want to look at the world as it is, because it can be depressing.

Rob Wiblin: It’s horrible.

Clay Graubard: And like staring into the abyss. Yet in order to forecast, and what makes it so exhausting, you have to separate what you want the world to look like from actually how the world is looking. And that’s just harder now than I think it was before.

Risk of use of nuclear weapons [01:00:45]

Rob Wiblin: Yeah. Let’s talk about the assessments of the probability of the use of nuclear weapons as a result of all this. I think it’s actually our shared interest in that that got us put in contact in March, and caused us to think about producing this episode. Just to rewind mentally for listeners, Russia invaded around the 24th of February, and for the next couple of weeks, all three of us and many other people were frantically trying to figure out what was the probability of escalation to direct conflict between Russia and NATO, and what was the risk of the use of weapons of mass destruction — either tactical nuclear weapons on the battlefield in Ukraine or the possibility of some miscalculation or some sort of event leading to an escalation to the use of strategic nuclear weapons by NATO or Russia.

Rob Wiblin: And this concerned us all, not only as a forecasting exercise, but also potentially because it could affect us personally obviously. If those weapons were used, then cities in which we’re living could potentially be targeted. So it was very much at the forefront of my mind, possibly more than it should have been, but it does feel like an important question. Maybe one by one, could you talk about how you approached trying to answer this question? I guess here we were dealing with lower-probability events. I don’t think any of us thought that this was more likely than not or anything like that; it was closer to 1% or 0.1%. So it makes it a slightly different exercise. Clay, could you go first?

Clay Graubard: Yeah. So, I think the first way I try to think about it is like, what is the interesting forecasting question here? On a lot of platforms it’s like, “Will a nuclear weapon device be detonated by August 27?” or “Will it be done by 2023 or 2024?” I think that’s a very interesting tactical forecasting question — though I think that relies on a lot of inside information about how processes are working, under what timescale do things of escalation matter.

Clay Graubard: And when I try to think of what is the best way to use my time forecasting, especially something that’s dependent on so many future steps… Famous last words, but sub-1% we’re just going to go from today to Russia launching a nuclear weapon at Kyiv or London or anything like that. There’s a lot of steps along that way, and when those could possibly happen. So we get into this realm of system effects, where I just don’t think it’s really within the reasonable realm of the forecasting to really figure out that time component. It makes it really difficult.
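The reason a many-step path ends up at a small number is that each step's conditional probability multiplies with the ones before it. A rough sketch of that arithmetic follows. The function name `path_probability` and the three step probabilities are invented for illustration and are not anyone's actual estimates.

```python
from math import prod
from typing import Sequence


def path_probability(step_probs: Sequence[float]) -> float:
    """Probability that every step in a chain happens.

    Each entry is the probability of that step GIVEN that all earlier
    steps already happened, so the chain probability is their product.
    """
    if not step_probs:
        raise ValueError("need at least one step")
    if any(not 0.0 <= p <= 1.0 for p in step_probs):
        raise ValueError("probabilities must lie in [0, 1]")
    return prod(step_probs)


# Hypothetical three-step escalation path, for illustration only.
steps = [0.3, 0.2, 0.1]
print(f"{path_probability(steps):.3%}")
```

This covers a single path. Several distinct paths to the same outcome would each need their own chain, and the timing of each step is not modelled at all, which is the part described above as hardest to pin down.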

Clay Graubard: So for me, I view it as, “What is the likelihood that Russia uses a nuclear weapon before either a peace deal gets signed or Kyiv falls?” — so this conflict turns into an insurgency or it comes to an end. That was the original way I looked at it. Given what’s happened on the battlefield, maybe it should be thought of a little bit differently. And sort of approaching that is, “What are the escalation ladders that lead to nuclear weapons being used?” One way that I thought of it happening is it starts off with Russia conducting an atmospheric nuclear test that doesn’t have any sort of direct impact.

Clay Graubard: Although I do think psychologically that could do something to Western markets. We’ve seen the Russian military likes to shoot footage that looks like it’s part of an IMAX movie, when they were taking away their tanks from the border. They shot it like it’s Steven Spielberg. So something like that, a 4K nuclear explosion and put that on social media, that could hurt and spook markets and that could be a way that Russia gets back at us. And then we react poorly to that and that gets us up an escalation ladder, whether it’s Russia attacks Western aid shipments — especially as the West increases the amount of aid that they do — although, how long the West can continue sending the aid that we are, I think is a fair question. I think we’ve used a third of our javelin stockpile in the US.

Clay Graubard: But anyway, Russia attacking some sort of NATO shipment or then doing some sort of tactical nuclear strike within Ukraine as a show of force. I do think there’s something fair to say that the image of Russia right now is not one that I think Putin or the Russian elite likes, in terms of their military capabilities, and using a nuclear weapon could be a way to sort of shake heads.

Rob Wiblin: Flex their muscles.

Clay Graubard: Yeah. And just sort of trying to think about, “What is the likelihood for the West getting directly involved?” Because on the one hand, I don’t even think that’s something that we should necessarily rule out. I think there could actually be a lot of benefits from the West if we had a plan on how to manage the nuclear risks of getting directly involved, then defeating Russia. And if we’re talking about long-term nuclear risks, having a defeated, potentially reformed Russia would reduce long-term risk more than having Russia as an actor right now.

Clay Graubard: So thinking through those various scenarios. I was talking with Michał Dubrawski about how long is this conflict going to last, barring a sudden collapse of Russia or a significant threat to the sovereignty of the Ukrainian government — whether that’s civil fighting between the different factions within society, which were very fractured before the invasion, or some sort of saboteurs, or Zelenskyy gets assassinated. Barring a rapid collapse, I could see this conflict easily lasting a full year or longer.

Clay Graubard: Under that timeframe, I would probably put my number at 12.5% right now. I could be persuaded down to 8%. I could see myself going higher as well. It’s a very uncomfortable question, but there are a lot of avenues that lead to it. And in our initial set of forecasts on Russia-Ukraine, I sort of ended it really panicked about how we were just nonchalantly escalating the conflict. I do think the pace at which the West has escalated has cooled off a bit, but the general direction is still of greater Western involvement.

Clay Graubard: There’s been more talk now about Russia making moves in Transnistria, which would get Moldova involved. I am, for instance, worried about China and Taiwan and a whole other set of nuclear flashpoints that exist. Given the war that’s going on right now, if something were to happen between Iran and Israel, the battle lines that would be drawn out would be in part driven by Russia-Ukraine. So is it part of that conflict then too?

Rob Wiblin: So just to clarify, when you’re saying 8% to 12%, most of that is the use of a tactical nuclear weapon in the war in Ukraine, I’m imagining. Right?

Clay Graubard: At least happening first.

Rob Wiblin: Right. So that’s how it starts.

Clay Graubard: That starts using it there. It’s more likely to have the nuclear weapon be used in Ukraine, keep the conflict in Ukraine, and sort of force the West to escalate and bring it into a larger conflict than having their first nuclear strike be in Poland or something.

Rob Wiblin: Yeah.

Clay Graubard: Or using a strategic nuclear weapon on London. Because you have to realize none of these leaders want a full-out nuclear war. They live in this world, their kids live in this world. We can rule out that they want to do that. They very much don’t want to get to that place. Whereas tactical nuclear weapons are different. This is something that the US has spent a lot of money on in the ’50s and ’60s, something that Russia has kept as part of their military doctrine since. So yeah, more on the tactical side. But of course, Russia uses a tactical nuclear weapon and then that’s used as a basis for a no-fly zone, and then you could easily see how that could lead to larger nuclear conflict as well.

Rob Wiblin: Yeah. Robert, how were you thinking about this in the first few weeks of the invasion?

Robert de Neufville: So before I was writing my forecasting, I worked for years at the Global Catastrophic Risk Institute, and one of the most plausible paths to a global catastrophe is a nuclear exchange. So we did some research on this. I looked particularly at close call incidents, just a variety of nuclear incidents. Some of them were pretty harrowing, but I actually came out of it thinking a lot of them were not really that close calls. But there’s a lot of potentially escalatory moments that you wouldn’t necessarily think of. For example, at one point during a war, Israel mistook a US research ship in the Mediterranean for an Egyptian destroyer and attacked it. They attacked a US ship, and the United States scrambled its available fighters without realizing initially that they were nuclear-armed fighters. So it scrambled nuclear-armed fighters to defend a US ship against Israel. I mean, this is a crazy kind of story. And they figured it out pretty quickly. They recalled the fighters. They talked to Israel and everything. But there are a bunch of these things where there were nuclear weapons in play in various places that you wouldn’t even think was a likely risk scenario.

Robert de Neufville: So a lot of my concern when the war started was that there are moments for these kinds of weird escalation mistakes, nuclear weapons in the wrong place. And this just creates so many more opportunities for things to go wrong — even though, as Clay says, nobody wants this. Russians love their children too. We all live in the same world. Nobody wants it destroyed. Well, maybe not nobody, but there aren’t a lot of death cults that have nuclear weapons. So just in general, I think this kind of friction with the US and Russia in close proximity on opposite sides raises the chances, to me, uncomfortably high.

Robert de Neufville: I also agree that the most likely scenario is some kind of tactical nuclear use in Ukraine. I initially thought that was reasonably likely. Now, having seen the way the war is going and having learned a little bit more about Russian nuclear policy, I think it would be pretty difficult even for Putin to actually make this happen. The danger, I suppose, is that if Russia is losing in a certain way, they might want to change the terms of the conflict or make a demonstration, change the scope. It doesn’t seem like a very good idea, but if you’re already losing, maybe you try a different way of losing, even though it might not be that rational.

Robert de Neufville: So that was my concern. As I say, I now think that is less likely than I initially thought. But there’s also a lot of pressure to escalate. We have already escalated in a number of ways — the way we’re supplying weapons and materiel to Ukraine — and NATO governments appear to be under a lot of public pressure to escalate more.

Robert de Neufville: I think that the Biden administration and most NATO leaders are pretty clear on not wanting to do a no-fly zone and other things that would potentially risk escalations, and the militaries are very aware of this kind of thing. But I worry if Russia commits some kind of atrocity or appears to have committed some kind of atrocity — I mean, they’ve already committed some atrocities, I think — but if there’s a chemical weapons attack, a clear chemical weapons attack, and there are a lot of bodies on TV, can NATO politicians resist this pressure? The Democrats in the US may be just creamed in the upcoming midterms. At some point, is there some pressure to look tough in the US Congress? These are kinds of things that potentially are riskier. And nobody wants to get to a nuclear war, but I think the small escalatory steps are a little bit scary and they increase the chances.

Robert de Neufville: I don’t know what timeframe we’re talking about for an 8% to 12% chance of nuclear use, but I do think there is some chance of a tactical nuclear use, maybe 1% or something. I don’t want to rule out too much because of the unknown unknowns we talked about. And the chance of an escalation today between Russia and the US and NATO is a lot higher than it was last year because of this conflict. I don’t know exactly how high (there are some estimates and we could talk about that), but uncomfortably high. You really would prefer it to be lower.

Rob Wiblin: Yeah.

Clay Graubard: One thing that I also worry about when we talk about escalation to nuclear weapons use is just how would we respond to a step before tactical nuclear weapons — where if we overreacted, then Russia would use tactical nukes? For instance, I think one way to reduce that risk is to really set clear, defined lines on certain actions that would then trigger a response and then actually do that response and not do more than that response. I think the West has sort of hurt its ability to credibly signal what it will respond to. Because the sanctions that we talked about before the invasion were not the ones that we got immediately. Same when it came to weapons delivery. And that’s because public opinion changed. But if we’re trying to manage the nuclear risk, we can’t say, “If there’s a chemical attack, we’ll do X and not Y,” but then that happens and we do X, Y, and Z.

Clay Graubard: And I think Putin is probably pretty distrusting of the West as well because of what has happened, which I think just increases that larger risk, and smaller steps that we take now could lead to nuclear weapons use down the road. So if we start off supplying Ukraine with MiG fighters, that works great, but there’s a limited number of MiGs out there from NATO. So what happens when those MiGs run out? Either the Russian military has been defeated, or the fighting goes on, and then either we’re like, “OK, Ukraine, we gave you all the MiGs, but now you don’t have an air force. And we really tried but we’re giving up on you.” That would look horrible. I think that would also be a horrible moral thing to do, to support a country throughout all that and then just say, “Well, we ran out of stuff. So now we’re not going to support you.”

Clay Graubard: So then does the US supply F-22s? But where do they get trained for that? Do they do it in Poland? Are Polish pilots flying for Ukraine? And then how does Russia respond to that? And so even actions that we take now that don’t seem like they’ll lead to nuclear weapons could.

Clay Graubard: And will there be another massacre like Bucha? I think definitely. I think that should have been entirely expected, looking at how the US invasions of Afghanistan and Iraq went. When you have a general mobilization in the country where any male age 15 to 65 will fight, and they have the entire population who’s fighting, and you send a lot of soldiers in to an environment and at the start of the war, they just don’t know where they’re going and they’re getting slaughtered, that then leads to atrocities. War gets absolutely horrific. So if it’s just another massacre that will then lead to that aid, I think we should actually be at higher than 1% now, because the likelihood of that happening is relatively high.

Robert de Neufville: Yeah. I’d like to add something about signaling too. You’d like to say, “Here’s the red line. If you cross it, this is where we act.” That sounds appealing in some way, but that’s problematic, right? If we were to say, “Do this and that’s when we might launch nuclear weapons,” everyone would say, does that mean Russia can do everything up to that point? Are you going to come out and signal in advance that chemical weapons are OK? That would be incredibly unpopular in the US and elsewhere.

Robert de Neufville: And to some extent, the US and Russia, the nuclear powers, use the ambiguous threat of nuclear response to try to deter other things. And maybe they shouldn’t do that, but there are strong reasons why they feel they have to. It’s kind of a dangerous game. It’s like wrestling on a slippery slope or something. If you go too far, you might both slide down the slope, but each side is kind of using the threat that that might happen as leverage.

Rob Wiblin: Yeah. OK, let’s just press pause on that for a minute. I’m interested to rewind to those first few weeks. Robert, at some point you wrote a post on your Substack where you estimated the risk of use of nuclear weapons at something like 4%.

Robert de Neufville: I said 4%, yeah.

Rob Wiblin: And that was over the next couple of months, or I suppose while the conflict persisted. Can you explain what that number was referring to and how you arrived at that figure?

Robert de Neufville: Yeah. So that was mostly my fear that Russia would try to shake up the conflict, do a demonstration of a tactical nuclear weapon within Ukraine. I didn’t think they were going to launch against a NATO target or something like that. It was specifically meant to be used in war, in my forecast. That’s what I was forecasting, rather than an atmospheric test or something, which I think there’s an additional chance of. I think now that was high, now that I’ve seen the dynamics and the way we’re escalating. I think the escalation is alarming, but I don’t think that it’s that rapid or as dangerous as I initially thought.

Robert de Neufville: I also think that at the time I thought it would be easier for Putin, just in terms of ordering a nuclear use of a tactical weapon, than it in fact would be, having now looked at a little bit about how the procedures are. And there’s some debate over whether Russia has an escalate to deescalate doctrine — which essentially is you use nuclear weapons and be like, “That’s it. Now everyone has to stop. I’ve used the nuclear weapons.” Some people think that’s insane because when you escalate, the other side wants to escalate again rather than to stop. It’s not really clear whether Russia even has that doctrine — some people have written about it, but there’s a lot of ambiguity about it. In general, I think that Russia’s posture is such that it makes it more difficult than I might have originally thought to use tactical nuclear weapons, even though, as Clay says, they do have them.

Rob Wiblin: Yeah. What’s the thing that would prevent Putin from ordering that, or make it more challenging?

Robert de Neufville: Well, there are procedures. You have to go through a process. Both countries have this process where it’s not just snap your fingers. I guess there is some kind of a button in an emergency, but essentially there are a bunch of administrative steps and pushback points. I don’t have the details at my fingertips, but although we think of him as having absolute power — and in some ways he has a very large amount of power — there are, I think, real controls to make it somewhat problematic for him just to do. And I don’t think it makes much sense for him to do. Of course, I don’t want to rule out him doing something that I think doesn’t make sense, because he’s done that before.

Rob Wiblin: Yeah. Yeah.

Clay Graubard: Also China: are they currently game for Russia to use nuclear weapons? Before the invasion, they clearly signed off on the invasion. There’s in my mind virtually no doubt that Xi [approved] the invasion within Ukraine, and that after the fact we’re going to find out that they very much had the role that the Soviet Union had during the Korean War — that they’ve been supporting the Russian economy, the Russian war machine as well.

Clay Graubard: So is Xi OK with how the West and the world will respond to nuclear weapons use now, and will the effects of that be what they want now? And it could just not be the case, right? Again, we’re approaching month three of this conflict. Barring massive shifts, this could go on for a year, two years. Obviously it could also expand and other conflicts could become part of this. But I do think China is very critical when trying to understand whether or not Russia is going to use tactical or strategic nuclear weapons.

Rob Wiblin: Sorry, did you just say that you think Xi was opposed to the invasion?

Clay Graubard: No, no, no. That he signed off.

Rob Wiblin: He signed off on it, OK. You think it’s going to come out that China said, “This is OK, go ahead.”

Clay Graubard: Not just that, but that they’re playing the role of the Soviet Union in the Korean War. That they’re materially involved in the party. If we think about this from an international law perspective, who are the primary parties in this conflict? Obviously it’s Russia, Ukraine. Who are secondary warring parties? Belarus on the Russian side, definitely, probably the United States as well. I don’t know how special forces get counted. Maybe the UK as well. If you then consider secondary non-warring parties, I think it’s definitely where China is for Russia when it comes to materiel and getting around sanctions. And obviously on the Ukrainian side, you have the rest of NATO, and the EU as well, and Australia and all those fun places.

Rob Wiblin: Yeah. Robert, when you were trying to come up with that 4% number, what methodology were you using? Was it more like, “I read a lot of stuff and this was my guessed-out judgment,” or was there something more systematic that allowed you to say 4% rather than 10% or 1%?

Robert de Neufville: That’s a good question. I wish I had a really clear base rate for this, but there isn’t one. Thankfully we don’t have a lot of good comparisons for it. So I had previously tried to estimate the risk of escalation in other contexts. I was one of the people who thought it was about a 0.4% chance a year in normal times, though some people were lower than that, so 1 in 250 with no conflict. That’s kind of a floor for my chance of there being some kind of escalation activity in the course of a year. I guess to some extent, I scaled up the risk essentially, and it’s a little bit different if you’re talking about using a tactical nuclear weapon.

Robert de Neufville: Why exactly did I decide on 1 in 25, rather than 1 in 33 or some other number? I talked to a bunch of people. I have one friend, who’s an excellent forecaster, who thought the risk might be 20%. That seemed way too high to me. I thought 1 in 100 was too low. Now maybe I’m closer to that, but at the time there was this active conflict, we didn’t know how it was going to play out, and Russia was clearly willing to do things that I wouldn’t necessarily have thought they were willing to do. Did I have a real principled explanation of 4%? No, I just kind of settled on 1 in 25 because it felt right to me.
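The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration using the numbers quoted in the conversation (0.4% a year in normal times, 4% for the conflict). The helper names `one_in_n` and `scale_factor` are hypothetical, and the sketch does not reproduce Robert's actual method, which he describes as a judgment call. Note also that the two figures cover different timeframes (a year versus the duration of the conflict), so the ratio is only a rough guide.

```python
def one_in_n(probability: float) -> float:
    """Express a probability as the N in '1 in N'."""
    if not 0 < probability <= 1:
        raise ValueError("probability must be in (0, 1]")
    return 1 / probability


def scale_factor(baseline: float, adjusted: float) -> float:
    """How many times larger the adjusted estimate is than the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return adjusted / baseline


# Figures quoted in the conversation (not independent estimates).
baseline_annual = 0.004   # "about a 0.4% chance a year in normal times"
conflict_estimate = 0.04  # the 4% figure from the Substack post

print(f"Baseline: 1 in {one_in_n(baseline_annual):.0f}")
print(f"Conflict estimate: 1 in {one_in_n(conflict_estimate):.0f}")
print(f"Scale-up: roughly {scale_factor(baseline_annual, conflict_estimate):.0f}x")
```

Seen this way, settling on 1 in 25 amounts to multiplying the peacetime floor by about ten.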

Rob Wiblin: Yeah.

Robert de Neufville: As I said, I think it was probably too high at the time, although I didn’t really know some things about the war then that I know now.

Clay Graubard: When you were considering those base rates, Robert, did you also try to find the closest comparison-class conflicts? So Korea I think might be on the extreme end, Vietnam, the invasion of Afghanistan, and then think like, “If I was forecasting during those conflicts, what would I have given for the use of tactical or strategic nuclear weapons?” Obviously we live in a very different world than any of those conflicts, but use that as a way to sort of nudge your forecast in a direction from where it was prior?

Robert de Neufville: Well, I thought about those. I think we do live in a very different world. I think the risk in the Korean War might have been higher; they thought about using nuclear weapons in Korea. I think norms against nuclear weapons use were lower at the time. But there aren’t a lot of comparisons. We’ve been in plenty of proxy wars, but I mean — and some of it is probably racism and lack of concern about people who are in poor countries and everything — the fact that this is happening in Europe, in a large country that Europe is so invested in, that’s really different than a lot of other conflicts.

Robert de Neufville: So I definitely felt like that raised the risk. Russia didn’t need to use nuclear weapons in Syria, even though there are some comparisons there. Why would they do that? They’re winning anyway, and it’s not that important. So I didn’t think there were many very good comparisons. But that’s what alarms me: it feels like it’s really difficult to get a fix on exact percentages in this case.

Rob Wiblin: Yeah. We’re in slightly uncharted waters.

Robert de Neufville: Yeah.

Rob Wiblin: Did either of you spend much time thinking about, not this question of the use of tactical nuclear weapons in Ukraine, but this question of the possibility of escalation to an actual counterforce or countervalue strikes between NATO and Russia, and how likely that was?

Robert de Neufville: I’m not sure, I guess I didn’t think that much about the counterforce, countervalue. It didn’t occur to me that Russia would really necessarily need to use counterforce strike in Ukraine, because there wasn’t likely to be a large Ukrainian army that they were going to wipe out with a single blow. Countervalue, you could see them hitting a city. As far as it would be countervalue, I don’t know what force Russia would target meaningfully that would affect the war with the weapons. Do you disagree with that, Clay?

Clay Graubard: I think if we’re talking about nuclear weapons use between anyone that’s not Ukraine, the odds of that happening before a use in Ukraine, I think it’s not impossible. But just given where everything is right now, that’s relying on too much future information that I don’t even think that’s a worthwhile forecast right now. That would be in the point-something, maybe even point-0 something. We’re just a lot of steps away from that. If it comes to, “Will Russia do an airstrike on Polish territory? Will there be fighting between Russia and Japan on disputed islands, or an attack in Moldova, or in the sea on some aid shipment?”, I think that’s something that I’ve been really sort of interested in following, especially the situation in Transnistria and Moldova.

Clay Graubard: There’s an interesting question on Metaculus that I’m following: whether Moldova and Romania will reunite. They used to be part of one country, which would then give Moldova Article 5 protection because Romania is part of NATO. If that happens before, say, Russia takes any action in Transnistria to go into Odessa for instance, that would sort of give Ukraine a de facto Article 5. So things like that I think I’m more interested in following, and just different flashpoints for escalation. But in terms of a nuclear strike on another power or another base, the only way that could happen before one gets used [in Ukraine] is if Joe Biden announces that we’re going to do a no-fly zone just out of nowhere, and then that’s done in response. But it just seems unlikely.

Rob Wiblin: I think I slightly mis-asked my question. Maybe the thing that I’m thinking is during that first week, things seemed very fluid, and a lot of things seemed possible. And obviously now we have a bunch more clarity on where things were actually heading. But during that first week or two, it seemed possible maybe they would be convinced to do a no-fly zone — the pressure was becoming cacophonous at times. So then I was thinking what is the possibility that, in the fullness of time before this conflict comes to a full close, there could be use of nuclear weapons between NATO and Russia?

Clay Graubard: So I did a partial forecast for that. Nuño Sempere and his forecasting team did an article on whether or not there’ll be a nuclear weapons attack on London by April. So this was the first month of the conflict. I think I was at 1%, and at 2% for a tactical nuke within Ukraine. And for me that was the period where the EU was like, “We got the MiGs for Ukraine,” and then they didn’t have the MiGs, and they might have had the MiGs, and that whole back and forth. That was also the time where we all watched that late-night drone footage of the nuclear plant being attacked, and a lot of people believed that it could be a Chernobyl even though that wouldn’t have been possible.

Clay Graubard: So during that period, I think there was a lot of risk of hysteria, a lot of risk of overreaction. Although that was also the same time that both Russia and the US announced that they had established this hotline between the two sides, which I don’t think was a coincidence — they each wanted to do as much to achieve their goals without going over that nuclear threshold. And so I think it was very anxiety-inducing during that time, because we just couldn’t see everything, like the guardrails that were being formulated. I don’t know. Robert, what do you think?

Robert de Neufville: I agree with that. I think that we have seen them put some guardrails in place, like this deconfliction hotline. Not clear how well that works, but seems to work a little bit, and that’s a good sign that they have that. I think there was a lot we didn’t know about how the course of the war would go. And also the nuclear power plant wasn’t going to be 10 times worse than Chernobyl, but a Ukrainian official — I want to say the minister of defense — tweeted out, “This could be 10 times worse than Chernobyl.” So I’m not an expert on nuclear power plants; I believed that initially.

Clay Graubard: The foreign secretary.

Robert de Neufville: It was the foreign secretary. Right. And I don’t know if that was just hysteria or if that was part of their propaganda thing. But there were a lot of scary noises, and a lot of them were just noise, not signal, but we were still trying to figure out what the signal was.

Clay Graubard: And just on that, which we talked about on Global Guessing as well, is understanding that for Ukraine, if they could achieve an objective right now in the war, it’d be to get the West directly involved in the conflict. And so after he tweeted “It could be 10 times the size of Chernobyl,” he said, “This is why we need a no-fly zone right now.” I think that is important when we’re thinking about information: what are the goals behind where that information is coming from? And just to be aware of that, because I think doing that helps us make each provocation that we see within the conflict less likely to sweep us away into escalation when we otherwise wouldn’t want it, if that makes sense.

Rob Wiblin: Clay, you mentioned Nuño Sempere and his forecasting group did some work to try to estimate the likelihood of a strike on London, a place close to my heart because I am here. Did I hear right that they thought it was 1% likely?

Clay Graubard: I was at 1.3%, and they were at 0.9%, or rather 0.8%, if I have my numbers right.

Rob Wiblin: That seems really high for such a difficult thing. Because presumably if London’s being hit by nuclear weapons, it’s not the only place. We’re talking about a 1%-ish risk of a massive catastrophe.

Clay Graubard: I think that was also a time when, what is the slope of escalation? Let’s just say this escalation ladder goes from 0 to 10. Going from 0 to 1, and then 1 to 2, 2 to 3 — does the time between each step get faster as you go up each one? Does it get slower? Does it grow linearly? Exponentially? How does that look? I feel like at the start of the conflict — again, when people were on Twitter nonstop just getting almost a live feed of a conflict — that it was difficult to get that. So I don’t think it was actually unreasonable to say at the start this seems to be escalating quickly, if on the very slim off-chance it keeps going at this rate, it could get to that point. So 0.8% then made sense. Whereas if you were to say, “In the next month, is there going to be a strategic nuclear strike on London?”, I think it’s lower now than it was in the month at the start of the conflict.

Robert de Neufville: I would also add there’s some risk of attack on London every year, right? We don’t think about it, but there’s some background risk, and I don’t know exactly what it is. But in the next 500 years, if we were just to go on living the same lives we do, would there be an attack on London? At that point I think yeah, there’s probably a pretty good chance if we just lived in this world for 500 years with no major changes that there might be an attack on London as a background rate. Actually it probably might happen before that.

Robert de Neufville: So even without what was going on in Ukraine, there’s a disturbing background risk of attacks on major cities. So if you start from there and think it’s clearly higher now. I agree that this month is less than the previous months. But if you think about the whole year or something, there’s a nontrivial chance when there’s a conflict like this.
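The point about background risk compounding over long horizons can be made concrete with a small Python sketch. It assumes a constant, independent annual probability, which is a strong simplification, and the annual rates below are placeholders chosen only to show the shape of the curve. They are not estimates given in the episode.

```python
def cumulative_risk(annual_probability: float, years: int) -> float:
    """Chance of at least one event over `years` years.

    Assumes the same annual probability every year and independence
    between years (both are simplifying assumptions).
    """
    if not 0 <= annual_probability <= 1:
        raise ValueError("annual_probability must be in [0, 1]")
    if years < 0:
        raise ValueError("years must be non-negative")
    return 1 - (1 - annual_probability) ** years


# Hypothetical annual rates, for illustration only.
for annual in (0.0005, 0.001, 0.004):
    for horizon in (1, 50, 500):
        risk = cumulative_risk(annual, horizon)
        print(f"annual {annual:.2%}, {horizon:>3} years: {risk:.1%}")
```

Even a small annual probability adds up to a substantial cumulative chance over centuries, which is the intuition behind the 500-year thought experiment.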

Rob Wiblin: We’ll stick up a link to… They have a name for their forecasting group, right?

Robert de Neufville: It’s Samotsvety.

Clay Graubard: I was going to get that.

Robert de Neufville: I don’t know why? I think it’s like an old Russian band.

Clay Graubard: It’s at forecasting.substack.com, you can find that report.

Rob Wiblin: OK, cool. We’ll stick up a link to some of the things that they wrote around that time.

Most interesting remaining topics on Russia and Ukraine [01:34:36]

Rob Wiblin: What topics are you most interested in today regarding Russia and Ukraine, how it’s going to play out? Maybe Robert first.

Robert de Neufville: I’m really interested in the knock-on effects: food prices, fertilizer prices. There’s I think a real risk of starvation and famines in places. It’s not because we don’t have enough food in the world, it’s because you often get famines when people don’t have the economic resources to buy the food. And that’s a thing that could happen as food prices elevate, fertilizer prices elevate to agriculture. I’m interested in the general effect on inflation and governments, refugee crises.

Robert de Neufville: This kind of stuff makes governments fall, it puts a lot of pressure on governments. I think that incumbents in a lot of places may be under a lot of stress as they deal with the knock-on effects of this. In the longer run, I think that the world order will probably change, that Russia’s role in the world will be different, that Europe’s role will be different. I have some vague theories about this, but I think that the world of 2025 is probably fairly different than the world of 2015. And maybe that happens in every decade, but I think we’re at kind of a pivotal moment where we see that things are shifting, and I’m interested in how that’ll be.

Rob Wiblin: Clay?

Clay Graubard: Definitely all of those forecasts. The knock-on effects I think are very important, not only for what they say about how the larger world is looking, but also this conflict itself. What effect will persistent high inflation, should it continue in the US and Europe, have on their support for Ukraine within this conflict? Same with refugees as well. And then also the rest of the globe: do they continue to straddle the line or start picking sides?

Clay Graubard: I’m also interested in very meta-level questions about this conflict. Again, will there be a peace deal before or shortly after Kyiv falls? Will the Russian government collapse by then? Same thing for the Ukrainian government. Looking at different flashpoints to enlarge this conflict within the context of Russia-Ukraine — whether that’s a no-fly zone from NATO and the West? Russia in Moldova? Action in the Baltics? — I think are all very interesting questions.

Clay Graubard: Then also there are tons of other geopolitical hotspots where both Russia and the US are involved that could either be directly involved in this, or because they happen now, could be brought into that conflict. And the blocs that we see in Russia-Ukraine are going to apply there, which will have different sort of pressure points. So whether that’s Iran-US, Iran-Israel, North Korea-South Korea, North Korea-US. I think there’s a ton in the Indo-Pacific, China and the South China Sea — whether that’s with respect to the US or all of the disputed islands with Vietnam, Japan, South Korea, Taiwan, et cetera. China-Taiwan is something that probably my forecast on invasion in the next five years is I think double or triple what Metaculus is at right now. That’s something that I am very worried about.

Rob Wiblin: I’m not happy to hear that.

Clay Graubard: The scope of who’s involved within this conflict, right? Who are the parties? What are the roles, and how are those changing? So just getting the shape of the conflict. And then I have another forecast that just lingers in the back of my head that I worry borders on conspiracy theory.

Rob Wiblin: Go for it.

Clay Graubard: I haven’t fully thought it through. If I had to put a probability on it right now, it’s probably only 1% or 2%, but that China could invade and take action in Taiwan, and that the actions of the world since 2020 onwards — starting with COVID into Russia-Ukraine — was actually a start of a world war, given all the parties that were involved.

Clay Graubard: And that’s a very complex forecast relating to the origins of COVID, the impact that that’s had on inflation in the US, increased debt. Right now you see the US depleting a lot of its military stockpile and being really involved in a conflict. Whereas in 2018, a nonpartisan Senate commission said that the US can no longer fight a two-front war and defend its homeland, which had long been the basis of US military policy. And I know probably a lot of people are going to think I’m crazy for thinking that, but I don’t know, I think that it’s a very dark world. Robert, do you think I’m crazy? You can just say it, it’s fine.

Robert de Neufville: I mean, I’m less worried about that than you are. I actually think that what’s happened in Ukraine has made a Chinese invasion of Taiwan less likely in the immediate future. They have seen a little bit of some of the consequences. I also don’t think China is ready to do that, and is doing pretty well with its patient salami slicing strategy, where they just kind of push people out slowly. They would like Taiwan to agree to be part of China. That’s what they would really like. They don’t really want to occupy that, that doesn’t work as well for them. But I’d have to think more about that. We could have another podcast on the China-Taiwan issue.

Robert de Neufville: I’m less worried about a world war, but I am worried about trends of rising authoritarianism. I worry about whether democracy will survive in the United States in the long run. I think there are a lot of pressures on it. I think most of the inflation in the US is not really caused by the invasion of Ukraine, but it could potentially cause a lot of turmoil in the next four-plus years, and I have concerns about how that plays out. And some of the same issues are happening in other countries around the world. So there’s a part of me that wonders whether the world is going to a darker place too, but I’m less worried about war than I am about the rise of authoritarianism.

Clay Graubard: We’ll definitely have to dig into Taiwan later, because I do agree that you were right before the invasion. Whereas China originally would’ve been looking for a window of opportunity to invade Taiwan, one of the ways in which it’s trying to use this force. That, because of what’s happened right now, should they wait until, say, Russia loses in Ukraine? That they’ll actually say this is a closing window and that their future probability for success is lower now than it was before Russia invaded Ukraine, and that could alter their calculus on when to take action.

Rob Wiblin: We’ll have to press pause on that because there’s a lot of issues there.

Forecasters vs. subject matter experts [01:41:03]

Rob Wiblin: We’re close to finishing up. I’ve just got a few more questions. One is a question that I’ve seen floating around recently: this issue of the expertise of forecasters versus the expertise of subject matter experts. Obviously someone who studies Ukraine and Russia all the time brings some knowledge to the table, whereas you guys aren’t especially informed about Ukraine or Russia in particular, but you have this particular knowledge about forecasting and base rates and how to estimate probabilities of things.

Rob Wiblin: As you were going through this, did you think that you would have been able to produce better forecasts or faster forecasts, if you’d been paired up with someone who was knowledgeable about Ukraine and Russia and the Minsk Accords and that kind of thing? Maybe the combination of your expertise in forecasting and their expertise in the specific issue could have brought a special magic to the predictions?

Clay Graubard: So I’ll just start off, first of all, and say that the attack on hedgehogs I think has been oversold. I think that hedgehogs have a lot of value, especially when you put in as much time as I might have done in a forecast, that actually having that theory background was helpful. I would say my background in international relations was helpful in doing the forecast. And definitely having an expert. It’s either I can go ahead and read journal articles and go do hours’ worth of research on a topic, or I can ask someone who’s done that for a job — or they’re the author of the paper, and instead of figuring out what they said, I could hop onto a 30-minute call and get probably way more information than I could just from reading their words on a sheet of paper.

Clay Graubard: So I think domain experts could make good forecasters. A lot of it is the process you take to making your forecast, right? Obviously having a forecast that is heavily rooted just in theory and in a very rigid worldview will do poorly. That is what Expert Political Judgment said. But there was also a recent post on the EA Forum about how you can train domain experts to become good forecasters and they can reach Superforecaster levels of Brier score. It’s like, I want this piece of information. Now either I can go research it, or I can find someone that has that context. Then they can say, “Well, you’ve thought about this. Because of that you also want to consider this, because in this historical analogue that was relevant, or you’re just missing this piece of the puzzle.”
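For readers unfamiliar with the term, the Brier score in its common binary form is the mean squared difference between forecast probabilities and what actually happened (1 if the event occurred, 0 if not), so lower is better. A minimal sketch in Python follows. The forecasts and outcomes are made up for illustration and are not data from the episode or from the EA Forum post mentioned above.

```python
from typing import Sequence


def brier_score(forecasts: Sequence[float], outcomes: Sequence[int]) -> float:
    """Binary Brier score: mean of (forecast - outcome)^2.

    0.0 is a perfect score; always answering 50% scores 0.25.
    """
    if len(forecasts) != len(outcomes) or len(forecasts) == 0:
        raise ValueError("need equal-length, non-empty sequences")
    squared_errors = [(f - o) ** 2 for f, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical forecasts on four yes/no questions, and how they resolved.
hypothetical_forecasts = [0.9, 0.2, 0.6, 0.05]
hypothetical_outcomes = [1, 0, 0, 0]

print(f"Brier score: {brier_score(hypothetical_forecasts, hypothetical_outcomes):.4f}")
```

The score rewards both being right and being appropriately confident: the 60% forecast on a question that resolved "no" contributes most of the error here.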

Rob Wiblin: Robert?

Robert de Neufville: I agree. Domain expertise doesn’t make you a bad forecaster. There are domain experts that are good at forecasting. There’s some risk that if you really are attached to a certain perspective, it’s going to be hard, but we have political scientists who are great at forecasting politics. I think someone who really has a single framework is the kind of hedgehog that maybe doesn’t do as well. I would like to have more discussion with people with expertise that I don’t have. I think there’s a project looking at existential risks, trying to forecast that, in which they’re going to be pairing domain experts with forecasters. I think that’s a good idea.

Rob Wiblin: I think Philip Tetlock is actually recruiting for that at the moment.

Robert de Neufville: That’s right.

Rob Wiblin: We’ll stick up a link for people. I think it’s an experiment where they’re trying to pair up people who have experience with forecasting and people who know about specific existential risks, and then see whether they did better together than apart. We’ll stick up a link with more information about that experiment for people who might want to participate on either side of that equation.

Rob Wiblin: Hey listeners, Rob here. Just an update that unfortunately the hybrid forecasting tournament applications closed on May 13th, 2022, so people won’t be able to apply for that one. But if Tetlock starts any experiments that you can potentially join, then we’ll be sure to let you know either here or on the main 80,000 Hours Podcast feed. Alright, back to the episode.

Robert de Neufville: I’m really excited about that project, because they’re trying to forecast things that are harder to forecast — further out in the future, more rare and unprecedented things — and I think that’s really important. So I’m excited about that project.

Robert de Neufville: I often wish I could ask specific questions of experts that I don’t know the answer to. I might be able to do the research, but maybe I wouldn’t get the right answer, and it would probably take me a long time. For the nuclear weapons use thing, I wanted to know, how does it work? That’s not a thing I knew; I wanted to talk to someone. That’s something we can kind of do. We can just play journalist or call a colleague and ask them that question, but there are specific questions. The other value of people that have expertise, that you may not have as a generalist forecaster, is that I want them to look at my rationale and say, “It looks like you missed something. Here’s a thing that you didn’t think of that everyone in my field knows, and you missed it and that might inform you.”

Robert de Neufville: I’m often not very interested in the probability estimates of domain experts. Not because they’re necessarily bad forecasters, just that most of them are. There are some that are good, but just because you know something doesn’t mean you’re good at estimating the likelihood that things will happen — it’s kind of a different skill. So I usually want answers to specific questions, and sort of sanity checks of my work. I think there was a response to the Samotsvety forecast by Peter Scoblic that was useful. I don’t know if I thought all of his probability ideas were great, but the information in there was a productive dialogue, I thought. So I do think that there are conversations to be had.

Rob Wiblin: He’s an expert on nuclear weapons in particular, and he was looking at this forecasting effort and saying, “Here’s what I would say, as someone who knows a bunch about this specific area”?

Robert de Neufville: Right. And I thought that not all of his insights were useful necessarily, but I thought the dialogue was productive. In general, I think that the more forecasters can connect with expert knowledge, the better the forecasts are going to be.

Ways to get involved with the forecasting community [01:46:49]

Rob Wiblin: For people who are really excited about the kinds of things that we’re talking about, what’s a step that they can take to get more involved in the forecasting community?

Clay Graubard: How I started off is I read Superforecasting, which I think is a really readable, great book. Both Dan Gardner and Phil Tetlock are great writers, so it’s a compelling read as well. It doesn’t take very long. I think that’s very good. And then from that, the resources out there on becoming a good forecaster are decentralized. As a shameless plug, you can listen to the Right Side of Maybe podcast and listen to expert forecasters, or read one of our forecasts on Global Guessing. They’re thousands of words long sometimes. You might say we go too much in depth, but we really like to fully account for our reasoning process.

Clay Graubard: Then after that, I would just jump in and forecast. And there’s really two approaches that I’ve seen people take. One is to go for quantity. The approach that Andrew and I took to get better at forecasting is doing a few predictions, but being very confident in our numbers — knowing all of the inputs that have gone into it, being able to identify the constraints and the objectives of different actors, identify what are the signals that we would look to update on this prediction. Doing a pre-mortem: do your forecast and then say, “Let’s say this forecast was actually really wrong. Well, what does that world look like?” And really do a few forecasts and do them very well, and then go back and update your forecast, and then do an in-depth analysis afterwards to sort of check your thinking, and then make progress that way.

Clay Graubard: So you can go the in-depth route, or you could just try to make a lot of predictions, go on Metaculus or Good Judgment Open, reach an initial probability. Use something called base rates, which is the historical occurrence that events have. Figure out your own research, your own method. Maybe compare: the community is at 60%, the base rate is 30%. I did some reading, I can kind of see where the community is, but I think they’re a little bit high. And then take the low end of what the community is saying and do a lot of forecasting that way.

Clay Graubard: But I do think practice. You just learn so much by doing forecasting, especially because people have been forecasting forever, right? But this idea of making it into the scientific field is relatively new. There are the 10 Commandments of Superforecasting out there, but other than that, it is the art and science, right? The art of prediction. And so practice, and figure out what works for you when it comes to forecasting.

Rob Wiblin: Robert, is there a Discord perhaps, or some other experiments other than the one that we just mentioned that people could sign up for?

Robert de Neufville: I don’t know of a Discord, but I probably wouldn’t because I’m just too much of an introvert. I would recommend something like Good Judgment Open or Metaculus. But I agree: I think practice is the main thing. And those are the places I would go to talk about forecasting a lot too, because they do talk about a lot of it.

Robert de Neufville: And this isn’t the message of Superforecasting at all, but sometimes I think that the hype about Superforecasting suggests that it’s some kind of a magic thing, that people have some special ability. It’s just a skill: it’s like learning how to shoot a basketball or something. Maybe some people have more aptitude for it, but it’s a thing you can practice and develop your ability, and you get more judgment about how to hit the hoop and what physical motions to go through. You practice that and you go on Metaculus or something. And people will have different ideas than you, and tell you why they think your ideas are wrong — hopefully in a nice way. Then you learn something from that and you agree or you disagree, but you see what they’re doing and you get better at it. I think that’s the main thing.

Robert de Neufville: And there are a lot of resources. There’s Clay and Andrew’s podcasts, there’s our podcast. There’s a lot of discussions of forecasting in different places on the EA Forum. There are plenty of resources out there. You can look at Good Judgment’s materials where they try to teach you there. They have their training materials that you can check out. So if you try to do it in a methodical, thoughtful way, and practice and read what’s out there, you can go a long way I think.

Clay Graubard: And I’ll also just throw in, if possible, forecast with someone else. I find my best forecasts are not the ones that I’ve just done by myself. Russia-Ukraine, I forecasted with numerous people at numerous times, it was like four sanity checks. Just working with someone else that has a different viewpoint, they’ll raise other things. And having that back and forth I think can be really fun.

Clay Graubard: I will warn you if you start doing that, then when you’re out with friends and they’re talking about events, you’ll be like, “But what’s your probability on that?” And then they’ll be like, “Clay, we’re just trying to have drinks. We don’t have to put a forecast on it.” And you’re like, “No, but you need a Brier score.” And then you slowly don’t have any friends, but that’s OK.

Rob Wiblin: You’ll make new friends. Internet friends.

Clay Graubard: Make new friends. But I do think trying to forecast with other people is great. Then also there’s a lot to be made about updating forecasts, and that’s great. But if you lose interest with a question, don’t feel like you have to follow it forever. You can just say, “I’m making a fuzzy set adjustment,” don’t even worry about that score. Don’t necessarily make your track record the only thing that matters. If you lose interest with a question, really, it’s fine.

Robert de Neufville: If I could add to that, I think one of the most exciting things about forecasting is talking to other forecasters and people who are good at it. It’s great working on the Good Judgment Inc. forecasting site, because I learned so much from the other forecasters. They do great research. So if you can get to be part of one of those communities where people are thinking about these things, it’s a great way to think about things. I just learned a huge amount from it.

Rob Wiblin: Yeah. I don’t spend that much time on the forecasting platforms, but I say things on Twitter, and I think you learn so much from the pushback you get. It’s not even people being harsh or disagreeing, they’re just like, “You are not aware of this fact. Here’s a link to this factual information that is incredibly useful and pertinent to what you just said.” One of the fastest ways to correct your mistakes is to tell them to a large group.

Robert de Neufville: That’s true.

Clay Graubard: Especially if you put a number, right? Because you say a thing and then you’ve put a number to it, so then people can quantify what you said. Otherwise, with your vague verbiage, you could be like, “When I said ‘low probability,’ I really meant 33% because that’s low.” And it’s like, “Wait, no. What?”

Impressive past predictions [01:53:29]

Rob Wiblin: All right. We’ve gotten to the end. Just a final question for both of you: what’s a prediction that you’ve made over however many years you’ve been at this that you are kind of proud of in retrospect? Clay?

Clay Graubard: Well, obviously Russia-Ukraine. I mean, “proud” is a very weird way to feel about getting something like that correct. But in part because I incorporated international relations theory, which is something that I think as we try to push the accuracy of our forecasts will be very important. And just taking a very different approach, right? Multiple hours a day versus a few hours a week on multiple questions. So getting that right.

Clay Graubard: But as well, last year on Global Guessing, Andrew and I correctly forecasted that the US would not rejoin the Iran nuclear deal. Not only did I think we did a very good job of identifying the constraints and all the signals to look out for and how we updated that forecast, but also we did a pre-mortem analysis. At one point we were at 20% and it’s like, well, what if the probability today was actually 80%? Can we tell that narrative? Can we make a compelling forecast? Then with that new perspective in mind, how should we update our main forecast? And really trying to get a really good grasp on this. I would say that I was quite proud of that forecast.

Clay Graubard: Any attempt at forecasting in my personal life has just been horrible. My personal Brier score is so bad; I’ll give 5% chances to things and they happen all the time. I am so bad at personal life forecasting. So if you ask me for a personal life forecast, I will give you one, but do not trust it whatsoever.
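
For readers new to the Brier score Clay mentions, here is a minimal sketch, assuming the common binary form of the score: the mean squared difference between each forecast probability and the 0/1 outcome, where lower is better. The function name and the sample record below are hypothetical and made up purely for illustration; they are not Clay's actual track record.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need two non-empty lists of equal length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical record: ten events each given a 5% chance, five of which happened.
outcomes = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
overconfident = brier_score([0.05] * 10, outcomes)

# Baseline for comparison: answering 50% on every question.
coin_flip = brier_score([0.5] * 10, outcomes)

print(f"5% on everything:  {overconfident:.4f}")
print(f"50% on everything: {coin_flip:.4f}")
```

In this made-up record, confidently saying 5% about things that happen half the time scores worse than hedging at 50% on everything, which is roughly the problem Clay is describing.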

Rob Wiblin: Robert?

Robert de Neufville: It’s interesting about my psychology: I realize that the ones that stick in my mind are the ones I’m ashamed of. I don’t feel that proud of it, because I think it should have been obvious, but I sort of knew that COVID was going to happen earlier than other people, and we were going to lock down, and you shouldn’t travel, and conferences are going to be canceled. I made a bunch of personal decisions about that early. And I don’t know why everyone didn’t know that on some level, but people would react like, “Why aren’t you going to that conference?” “Why aren’t you coming to the gym?”

Rob Wiblin: “Well, the obvious pandemic that is beginning?”

Robert de Neufville: Right. That’s what I thought, but it wasn’t obvious to other people. So I remember at one point I was like, “Everyone’s going to have to lock down soon.” A friend of mine works in a coffee shop and his reaction was, “What are you talking about?” And I was like, “How does everyone not know this?” So I had this moment of like, I actually know something that for some reason other people don’t.

Rob Wiblin: I actually remember walking along the street to work, and it was late February or early March. I was just going to my coffee shop. I talked to the barista as I do most mornings. And I was like, should I be telling all the people here that they’re not going to be working in a few weeks? Just feeling like, I don’t know how to have this conversation. It was an absolutely surreal experience that really just sticks in my mind of, where do I begin to explain this?

Clay Graubard: One more forecast I really liked. We got it right, but we forecasted the 2020 Ghana presidential election. If you look at the polling, it wasn’t necessarily going to be a close one, but it was a really interesting forecast because of the events that happened afterwards. We had a contested election, so I think this was a precursor or postcursor to the US election. But just following and forecasting that election, I paid much more attention to Ghana politics. I felt like I learned a lot about the country and its political systems. And when all of the election stuff was happening, I was glued to Ghana politics for two to three weeks in a way that if I didn’t forecast the election, I wouldn’t have followed it. So I think that’s a nice part about forecasting, is you learn a lot about these areas, but you also —

Rob Wiblin: You broaden your knowledge.

Clay Graubard: — become more engaged as well. You’re more engaged and curious about it. And I think that was really cool.

Robert de Neufville: You learn different kinds of things — sort of look at things in a more serious way. If you’re not trying to forecast them, you might just pick up random facts, but you have to actually think about it in a methodical way. And I agree, I’ve learned so much more than just being a casual news consumer when I try to forecast a question.

Clay Graubard: And much more diverse as well. At least if you’re forecasting politics in the US, if you’re left-wing like myself, you have to at least understand the other worldview, and understand how someone reaches it, and what that worldview means if you’re going to have success forecasting. Which I think has benefits on being a good citizen as well.

Rob Wiblin: My guests today have been Clay Graubard and Robert de Neufville. Thanks so much for coming on 80k After Hours, both of you.

Robert de Neufville: It was a real pleasure. Thank you.

Clay Graubard: Yeah. Thanks so much.

Keiran’s outro [01:58:45]

Keiran Harris: If you want to hear more from Robert and Clay, you should definitely check out their podcasts.

For Robert, that’s NonProphets — that’s prophets with a ph.

And for Clay, that’s The Right Side of Maybe, as well as the weekly Global Guessing podcast.

You can find all of those wherever you listen to this show. I mean I didn’t actually check that you can find them in literally all the same places — but that’s what everyone always says about other podcasts. You probably can.

And things have been quiet on this feed lately, but we’re about to ramp up our After Hours content — so keep a lookout for new releases over the next couple of months.

Alright, audio mastering and technical editing for this episode by Ben Cordell.

Full transcripts and an extensive collection of links to learn more are available on our site and put together by Katy Moore.

And I produce the show.

Thanks for listening.

Related episodes