Robert Wiblin: Hi listeners, this is the 80,000 Hours Podcast, where each week we have an unusually in-depth conversation about one of the world’s most pressing problems and how you can use your career to solve it. I’m Rob Wiblin, Director of Research at 80,000 Hours.
If this show has at all changed what you plan to do with your career it would be a huge personal favour if you could let me know by filling out 80,000 Hours’ impact survey, which I’ll link to prominently in the show notes or you can fill out at 80000hours.org/impact-survey/.
It should only take about five minutes, and I’ll read every entry.
As the show progresses we’re going to do more deep dives into pretty specific issues. If you don’t have much interest in the topic of a particular episode, you shouldn’t feel any duty to listen to it. Much better to skip episodes you aren’t going to enjoy than get sick of the show and unsubscribe.
If an episode isn’t going to be useful to you, maybe you’re better off spending that time listening to an audiobook instead. I’ll put a link to some audiobooks that I feel have allowed me to have a bigger social impact in the show notes and the blog post associated with today’s show.
That said, this week’s episode covers a diverse range of topical issues, and so I suspect it will be of interest to most subscribers.
Here’s David Roodman.
Robert Wiblin: Today, I’m speaking with David Roodman. David studied theoretical maths at Harvard, graduating in 1990. He then spent nine years researching environmental policy at the Worldwatch Institute in DC, followed by eleven years researching development policy at the Center for Global Development.
He was briefly a Senior Economic Advisor at the Bill & Melinda Gates Foundation before becoming a Senior Advisor to GiveWell and then the Open Philanthropy Project in 2015. He’s done world-class reviews on topics as diverse as the risk of geomagnetic storms, the effect of incarceration on crime, whether deworming improves health and test scores, and the development impacts of microfinance.
So, thanks for coming on the podcast, David.
David Roodman: It’s great to be here.
Robert Wiblin: So, we planned to talk a lot about the methods you’ve used in your kind of careful analysis of these questions and what advice you’d give to people who would want to become researchers themselves. But first, how did you transition from studying theoretical maths back in the ’80s to the kind of tricky social science you do now?
David Roodman: I did it without a grand plan. I did not know exactly how it was going to work out. There was a lot of self-doubt and worry about direction along the way. I studied math in college because that’s what I was good at. That’s what I got the best grades in, and that’s what made me feel secure. I’m not sure that was really the best basis on which to choose one’s path, but I guess I needed that sense of security.
And I almost completely avoided thinking about what I would do once I left the ivory tower. I don’t recommend that to the listeners, necessarily. Because I was interested in English folk dancing, I wanted to go spend some time in England. As I left Harvard, I got a fellowship to study at Cambridge, where I also signed up for a one-year maths program, the Tripos, Part III. I figured I’d been studying math all these years, I can do it for one more, and it’s a way of punting on what I want to … figuring out what I want to do when I grow up.
What I didn’t expect was that once I arrived at Cambridge, I would come to terms with the fact that I really didn’t love mathematics enough to do it long-term. I think I wanted to do something more connected to the real world, as it was. Once I’d realized that, it was really hard to motivate myself to stick to my studies, because it now just seemed so irrelevant, in fact, a barrier to my figuring out what suddenly was a very urgent question: “What do I want to do next?”
And so I became very interested in questions I hadn’t thought much about before. “How does the world work? What are the grand problems of our time, and what’s causing them?” And then, in a funny way, that was the macro question, but then there was a micro question of, “Where do I fit in?” And they felt linked somehow, you know, even though I couldn’t fully explain it. I lost interest in the classes. I remember one day I got a long letter from my girlfriend, who’s now my wife, just before a class … a lecture. I sat down, I read it and missed the lecture. I felt so good I never went to another one, another class, and started reading books that friends pointed me to. Then I found myself reading E. F. Schumacher’s Small Is Beautiful, books by John Kenneth Galbraith, Herman Daly and so on, which are all about broad questions of economics and ecology and ethics, and really exciting. But I didn’t know where that would lead me. I didn’t read those books and say, “Aha! I should go get a PhD in economics.”
Partly because I perceived economics … this was in 1990, as being very theoretical and mathematical in a way that didn’t impress me as a mathematician. And I guess I saw it as arrogant. More sure that its models were correct, even if they conflicted with reality. And so I felt passionate about these things but unsure of where it would lead me.
Robert Wiblin: So did you want to switch into more practical or applied questions for moral reasons? Or …
David Roodman: That’s a really good question. I mean, what I know is, I felt a need to be working on things that seemed more practical. I wanted to work on important things, which was of course distinct from how important my work would be. Whether that was out of a need for a certain kind of identity and self-esteem, or purpose, or morality, I’m not sure. I don’t think it was moral in the sense that I did an abstract analysis and determined that this is what a moral being should do in this position. But it was … I had a sense that I needed to move in a direction of more practical relevance.
And what wasn’t clear to me is how to link that with my aptitude for programming and mathematics. In fact, I didn’t assume that there was a link, and I only found that after 10 or 15 years into my career.
I followed my girlfriend, having … so I ended up actually failing my exams at Cambridge. I was required to take them so I sat for the minimum 20 minutes for each exam, and there’s one professor I kind of liked so I wrote him an apology, and couldn’t believe what I was doing. I’d been an overachiever all of my life. It was not … it’s a great story but I didn’t feel good about it all. I was actually kind of scared this was going to be a stain on my record. But I felt driven to do it.
So then after that, I followed my girlfriend back to Philadelphia, where she was in medical school, and started looking around for some job for an organization working on local environmental issues, without any clear sense of what I, a math major, would have to offer them. Eventually I found somebody who was unwise enough to hire me and had a great year working in a very small non-profit in a pretty rough part of the city, learning a lot about poverty there.
After a year, I … maybe this is a really interesting theme here. After a year, my girlfriend told me that I needed to move down to Washington, DC, to find something. And I reflect on it now, that’s interesting, because my general approach in life has been trying to figure out whether I’m thriving now or where I can go to thrive without a long-term plan. Whereas my girlfriend was saying, “You know what? If you want to be here in ten years, you’ve got to do this now.” Without that nudge from her, I might not have gone down to DC, which turned out to be a very good move.
I eventually did find the first job that you mentioned at the Worldwatch Institute, where I started out as a research assistant and moved up to a senior researcher, writing about various environmental issues. So, deforestation, energy policy, and so on.
Robert Wiblin: How did you manage to avoid kind of discovering the big picture questions before, I guess going to Cambridge at the age of 22?
David Roodman: That’s an interesting … I mean, I was exposed to them in elementary school, actually. I had a wonderful teacher early on. So they were always there in the back of my head. It could be that being at Harvard, I didn’t feel safe venturing out of a certain safe zone. I mean, you know, it was a liberal arts education. I studied a lot of different things. But it was too competitive for me to want to explore as much as I should have, probably.
It could also be some skepticism, as I mentioned, of economics in particular, as a field. Like I thought of it as a place that was mathematical and model-heavy and I didn’t yet appreciate how interesting the big questions are in economics.
Robert Wiblin: So, at Cambridge, were you having kind of an existential crisis and couldn’t motivate yourself to study? Or, it sounds almost as if you wanted to fail on some level, because that would force you to do something different. Force you in a different direction.
David Roodman: Well, on the plane out to San Francisco, just a few days ago, I was by chance looking at emails I’d written when I was at that period of my life. And I’d forgotten, there were a few days of, yeah, what you might call an existential crisis, which came not when I first went off the rails, but when I thought I needed to inform the fellowship committee that was financing me to be there, and out of a sense of integrity I wanted to tell them what I was doing, but also then question my integrity in not doing what I had promised to do. And that was very rough. I think it wasn’t that I wanted to fail as much as this was a life raft. This was somehow … this, I couldn’t explain it but this was what would lead me to find meaning in life.
Robert Wiblin: It sounds like your wife has been kind of important in your career. Is she working in the same area, or just a generally good advisor overall?
David Roodman: No, she became a doctor and … but didn’t practice much, and she has spent most of her time working first at a think tank on health policy and then working for Medicare and implementing part of Obamacare. And now she’s an executive at a big health insurance company.
Robert Wiblin: So, you’re someone who a lot of people trust to do kind of the most difficult empirical research, in effect [inaudible 00:07:57], where there’s either a lot of contentious evidence that has to be pulled together and reach a conclusion, or perhaps there’s very little evidence and so we need to get kind of as much juice out of it as we can. I think Holden has called you, “The gold standard for in-depth quantitative research.”
So how do you think you got to be that good?
David Roodman: Well, I won’t argue … and I won’t agree or disagree about whether I am that good. But I think that goes back even earlier in my life. Maybe I was just born with a certain sense of responsibility to the truth, as it were. But I am the child of a bitter divorce. My parents split when I was ten, and I grew up from that point on with the experience of there being these two gods in my life. They collided, and I couldn’t make sense of who was right and who was wrong when. But I also felt a lot of fear about what would happen if I chose sides. Alienating and losing a parent.
So I felt this very strong compulsion to go down the center, and if I ever strayed from the center, to be really well prepared to explain why I was doing it. And I think that actually drives my approach to researching. I’m really afraid of being wrong and so I always want to dig down the next level. And that’s part of how I’ve developed my style.
And this is a style that I feel like I’ve discovered. There was no grand plan, especially working for GiveWell and Open Philanthropy in the last three or four years. What I do that’s unusual is to review empirical research, mostly in the social sciences, and as much as possible, re-run the studies that I read for myself. I’ve hardly ever done new research, but I will go back to original data sources, try to understand the methods that were applied, re-do them, and then think critically about whether I agree with those methods or I want to apply alternatives.
And that arises both from my personality, as I already said, and also I think the fact that I don’t have formal training. I never did get a PhD, and I think probably people who come through PhD programs pick up a different kind of culture and maybe face different incentives, which discourages the kind of work I do. So I’ve kind of stumbled into this.
Robert Wiblin: How do you think it discourages it?
David Roodman: Well. Economics, or I’m sure other disciplines are … the field is a community of human beings, and that means it’s political. And especially if you’re a young person trying to make your way in the field, it can be dangerous to go around, you know, pissing people off by challenging their work, especially if they’re senior to you. Getting the good jobs is a very competitive process and I would imagine that people are risk-averse. And so the incentives are to do new research, you know, which might mean getting new data or thinking of new questions or developing new methods, rather than going around and challenging existing work.
Robert Wiblin: Do you think the political incentives cause people to believe the wrong thing, or just kind of act in a strategic way?
David Roodman: What do you mean by believe the wrong thing?
Robert Wiblin: I mean, are they successfully kidding themselves, or do they realize that there are these incentives, and … you know, maybe they believe that their advisor is wrong or using bad methods, but they just keep … they just bite their tongue.
David Roodman: Well, I think, especially the best people in these fields are very smart, and they see clearly, and no one knows better how the sausages are made than the sausage-makers. So they understand the problems.
I don’t think what I’ve just given you is the whole story. We need lots of people doing fresh, original research, and it’s … and maybe that’s where the best minds should be. But I think incentives are part of what’s going on.
Robert Wiblin: Do you think if you try to reduce the politics in an organization or a field, do you just kind of get different problems?
David Roodman: Oh, gosh.
Robert Wiblin: Is that something we should aim to do, or do we perhaps not realize the benefits that you get from politics?
David Roodman: Oh, gosh. I haven’t thought about it. My assumption would be that it is hard to change. That it’s an aspect of being human and it’s kind of wired in us. I don’t know. I haven’t thought about that.
Robert Wiblin: I think there’s a researcher, I think … Weingast? I’ll put up a link to some of this discussion. I think his view is that politics is a way of avoiding kind of outright violence, so you get these games that people play, but the alternative would be outright conflict, so … you shouldn’t only see the downsides of political processes and game-playing.
David Roodman: That sounds right to me. I mean, I said political behavior is human, and presumably it’s human because that has adaptive value, at least in our evolutionary history.
Robert Wiblin: So, do you think you’ve learned most of what you’ve learned just by trying to replicate these studies, and I guess in the process, you learn all of the methods that they’ve applied, and maybe even a little bit more?
David Roodman: That’s absolutely right. I have never taken a class in economics or statistics, but replicating existing work is actually a wonderful way to learn this stuff. It’s kind of like a scaffolding, and so when I did it for the first time, when I was at the Center for Global Development, which is a think tank in Washington that focuses on what … we used to call Third World Development. I was working under Bill Easterly, who’s now a pretty famous critic of foreign aid, among other things.
And he had me replicate what was then a very influential study on the impact of foreign aid on economic growth in the countries that receive it. And he used pretty elementary methods, I now understand, but they were totally new to me. But to have the paper on my left and the textbook on my right and the computer in the middle was a wonderful way to step-by-step learn the methods, and it’s been that way for me, throughout.
The one thing that might be more distinctively about me is that I’m a good coder, and there have been several times when I have needed a program to run the methods that I was trying to copy, and it wasn’t available to me, at least in Stata, which is a program that I’ve always used. So I wrote my own program, and through that learned more about some kind of family of methods, and then some of those programs have become popular.
Robert Wiblin: Do you think that most people can learn statistical methods this way, or are you just particularly smart or particularly well-suited to it?
David Roodman: What I think is that majoring in math actually worked out pretty well for me. Even though I didn’t do it for particularly good reasons, I think it might be a little bit like what they say about Latin, is that it teaches you some really useful skills in … you might call slow-mode thinking. Deliberative thinking. Which transfer to lots of areas.
I think that that is part of what happened for me. No, I don’t think that everybody would be able to do it as well as I have. Nor need they. By well, I mean, not everybody needs to skip formal training in order to do what I do. Go … if you know in advance that you love to do lots of replications, the skills you need can very easily be gotten say, through an economics PhD.
Robert Wiblin: So, when you were looking at the effect of foreign aid on economic growth, what kind of methods did you learn then? Just like, linear regression, I guess, but the other stuff as well?
David Roodman: That’s right, yeah. Most of the studies use straightforward linear regression with controls. That’s called ordinary least squares. Metaphorically, you’re just fitting a straight line to data. And then a lot of them also used what’s called instrumental variables. And there’s a simple form of that called two stage least squares. The idea is, you’re trying to find something … so we’re interested in the effect of foreign aid receipts on economic growth in receiving countries, but we’re worried that there could be reverse causation, for example, which would then mean that correlation doesn’t imply the kind of causation we’re interested in.
And so what you try to do is find a variable that could only affect economic growth via foreign aid receipts, like maybe the country happens to be a geopolitical … geopolitically important to the United States. So that might be Egypt, or Pakistan, or something like that. And as a result, gets unusual amounts of aid, and that constitutes a kind of natural experiment.
So the method there, instrumental variables, is a way of setting up that kind of experiment and only looking at the impact of the foreign aid that is explained by this deeper determinant.
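As a rough illustration of the two methods described here, the following sketch compares ordinary least squares with two stage least squares on simulated data. Everything in it is an assumption made for illustration: the variable names, the coefficients, and the data-generating process are invented and are not taken from any of the aid studies discussed. The endogeneity is modeled as an unobserved factor that pushes aid up and growth down, standing in for the reverse-causation worry, and the instrument is assumed to affect growth only through aid.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y = X @ beta + error."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def two_stage_least_squares(y, x, z):
    """2SLS with one endogenous regressor x and one instrument z."""
    const = np.ones(len(y))
    Z = np.column_stack([const, z])
    # Stage 1: keep only the part of x that the instrument explains.
    x_hat = Z @ ols(Z, x)
    # Stage 2: regress the outcome on that predicted part.
    return ols(np.column_stack([const, x_hat]), y)

rng = np.random.default_rng(0)
n = 20_000
true_effect = 0.5  # assumed effect of aid on growth in this simulation

z = rng.normal(size=n)  # hypothetical instrument, e.g. geopolitical importance
u = rng.normal(size=n)  # unobserved conditions that drive both aid and growth
# Aid rises with the instrument and flows toward countries doing badly (low u).
aid = 1.0 * z - 1.0 * u + rng.normal(size=n)
growth = true_effect * aid + 2.0 * u + rng.normal(size=n)

beta_ols = ols(np.column_stack([np.ones(n), aid]), growth)[1]
beta_iv = two_stage_least_squares(growth, aid, z)[1]

print(f"true effect:   {true_effect}")
print(f"OLS estimate:  {beta_ols:.3f}")
print(f"2SLS estimate: {beta_iv:.3f}")
```

In this setup the plain regression is pulled away from the true effect because aid is correlated with the unobserved factor, while the 2SLS estimate uses only the variation in aid that comes from the instrument. The comparison only works if the instrument really has no other route to growth, which is exactly the assumption that tends to be contested in practice.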
Robert Wiblin: I guess you would have been doing these replications fairly slowly to begin with. Why would people put up with that, have you as a research assistant very gradually learning statistics and just replicating papers that already exist?
David Roodman: Oh. Well. It depends on what you mean by slow. I mean, this project, that first one, the paper was by Burnside and Dollar, it was published in 2000. I mean, we did it over a couple months. Part of my job was to build the data set, which I could do quickly. That was straightforward. Part of it worked out just because I was cheap. So it didn’t cost Bill much … Bill Easterly much to send me off and work on something and then not think about it for a while.
Robert Wiblin: Did the paper replicate?
David Roodman: No. And Bill kind of suspected that it wouldn’t. We … the sample was, I don’t know, 70 or 100 developing countries studied from about 1970 to 1993. When I went back and rebuilt the data set, I was able to add some more countries and also add more years of data, since time had passed. So when we re-ran the numbers with the expanded data set, the result just flipped. The original result was that foreign aid works, in the sense of increasing economic growth in countries that have good economic policies, which was a very influential result because it gave foreign aid agencies a recipe for effectiveness, seemingly.
We found when we added data that we got a negative sign on the key term, and now it seemed as if giving more aid to countries with good economic policies actually slowed their economic growth. And we didn’t believe either one, really.
Robert Wiblin: So, we’re talking here about data replications, rather than experimental replications. You didn’t find another 70 or 100 countries to re-run the experiment. But what’s involved in doing a data replication?
David Roodman: What’s involved in data replication depends a lot on where the data come from. If they come from public sources, then it’s about downloading the data that you need, integrating it into a single file. I tend to use a relational database for that but you don’t have to. And inevitably when you actually get down to the fine details of doing that, questions arise. Ambiguities that you don’t appreciate until you try to copy something.
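The integration step can be sketched with a small relational database. This is a minimal sketch, assuming two hypothetical country-year tables, `growth` and `aid`, with made-up placeholder values; it is not the schema or the data of any actual study. It uses Python's built-in `sqlite3` module and a left join, so that country-years missing from one source show up as gaps instead of silently disappearing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE growth (country TEXT, year INTEGER, gdp_growth REAL,
                         PRIMARY KEY (country, year));
    CREATE TABLE aid (country TEXT, year INTEGER, aid_share_gdp REAL,
                      PRIMARY KEY (country, year));
""")

# Placeholder rows standing in for two downloaded public sources.
conn.executemany("INSERT INTO growth VALUES (?, ?, ?)", [
    ("Country A", 1990, 2.1),
    ("Country A", 1991, 1.4),
    ("Country B", 1990, -0.3),
])
conn.executemany("INSERT INTO aid VALUES (?, ?, ?)", [
    ("Country A", 1990, 5.0),
    ("Country B", 1990, 12.5),
])

# Merge on country and year; the LEFT JOIN keeps growth rows with no aid match.
rows = conn.execute("""
    SELECT g.country, g.year, g.gdp_growth, a.aid_share_gdp
    FROM growth AS g
    LEFT JOIN aid AS a
      ON a.country = g.country AND a.year = g.year
    ORDER BY g.country, g.year
""").fetchall()

# Rows with no aid figure are the kind of ambiguity to raise with the authors.
unmatched = [(country, year) for country, year, _, aid in rows if aid is None]

for row in rows:
    print(row)
print("country-years without aid data:", unmatched)
```

Listing the unmatched rows explicitly is one way to surface the fine-detail questions that only appear once the sources are actually combined.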
So then you usually have to send a set of questions to the original authors who may or may not answer, how helpful they feel like being [crosstalk 00:18:05]. How much they remember.
Robert Wiblin: May or may not feel enthusiastic about someone doing …
David Roodman: Or maybe they’ve … yeah. I’ve been told several times that, “That was a long time ago, the data are lost.” That kind of thing.
Now, increasingly I’m seeing studies … I guess I saw this especially in my work on criminal justice reform, looking at the impacts of incarceration on crime. Researchers are using much bigger data sets that are what we call administrative data. So, you know, a prison system or a school system. Lots of big government agencies are constantly aggregating data. Or, not aggregating it. Collecting data. And these … you know, at the student level, or the prisoner level.
And these may not be in the public domain but they will license it to researchers under restrictive conditions. And that can be much harder to obtain. We either have to go through the same licensing process, permission process. Or, in some cases, the original authors can pass on the data to us. They have permission to do that. But that can be a more … be a tougher thing.
Robert Wiblin: I was thinking with that question, most of the time if you’re trying to replicate a paper using more or less the same data, if you just run exactly the same analysis, then most of the time you get the same result. I guess they could have made a mathematical error or a coding error, in which case, that’s not true. But it sounds like you’re doing more than that. You’re also fiddling, or you’re changing the methodology a little bit. Seeing, does it hold up when you do it in your preferred way rather than their preferred way.
David Roodman: That’s right. There is a meme out there that research is not replicable. Meaning that when you do the study again, you just get a different answer. There’s a worrying study that was done in the field of psychology where they re-ran like 50 experiments or 100 experiments, which were all relatively small-scale, probably 50 to 100 subjects. And they just got different results a large percentage of the time. I don’t know the specifics.
And so there’s a concern that psychology research is simply not replicable. There have been a couple papers in economics arguing the same thing, although the one that comes to mind by some researchers at the Federal Reserve, counted a study as not replicable if when they contacted the authors, they never received the data after that.
I find in general, that if I go back to the original data sources and try to reconstruct a data study, I never get an exact match unless I have direct access to the researcher’s data and code. But it’s actually the exception for me to get a contradictory, a fundamentally contradictory result.
Most of the time I get something that’s close, and I say, “Yeah, looks like their analysis stands up on its own terms.” And so, the interesting stuff then comes down to questions like, “Well, is this robust?” If we make small changes, does the result go away? Or there may be questions that are more specific to a given study. “This researcher says that there’s a fingerprint in the data of a particular intervention. Am I convinced that that fingerprint is really there when I test it in a way that is convincing to me?” That kind of thing.
Robert Wiblin: So, is it possible to generalize about what fraction of the time the basic findings hold up and what fraction of the time that they’re not convincing to you? And maybe at what point do they fall by the wayside? Is it when you’re fiddling with specific analytical choices or something else?
David Roodman: I have found, at least in my work in the last few years for GiveWell and Open Philanthropy, that it’s been about 50% of the time that I reconstruct a study and end up still believing it. And it’s a pretty small sample. I have a tentative hypothesis that research is less reliable when it comes from a young researcher who is under very intense incentives to get that statistically significant result, which gets you into a good journal, which helps you on the tenure track. That’s a tentative theory.
Usually, the problems do come up, yes. After I have replicated the basic … successfully replicated the basic result and then I start to probe. And I’m thinking of cases in my head, and each one is kind of different. So I can give an example if you’d like, or can also ponder the ad hoc nature of most of my work, which I also worry a little bit about.
Robert Wiblin: Let’s do both of those.
David Roodman: Okay. One example. I did a review a couple of years ago here, at Open Philanthropy, on the impacts of incarceration on crime. Does putting more people in prison reduce crime? Or maybe does it even increase it? We have an active grant-making program in criminal justice, so this is a kind of due diligence in parallel with the actual grant-making program.
And I found 20 or 30 studies that looked at different aspects of this question. Most taking place in the United States. And of those, I was able to … I focused on ones that were relatively recent, situated in the United States, and tied to the prison boom, the mass incarceration boom that we’ve seen in the last few decades. [inaudible 00:22:36] from studies of sentences of two days. That was less relevant.
And then among those, I was able to obtain data and code or reconstruct data and code for about eight of the studies. In about seven cases I had some significant critique that I came to, and about four had actually kind of flipped my interpretation.
One that I think is relatively easy to explain looked at the impact of a law that was passed in the early ’90s in California called Three Strikes. Three strikes and you’re out. And it had escalate … it was a repeat offender law. Your first felony would get a normal sentence. If you then committed another felony, the sentence would be doubled. And a third one would lead you to 25 years to life. Very draconian. The felony had to be of a certain seriousness.
But you had people committing what seemed like relatively minor crimes, you know, drug dealing, what have you, that met a certain threshold who were then in prison for at least 20 years. The study was by Alex Tabarrok, who’s at the Marginal Revolution Blog, and Eric Helland, and they looked at whether people who had two strikes and therefore were right at the edge, if they committed another felony, of getting 25 to life, committed less crime than people who had just one strike.
And they did it in a smart way in order to improve the quality of that experiment, the comparability of the two groups. They said, “Let’s look at people who have … let’s only look at people who have been charged with two offenses in sequence that could be considered strike-able offenses, that add to your strike count. And then let’s look at people who actually got two strikes but also look at people who, on one of those trials, the judge ultimately convicted them of a lesser crime that didn’t add to their strike count.” So you had people who had two strikes and people who almost had two strikes, but maybe got lucky.
Robert Wiblin: But very similar people, hopefully.
David Roodman: Right. That’s the idea. And they do a number of checks to see whether these two groups are fairly comparable, and then they look at the difference in the recidivism rate. You know, how quickly do these people get re-arrested once they get out of prison? And they find some deterrence. People who had two strikes and were facing 25 to life got re-arrested about 10% less per unit time.
And so they shared the data with me on request, and I was … rather by accident, I discovered what I thought was a problem that changed my reading. I wanted to do a cost-benefit analysis in my paper, as I did, which required splitting out this impact by crime type. How much did murder go down? How much … or arrests for murder? How much did violent crime go down, and so on. Because those have very different ramifications when you’re doing a cost-benefit analysis.
And I discovered that the impact was entirely confined to drug crimes. Or the seeming impact. And, so one thing I thought was, “It’s debatable, what exactly is the social cost of a drug crime.” Maybe costly for you, as a consumer, but one can argue about whether you can factor that into a social-level cost-benefit analysis.
The other thing I found is that when I reconstructed one of the tables that just looked at whether the two comparison groups were similar, statistically (did they have the same age, did they have the same number of prior offenses), my table didn’t quite match the published one. I actually found that the groups were systematically different on the number of prior offenses. And it turned out that people who had more priors got re-arrested more after they were convicted of that second crime and were released.
So, it seemed like there may have been a failure of experimental design. In fact, the groups were not comparable, and so there was just a continuity over time. People who got arrested more before got arrested more after. And so that really reduced my faith in the power of draconian sentences to deter crime.
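The logic of that check can be sketched on simulated data. This is a hedged illustration only: the numbers, the group sizes, and the direction of the imbalance (the two-strike group having fewer priors) are assumptions chosen for the example, not figures from the Helland and Tabarrok data. Re-arrest here depends only on prior offenses, so the true deterrent effect is zero by construction, yet the raw comparison between groups still shows a gap.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# 1 = two strikes, 0 = "almost two strikes" comparison group (simulated).
two_strikes = rng.integers(0, 2, size=n)
# Assumption for illustration: the two-strike group has fewer priors on average.
priors = rng.poisson(lam=np.where(two_strikes == 1, 3.0, 4.0))
# Re-arrest probability depends on priors only; no deterrence effect is built in.
p_rearrest = 0.2 + 0.05 * np.minimum(priors, 12)
rearrested = (rng.random(n) < p_rearrest).astype(float)

def standardized_mean_difference(x, group):
    """Difference in group means, scaled by the pooled standard deviation."""
    a, b = x[group == 1], x[group == 0]
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return (a.mean() - b.mean()) / pooled_sd

# Balance check: are the groups comparable on prior offenses?
smd_priors = standardized_mean_difference(priors, two_strikes)

# Raw gap in re-arrest rates, which looks like deterrence.
raw_gap = rearrested[two_strikes == 1].mean() - rearrested[two_strikes == 0].mean()

# Gap after controlling for priors in a linear probability model.
X = np.column_stack([np.ones(n), two_strikes, priors])
coef, *_ = np.linalg.lstsq(X, rearrested, rcond=None)
adjusted_gap = coef[1]

print(f"standardized difference in priors: {smd_priors:.2f}")
print(f"raw re-arrest gap:                 {raw_gap:.3f}")
print(f"gap after controlling for priors:  {adjusted_gap:.3f}")
```

A large standardized difference on a pre-treatment variable is the warning sign; once priors are controlled for, most of the apparent effect in this simulation goes away. Real data would call for more careful modeling than a single linear control.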
Robert Wiblin: Yeah, I know Alex. How did the authors of that paper react to what you were saying?
David Roodman: Alex has been, I would say magnanimous. And I think he may be unconvinced. I think his reaction was … his first reaction was, “Well we never claimed that this is a perfect experiment.” Which is absolutely true. You know, we just did a number of checks. But I don’t think I shook his faith. He did blog my report publicly and he just said, “I could argue with this, but let me put that aside and just welcome the larger project that this represents.”
Robert Wiblin: Something that affects my interpretation is that I know Alex is not a law and order kind of guy. I think, if anything, he’s in favor of much shorter sentencing. So the fact that he found evidence in … that would point in favor of having longer sentences … I don’t think he would have been biased in favor of that for political reasons.
David Roodman: Yeah, that sounds right. I think of him as a libertarian, I don’t know if that’s right.
Robert Wiblin: Yeah, I think he would identify as reasonably libertarian. Interesting.
Okay. So you looked at … you tried to estimate the impact of prisons through deterrence, incapacitation while people are in prison, and the effects that prison had on people after they left prison, and their likelihood of committing further crimes. What was the cleverest thing you think you did when you were trying to estimate these three different things?
David Roodman: Well, I think where my energy went was into reconstructing the eight studies that I was able to get data and code for. And each one had a different story. There was one that was just flat-out obviously true, which was such a clean, simple one. It was the study of a mass release in Italy. Showed that crime jumped the next month in certain categories. Just incontrovertible. But other ones I found a lot of problems.
Some cases it didn’t actually change my answer once I tried to deal as well as I could with those problems, but one interesting one that did actually lead me to read the study differently had to do with two studies that use the same data from the Georgia prison system. The American state.
The studies looked at the decisions that parole boards make. So the way things work, and I think still work in Georgia, is that at trial, the judge gives you a certain sentence. Let’s say three years. You may not actually serve a full three years. A parole board, depending on various factors, may let you out early, in which case you serve out the rest of your sentence on parole.
When you’re on parole, it’s limited freedom. If you look at the history of the idea of parole, even the word, I think it goes back to the idea that you’re leaving on your word of honor that you’re going to be well-behaved. And if you cross any lines, if you fail to show up for an appointment with your parole officer, or you fail a drug test that might be required, or you get arrested, you can be yanked right back to prison without trial because it’s all conditional.
And this study, kind of like the three strikes study I told you about before, made a clever comparison. Not an actual experiment but a clever comparison that was as close as possible to an experiment, in order to look at whether people who are let out of prison sooner subsequently committed more crime, or committed less crime. And it found that actually there was a big effect. I don’t know, I think each month that your sentence was shortened, that your actual time served was shortened, increased by something like three or four percentage points the likelihood that you would return to prison within the next three years. That’s a pretty big effect for one month. You multiply it, say, by 12 months-
Robert Wiblin: Hold on. You’re saying if you spend less time in prison, you have a higher chance of going back to prison?
David Roodman: Exactly. Right. So this was seeming to say that keeping people in prison longer is reducing crime. Yes. And I should say that that was an uncomfortable conclusion for Open Phil, and so possibly that motivated me to dig into it more. Although, as I say, I think my bigger bias is I’m just a contrarian in general. And the study just seemed very clever.
But then I realized that there was actually another story to explain the result, which I call parole bias. And I’ll try to see if I can make this clear. If you imagine two people who are identical, they committed the same crime, they have the same sentence. Let’s say it’s three years. One is required by the parole board to serve his full sentence. The other is let out a year early, okay? And then you ask, you look at whether they got re-arrested within the three years after their release.
The person who got let out early is going to spend the first of those three years in the follow-up period on parole, which, as I just explained, is a period of very heightened probability for re-arrest. Okay?
Robert Wiblin: Oh, because you could easily get re-arrested, right?
David Roodman: The smallest infraction, yeah. Actually, re-arrest is not the correct term in this case. What she measured was return to prison.
Robert Wiblin: Re-incarceration.
David Roodman: Yeah. That’s important, actually. And so if you just think about that, what that’s saying is that under the system, if the parole board lets you out earlier, more of your three year follow-up is going to be in this period of exposure to very easy return to prison. And that itself could make it look like being in prison less leads to higher recidivism. Which is a point that I don’t think had been made before, and it was a source of an alternative theory to explain the results. To explain why it seemed like less prison time was actually increasing crime.
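The parole bias mechanism described above can be illustrated with a small calculation. This is only a sketch, not the analysis in the report: the two monthly return-to-prison probabilities are hypothetical numbers chosen for illustration, and the function name is made up. The only assumption that matters is that the monthly chance of being sent back is higher while on parole than after the sentence has fully expired, even for identical behavior.

```python
def prob_return_within_followup(months_early, hazard_parole, hazard_free,
                                followup_months=36):
    """Chance of returning to prison at least once during the follow-up window.

    months_early: months of the sentence served on parole instead of in prison.
    hazard_parole / hazard_free: hypothetical monthly return-to-prison
    probabilities while on parole and after the sentence has expired.
    """
    months_on_parole = min(months_early, followup_months)
    months_free = followup_months - months_on_parole
    # Probability of getting through every month without being sent back.
    stays_out = ((1 - hazard_parole) ** months_on_parole
                 * (1 - hazard_free) ** months_free)
    return 1 - stays_out


# Hypothetical values, for illustration only.
HAZARD_PAROLE = 0.02  # easier to be sent back while on parole
HAZARD_FREE = 0.01    # after the sentence has fully expired

full_term = prob_return_within_followup(0, HAZARD_PAROLE, HAZARD_FREE)
released_early = prob_return_within_followup(12, HAZARD_PAROLE, HAZARD_FREE)

print(f"Served full sentence:   {full_term:.3f}")
print(f"Released a year early:  {released_early:.3f}")
```

Under these assumptions the person released early shows a higher return rate in the three-year window purely because part of that window is spent on parole, not because less prison time changed their behavior.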
Robert Wiblin: Surprising that the researchers didn’t think of this possibility. To me, anyway.
David Roodman: Yeah. I can’t speak to that. I did … the author here is Ilyana Kuziemko, who was pretty helpful in helping me get the data and code, and we did have some back and forth on this.
Robert Wiblin: And did you manage to confirm whether that was the actual explanation? Or is it that you now have two competing explanations?
David Roodman: I did, as well as I could, test whether that could explain it. It wasn’t an easy thing to parse out. So I couldn’t conclusively decide either way, and it gets pretty in the weeds pretty fast if I say more, but I did some initial variants of the main regression where I tried to deal with its effect. May have overcorrected with it. But the results were consistent with this being the cause of the headline findings.
Robert Wiblin: Any other clever tricks that you want to discuss from that?
David Roodman: Oh, gosh. There’s another study of deterrence. We talked about one. Three strikes. The study was by David Abrams. Looked at the effects of state legislature passing gun add-on laws. So the idea is, if you commit a burglary without a gun, the sentence is X. If it involved a gun, it’s X plus Y. So he looked at whether in the months or the years following the adoption of such a law, crimes involving guns suddenly dropped or not. And looked across many states at once.
I wouldn’t … I don’t know if it was particularly clever but I was able to re-think the study and, I think, add value by going back to the underlying data source. The crime data all come from the FBI. And you can go to the FBI website and download, you know, number of gun robberies or whatever in each state, in each year, pretty easily. But the raw data are actually supplied by what are called law enforcement- or local enforcement agencies. LEAs. Which are, you could be the New York City Police Department, or it could be a much smaller [inaudible 01:22:23].
And they report monthly. And it’s a system of reporting that goes back at least to the ’60s, or probably far longer. And the data are messy, but it means that you can get not only much higher resolution geographically, but much sharper information on timing. Month, to month, to month, to month.
And that matters, because the law may have … a new law may have passed in March, and you can see whether there was a change in crime rates in April, or six months later, rather than just looking year to year, which is fuzzier.
So with a lot of effort, I was able to download all the FBI monthly raw data and then pull it together. It’s messy data, so that took a big effort. And then there were a lot of missing data points or bad data points so I had to develop some algorithms for identifying bad or missing data, and using a pretty fancy technique called multiple imputation to make guesses about what data would go there in a way that would not introduce false certainty by making unknown numbers look known.
It was … big number crunching monster, but the end result was graphs that had monthly resolution rather than annual resolution, and the seeming drop that was coincidental with the adoption of laws kind of disappeared and it just looked like a very smooth decline.
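To give a feel for what multiple imputation does, here is a minimal sketch. It is not the procedure used in the report: the monthly counts are made-up numbers, the imputation model (drawing from a normal distribution fitted to the observed months) is a deliberately crude assumption, and the function names are hypothetical. The idea it shows is the one described above: fill each gap several times with random draws, analyze each completed data set, and then combine the results so that the uncertainty about the missing values is carried into the final answer (the combining formula is known as Rubin's rules).

```python
import random
import statistics


def impute_once(series, rng):
    """Fill each missing value (None) with a random draw based on observed values."""
    observed = [x for x in series if x is not None]
    mu = statistics.mean(observed)
    sigma = statistics.stdev(observed)
    return [x if x is not None else rng.gauss(mu, sigma) for x in series]


def pool(estimates, variances):
    """Combine per-imputation results using Rubin's rules."""
    m = len(estimates)
    pooled_estimate = statistics.mean(estimates)
    within = statistics.mean(variances)       # average sampling variance
    between = statistics.variance(estimates)  # disagreement across imputations
    total_variance = within + (1 + 1 / m) * between
    return pooled_estimate, total_variance


def multiply_impute_mean(series, m=20, seed=0):
    """Estimate the mean of a series with gaps, using m imputed copies."""
    rng = random.Random(seed)
    estimates, variances = [], []
    for _ in range(m):
        completed = impute_once(series, rng)
        estimates.append(statistics.mean(completed))
        variances.append(statistics.variance(completed) / len(completed))
    return pool(estimates, variances)


# Made-up monthly crime counts for one agency; None marks a missing report.
monthly_counts = [120, 115, None, 130, 125, None, 110, 118, 122, None, 127, 119]
pooled_mean, total_variance = multiply_impute_mean(monthly_counts)
print(f"pooled mean: {pooled_mean:.1f}, total variance: {total_variance:.2f}")
```

The between-imputation term is what prevents false certainty: if the guesses for the missing months vary a lot from one imputed copy to the next, the reported variance grows accordingly.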
Robert Wiblin: How much time do you spend just trying to get data and code that you need to try to do a replication?
David Roodman: Usually, the coding will take longer than the data. Especially if it’s computation-intensive. That slows things down. But I’m trying different things, and, you know. It’s a coding project, you have to develop algorithms and you run things a bunch of ways. But that was a big part of … I mean, the incarceration and crime report probably took a full-time equivalent year, and the majority of that was the reconstruction of these eight studies.
Robert Wiblin: How often, when you just email the authors, are they like, “Well, here’s a spreadsheet”?
David Roodman: That is the exception, unfortunately. More often I get the data because, either the primary data is in the public domain, or the publishing journal required that it be posted somewhere.
Robert Wiblin: Is it getting better?
David Roodman: More journals are requiring data sharing. And I think maybe younger researchers, some of them are more apt to share the data. It’s tricky though, because they can understand the principle but they’re also concerned about getting tenured. And if I’m asking to reconstruct their data set, I’m a risk to them. And that’s a pretty important thing to them.
Robert Wiblin: Do they see any upside from you approving what they wrote?
David Roodman: Yes, but I don’t think it justifies the downside.
Robert Wiblin: Yeah.
David Roodman: So, for example, I’ve had someone say, “I will happily share the data and code with you after I get published.” And I just have to accept that. And that’s not the ideal from the point of view of getting to the best knowledge as quickly as possible, but it’s also understandable.
Robert Wiblin: It’s the system we have. So, what did you end up concluding overall on the question of whether longer sentences would reduce crime, or letting people out of prison would raise it?
David Roodman: Right, well, as you said, there’s the before, during, and after effects of incarceration on crime. The before effect is deterrence. Does having tougher sentences cause people not to commit crime? And I’ve just mentioned two studies to you of deterrence, one on the gun laws, one on three strikes in California. Both of those, I ended up not believing, and came to the conclusion that there really isn’t much deterrence at the kinds of margins that we’re talking about here in policy discussions. Obviously if there was no criminal justice system, there would be more crime. I think that’s pretty clear. But at the margin we’re at, I think it’s safe to say that deterrence is essentially zero.
Then there’s the during effect. Does crime fall significantly when you imprison more people? And the answer is yes. There’s several cases where … like, here in California, even. We’ve had two criminal justice reforms and they went into effect in 2011 and 2014. And certain acquisitive crimes like motor vehicle theft have gone up in a way that seems pretty clearly connected with those reforms. Much less with violent crime, though.
Then there are the aftereffects, and this is where it gets most complicated. Being in prison, you could imagine, could reduce the amount of crime you commit afterwards. Maybe you get new … there’s jobs training, or you learn to read, or you’re helped off of your drug problem, or you’re scared straight by the experience. But it’s also easy to imagine that being in prison just makes things worse, long run. That you’re more alienated from society, that you’re more close friends with other criminals and you learn your techniques. You have less ability to get a real job because you’re marked as a felon.
So that can swing either way, and it could dominate the overall answer. The majority of the studies that I looked at that are set in the modern American context, putting extra weight on the ones that I could actually replicate and came out believing, say that the aftereffects are harmful. So yes, you get a short-term benefit when you put more people in prison, reducing crime. But in the long run that seems to backfire. Increasing, actually increasing crime when you get out.
So as a very rough estimate, I would say, we’re at a margin in the United States today where … by the way, we have huge numbers of people in prison. You know. Per population, we’re the highest in the world except possibly North Korea. We’re at a margin where incarceration is not affecting crime. The marginal effect is zero.
And then we did a really interesting thing. This was prodded in part by Holden and his concerns that I was biased in the direction that was comfortable for us. He had me come up with a devil’s advocate reading, and then we also did a cost-benefit analysis using both my favored reading of the evidence and the devil’s advocate. The devil’s advocate says, “Actually, there is deterrence and actually the aftereffects usually are beneficial.” So, putting people in prison longer actually reduces crime.
And the cost-benefit analysis is a whole … its own world, you know? What’s the dollar value of a rape? There are different methods that people have come up with to answer that, and there are some answers out there that are usually used, and of course they’re highly debatable. But I found that if we take the devil’s advocate reading, which is to say that decarceration will increase crime, and we use the highest valuations on crime, which, again, is in favor of the devil’s advocate, so we really put a lot of dollar value on the cost of that extra crime, it came out about break even.
Each person year of … well, if we’re going to talk about incarceration, then I would say, “Each person year of lost liberty was a cost of about $92,000.” That’s dominated by valuing a year of liberty at $50,000 and then also the cost of prison. And that it was averting about $92,000 in crime. The numbers came out exactly the same, which should not be … that’s false precision. So even in the least-favorable reading … the least-favorable cost-benefit valuation of the least favorable reading of the evidence, it would break even.
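The break-even arithmetic above can be written out as a short sketch. The three dollar figures are the ones quoted in the conversation; the variable names are made up, and the remainder attributed to prison and any other costs is simply what is left after subtracting the value of liberty, not a figure stated in the interview.

```python
# Figures quoted above, per person-year of incarceration (devil's advocate case).
LIBERTY_VALUE = 50_000        # value placed on a year of lost liberty
TOTAL_COST = 92_000           # total cost of a person-year of incarceration
CRIME_AVERTED_VALUE = 92_000  # value of crime averted, using the highest valuations

# Whatever is not liberty is the cost of prison plus any smaller items.
prison_and_other_costs = TOTAL_COST - LIBERTY_VALUE

# Net benefit of one more person-year of incarceration.
net_benefit = CRIME_AVERTED_VALUE - TOTAL_COST

print(f"Prison and other costs: ${prison_and_other_costs:,}")
print(f"Net benefit: ${net_benefit:,}")
```

A net benefit of zero is the break-even result: even with every assumption tilted toward incarceration, the extra person-year buys no more in averted crime than it costs.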
Robert Wiblin: I’m surprised you didn’t talk more about crime in prison, which I feel would really push things in the direction that it sounded like you wanted to go. Because, I mean, prisons are just hotbeds of crime.
David Roodman: That’s an excellent point. I think my … it absolutely does bear mentioning and I should have mentioned it. Yes, if you count the crime that goes on in prison, that would presumably shift the calculation a lot. Putting somebody behind bars may therefore, just almost on the surface, increase crime.
I’ve de-emphasized it for two reasons. One is, there’s not much data on it. So I just didn’t know what to do with it. The other, I think, is my intuition that the audience I most want to persuade may not care. You know?
Robert Wiblin: They view it as part of the punishment, perhaps?
David Roodman: Right. Yeah, that’s part of … you know, if you didn’t want to deal with that, you shouldn’t have gotten yourself in trouble. And somehow I have to agree with that to feel like the more effective argument for reaching across to skeptics, is to de-emphasize that. But you’re right. It absolutely bears emphasis.
Robert Wiblin: How did people take this conclusion, given that it’s a potentially politically-charged issue? Were people persuaded?
David Roodman: I think internally it was accepted. There was no controversy here because it happened to be compatible with what we’re doing. I think we’ve talked to some activist groups on criminal justice reform who have been excited about the findings and want to figure out how best to communicate it and use it when they’re talking to legislators. Haven’t gotten a lot of pushback from skeptics, so either they didn’t bother reading it, or they thought it was okay. Probably they didn’t bother reading it.
Robert Wiblin: What character traits do you think are most important in a researcher?
David Roodman: You know, there are different kinds of research, and research, like any other field, benefits from diversity. We shouldn’t all be the same way and optimize on the same traits. So, I can reflect on what has made me useful in my way, in my distinctive way. But I wouldn’t suggest that that’s what everybody should aspire to. It’s more about figuring out who you are and how you can contribute.
I feel a strong desire to get to the bottom of things in order to reduce the chance that I’m wrong. I have aptitude with quantitative things and coding, and those are all very useful. I’m interested in hearing and synthesizing different views, whether about methodology or much broader questions. So, I have sort of a pluralist instinct in that way.
But there are lots of great researchers who contribute by being less interested in what other people do and just pursuing their own genius with aggression.
Robert Wiblin: So, a couple years ago you worked at the Gates Foundation and then moved to the kind of GiveWell/Open Phil cluster that you’re helping now. How do you find that the two compare, given that the Gates Foundation has, I guess, almost 60 times as many staff?
David Roodman: Well, maybe I should first explain how I ended up at Gates, because people may be interested in career moves. I was at the Center for Global Development for 11 years. When I joined there, it was a similar experience to earlier in my life. I knew I had some interests and some aptitude but really wasn’t sure how I would be useful. And Nancy Birdsall, the president, was kind enough to hire me and find ways to use me over time.
And it was there that I first discovered this interest in replication. But I think one thing that I lacked was that I’d never worked in a decision-making organization. Not an aid agency, or any other part of the government. Not a philanthropy, not a business.
And I think if you’re interested in policy-relevant work, like say, working at a think tank, it may be very productive to move back and forth between the think tank and a more practical setting. Because when you’re in the practical setting, you don’t have as much time to think but you encounter lots of questions that you wish somebody was figuring out. And then when you have some space to actually think and research, then you have the inspiration that comes from that really practical experience.
And I don’t think I had that. And so after 10, 11 years, I’d finished up my work on microfinance, and was having a lot of difficulty motivating myself around a new topic. I felt like I should be able to figure out what’s a valuable way to deploy my time, and I’m struggling. And … some point realized it was time for me to go. I was no longer growing.
And, decided I should work in a more decision-making institution. Went through a job search process and ended up at the Gates Foundation office in DC … where I lasted six months. I’m not ashamed to say that I was fired. And I probably shouldn’t go too much into what happened, it wasn’t like there was some dramatic story. But it clearly wasn’t a good fit. It’s a very big place. It’s like 1,200 employees I think? Last I heard. Plus a lot of contractors. And it’s giving away, I don’t know, four billion dollars a year? Something like that. Whereas I think Open Philanthropy might be up to $100 or maybe even $200 million.
Robert Wiblin: I think it’s about $200 million.
David Roodman: So it was giving away about 20 times as much, but with far more than … I don’t know, what would it be … far more than 20 times the staff.
Robert Wiblin: 60 times, or something.
David Roodman: This is a very lean place here. So it’s a large organization with hierarchy and various teams, and we talked about politics earlier. To do well there, you have to know how to work well in a very complex social structure. I don’t think I ever really learned that. I always lived in small organizations. And part of that is about understanding that speech can both be about getting to the truth, when you’re talking about some substantive topic.
But it can also have political implications. Maybe a disagreement won’t just be taken as a …
Robert Wiblin: Factual issue?
David Roodman: Factual issue. But it can be felt in another way. And I think I just wasn’t thriving there in the way that I needed to.
Robert Wiblin: Have you since kind of tried to learn those skills? Or are you just trying to find organizations where it doesn’t matter so much?
David Roodman: That’s a great question. I think mostly I have failed to improve in that way.
Robert Wiblin: [crosstalk 00:31:29] Do you think maybe that’s a virtue in a lot of cases? That if you learn how to do politics, then it would infect your research approach?
David Roodman: I think it is a virtue in my case, but it may be a luxury that I don’t have to think about it. You know, a broad thought that I have, having met many impressive people in DC over the years, is that people’s great strengths, or their strengths are also their great weaknesses. They’re often the same thing. It’s just whatever’s most distinctive about you is really useful in some contexts and really a problem in others, and we’re fortunate, people like you and me, and people listening, to have a lot of autonomy in life, and we’re not all just bound to be rice farmers.
And so we’re fortunate enough if [I 00:32:12] try to find our place in life, where what is distinctive about us is more often a strength than a weakness. So, I feel like I’ve had the luxury to not learn to be a very good politician, and that’s working out for me because I’ve managed to find places where it doesn’t matter as much, or it actually would be a detriment, even.
Robert Wiblin: I think that’s the question that’s come up at 80,000 Hours, is how much to try improving on your weaknesses versus just moving to somewhere that weaknesses don’t matter, or they even look like strengths. And I think the research we’ve read suggests that people can change their character somewhat, although it’s quite a gradual process, so they can, over a decade, get rid of their weaknesses with quite a lot of effort. But it’s not easy and you can’t do that on too many things at a time. Whereas you can potentially move location quite fast to a place where, you know, having low conscientiousness maybe, or being a bit too outspoken are not such big problems, and your strengths can shine through. Do you have a view on that? Sounds like you’re in favor of the moving rather than changing.
David Roodman: Oh, I don’t know. I mean, what you just described sounds very plausible to me. And I hear it mostly as a potential source for self-criticism. In other words, I haven’t tried to improve myself that way. Maybe I could have, and maybe my reluctance to try to improve is a fault. It maybe reflects my stubbornness, right? My resistance to feedback from others. So, it sounds like, I mean … what is implied in what you’re saying is that one should do both, I think. Lifelong growth is a great thing, and especially as one gets older, there’s a strong temptation not to push yourself to change. It seems good to resist that. But you also have to approach yourself with humility and recognize you can only push yourself to change so fast and in the meantime, if there are cheaper ways of finding ways to [inaudible 00:34:01] being happy, you know, go for it.
Robert Wiblin: Alright, let’s talk a bit more about your research approach. So, before you’ve even chosen what question to research, I mean, how do you figure that out? Do you often kind of turn down projects because you don’t think you’ll be able to make a good go of it?
David Roodman: That’s happened occasionally that I’ve turned something down because I just don’t feel like I can contribute much. But what I like about, really like about being at GiveWell and Open Phil is that people come to me with questions that have practical relevance for decisions that are being made, or, you know, conceivably could be reversed.
Robert Wiblin: Right.
David Roodman: And that in itself is inspiration. I talked to you about how I’ve lacked that kind of inspiration that comes from practical experience, whereas here I get a sense that I get it. A topic like the impact of incarceration on crime looked really boring to me. But once I knew it actually mattered for things we were doing with real money, that was the motivation to get into it. And almost anything is interesting once you get into it.
And I should say, I don’t see myself as somebody who can only do research reviews. That’s what I’ve done here, out of a kind of comparative advantage-type argument. But at the moment actually I’ve pivoted to something new that is very raw in my mind, so I probably can’t speak about it very clearly, which is to participate in the internal discussions that we’re having about what we call cause prioritization. The philosophical issues that come up when you think about how much to put into animal welfare, versus taking care of people. And how much to worry about problems today versus the far future. And that’s not nearly as quantitative a set of questions, but I still feel that I can contribute in the same spirit of seeking out lots of different views and trying to synthesize and think critically.
Robert Wiblin: When you start a new project, can you kind of walk us through the process that you’ll go through? Do you start by trying to collect the data, or do you read a lot, broadly, about the topic to kind of situate yourself in it?
David Roodman: It’s a very organic and ad hoc search process, usually. So, typically when somebody comes to me with a new topic, they’ll say, “We read this.” Or, “We’ve got this paper we think you should look at.” And that’s enough to start exploring the network. You read that, it cites other sources. In some cases you Google authors’ names, or even talk to them. So, it’s not very structured, really. And I sometimes worry about that.
I think … my most recent replication was of a paper on the impacts of the deworming campaign in the American South, about a hundred years ago. And that one … actually, and then, it was done by an economist named Hoyt Bleakley. And he did a companion paper using similar methods and some of the same data, looking at the impact of malaria eradication.
I replicated both. And especially in the companion paper, I made an effort, for the first time, to pre-register. To say, “Here’s what I intend to do,” and then put that on a third-party website that could prove when I had submitted the document. And that was out of a sense that I need to start becoming more conscious of what I’m actually doing, and that there is potential for biases to creep in if I don’t do that. But to date, it’s been pretty informal exploration most of the time.
Robert Wiblin: What kind of biases would you worry about?
David Roodman: Well, this came up … this was some good feedback I got from Holden, Holden Karnofsky, who’s the director at Open Phil and is my boss, when I was working on the impacts of incarceration on crime. He wanted to know, how did I choose which studies to really dig into and try to replicate? Because he worried about bias. And he was mostly worried about bias of the kind that would make my results comfortable. He wanted me to make sure that I was doing everything I could to make us uncomfortable, because that’s where the value of this work comes from.
And I realized I didn’t have a great answer for him, and that he was probably right. There were certain studies that I was more skeptical of because they came to a conclusion that would challenge Open Phil and its priorities, and so I was more apt to dig into those. Although, to be honest, I think a bigger bias I have is against anybody who claims to have a really statistically significant and large result from non-experimental data, especially. I’m just sort of a contrarian. I think I’m more of a contrarian than biased one way or another on a lot of these issues.
But I realized I didn’t have a complete good answer for him, so then when we were revising the document, I made a second round of attempts to get data and code in a more systematic way.
Robert Wiblin: You mean to choose which papers to scrutinize more at random?
David Roodman: Yeah, well, you know, I had already replicated, I don’t know, six or eight of them, and I then wrote to authors of other papers that were in a certain sampling frame that I described earlier. You know, set in the US, not too long ago, focusing on margins of punishment that are relevant for the mass incarceration debate. Not, you know, many years, not many days. Which ended up not yielding any more studies, as it happens, but it was an education for me that I need to be better able to explain the path that I’m choosing.
Robert Wiblin: Do you start writing early, or do you kind of spend a lot of time playing with the data before you put pen to paper?
David Roodman: I would say I do not start writing early. I think it can be a good discipline. I worked with a guy years ago named Alan Durning, who founded the Sightline Institute, which actually I think we fund, in Seattle. And he always said, “The first thing you should do when you’re embarking on a long project that could take a year of writing, is write the press release.” And probably once you get to the end of the project and you’re actually ready to launch it, you’ll completely trash that press release, but it can be really good for helping you focus on the bottom line. What are the key questions that you’re trying to get at?
Robert Wiblin: How do you know when to stop?
David Roodman: I think it’s synonymous, in a way, with the question of, “How do I make judgments?” Because once I reach a judgment, then I feel like it’s okay to stop. That doesn’t mean I can’t learn more, but it’s an important turning point. Actually, I’m thinking aloud here. I’m not even sure I believe what I just said, because in a lot of cases, you just have to make the best call you can with the data you have. You always have to do that. It may not be as much as you’d like.
I’m not sure. I think we all go through processes of trying to figure things out, and at some point we get to a sense … a point where we have a settled understanding. It may subsequently evolve, but it’s a mature thing that’s ready to be shared with the world.
Robert Wiblin: Has anyone written a guide to doing what you do, or is it somewhat distinctive?
David Roodman: I’m not aware of anything like that. We just had a very informal meeting here at Open Phil where I was asked to speak for a few minutes on what I look for when I’m reading research, and I struggle with it because a lot of what I end up doing with the study is very specific to that study. There are some principles, but I have not seen them written up in any way.
Robert Wiblin: Okay. So if it’s hard to generalize, we should dig into some specific analysis that you’ve done, I guess. Figure out what methods you used in each one. Let’s tackle the geomagnetic storms topic first, which I found particularly interesting. So, in 2015, and then again in 2017, you looked at the risk of geomagnetic storms messing with the electrical grid and, I guess, other electrical equipment. So, yeah, what approach did you take there? And feel free to go into as much detail as you want.
David Roodman: So, a lot of our interest at Open Phil is in existential risks, and there are many of them, as I’m sure many of the listeners already know. A few years ago … actually, when I was working as a consultant, before I became an employee, I was asked to dig into this one question of geomagnetic storms. What happens is, you know, and this is obvious, I’m not a physicist. I’m a statistician more than anything else. So I don’t understand a lot of what I’m about to describe.
There are these big cataclysms on the Sun. And, they cause the ejection of coronal matter, which then gets hurled away from the Sun and might collide with the Earth. It is typically magnetically charged, that is, the particles are systematically magnetically-oriented, and so it’s like this little magnet coming and clobbering the Earth.
And smaller versions of this are what cause the Aurora Borealis. And, as with the Aurora Borealis, and I guess the southern one is called the Aurora Australis? You told me-
Robert Wiblin: I think that’s right.
David Roodman: The Earth’s magnetic field actually channels the material towards the poles, and so at high latitudes is where you get the impact, and it’s a little bit like, you know, if you do a cannonball or you know, you drop something huge into the water. It creates a lot of turbulence and disturbance. What happens is that the local magnetic field, especially at high latitudes, will start to oscillate in kind of random but high amplitude waves.
Changing magnetic fields, in turn, induce electrical currents in any wires that happen to be nearby. One scenario that people have been worried about is that a really big storm could induce really large currents in long-distance power lines, which would fry the transformers that are at either end of these. These are what change the voltage.
A dam might produce power, I don’t know what, a hundred volts or whatever. And then that gets stepped up to a much higher voltage like 765,000 volts or even a million volts for long-distance transmission because that reduces the energy loss. And then there’s a transformer at the receiving end which then steps down the voltage again. The little boxes that you use to charge your phones and computers, those are transformers. They’re converting the oscillating current in the wall into the direct current that your computers and phones need.
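A rough sketch of why stepping up the voltage cuts transmission loss, assuming a simple resistive line and ignoring AC effects such as power factor. The delivered power and line resistance are hypothetical numbers chosen only to show the scaling: for a fixed power, current falls in proportion to voltage, and resistive loss falls with the square of the current.

```python
# Hypothetical numbers, chosen only to illustrate the scaling.
POWER_W = 100e6             # 100 MW sent down the line (assumed)
LINE_RESISTANCE_OHM = 10.0  # total resistance of the line (assumed)


def line_loss_watts(power_w, voltage_v, resistance_ohm):
    """Resistive loss I^2 * R for a line carrying power_w at voltage_v."""
    current_a = power_w / voltage_v
    return current_a ** 2 * resistance_ohm


loss_low = line_loss_watts(POWER_W, 100e3, LINE_RESISTANCE_OHM)   # 100 kV
loss_high = line_loss_watts(POWER_W, 765e3, LINE_RESISTANCE_OHM)  # 765 kV

print(f"loss at 100 kV: {loss_low / 1e6:.2f} MW")
print(f"loss at 765 kV: {loss_high / 1e6:.2f} MW")
print(f"ratio: {loss_low / loss_high:.1f}x")
```

Under these assumptions the loss ratio is just the square of the voltage ratio, whatever power and resistance are plugged in.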
But there are also transformers that are as big as houses, and these could get fried. These could just get destroyed in seconds, maybe. Or minutes. Which would then cause blackouts. And the worry is that this could happen over a very large area, continental-scale area, and then we would have to replace hundreds of these giant transformers. In the meantime, there would be large-scale blackouts lasting for months. There are not a lot of these … not a lot of spares around. These are custom-built, huge things. And they can take months, each, to manufacture. A long-term, large area blackout could be an economic and humanitarian crisis. Because if you don’t have power, then you can’t … maybe your pipeline shut down, maybe the hospitals don’t work, et cetera.
Robert Wiblin: Can’t move food around.
David Roodman: Can’t move food around. Yeah.
Robert Wiblin: Can’t store food.
David Roodman: Right. All these systems that depend on each other and on power could collapse. So it’s pretty scary. And there’s a couple of authors, in particular, whose work on this possibility has been cited widely in the press and was getting attention.
I dug into it and did my best to understand the physics and the astronomy. And then gravitated to a statistical aspect of the question, because that’s where I could make the biggest contribution. What I looked at was what the history of these events allows us to say about the probability of more in the future.
So, there are different ways of measuring the magnitude of a geomagnetic storm, and depending on how you measure it, the data are available for 20 years or 50 years. One measure is, for complicated reasons, when a storm hits, it actually reduces the strength of the Earth’s magnetic field at the equator. So you can collect, minute by minute, magnetic data at two magnetic observatories. And that is done, and there’s actually four observatories that are used to construct this particular index, which is called the Disturbance Storm-Time Index. Dst Index. And that thing gives you a number to represent geomagnetic disturbances.
We have that going back to 1957, and we can model that and ask, “Well, what is the probability of there being a storm in the next decade, of a certain magnitude or higher?” The example that everybody worries about was a big storm that hit in 1859. Now, of course, in 1859, there wasn’t much power structure to worry about. Apparently, a few telegraph operators were electrocuted, and there were spectacular auroras quite far towards the equator in both hemispheres. People wonder what would happen if we had a storm that big again today.
I came away with a kind of paradoxical message. I think that the people whose work got the most attention were exaggerating the risk. If you just do the analysis, and I can explain in specifics. They were overestimating. I should qualify that. It’s not that they were exaggerating the risk, because the risk is unknown. But their extrapolations from history were not well done, and were overshooting.
On the other hand, there’s so much we don’t know. This is a pretty under-researched area. And this is an area where we actually can learn more. Economists often struggle to figure out whether inflation causes growth or the other way around, and it’s just sort of a thing that goes on forever. They can never figure it out.
But with money, we could figure out more about how these kinds of storms affect actual transformers. That’s an actual research program that just can be done. So there’s a real opportunity to learn more and reduce our uncertainty.
I came to be persuaded that this big storm in 1859, called the Carrington Event, was probably only at most twice as big as storms that have occurred say, since 1950. Which doesn’t sound that scary. The biggest event we’ve had since 1950 was 1989. March of 1989, there was a storm that caused a blackout in much of Quebec. It destroyed a couple transformers. But, within about 12 hours, the power was restored. It was hardly a catastrophic event.
So, 12 hours of power loss in one part of Canada. If we then double storm strength, should we expect a totally different level of impact? I would cautiously say that that seems unlikely.
Another confusion was that, because there’s a lot of turbulence … if you imagine looking at the ocean during a storm, you can see ordinary waves. But then if you look at any individual wave, you’ll see smaller ripples, and so on. It’s kind of fractal. I think that’s a good visual metaphor for what happens when one of these storms hits. There’s a huge amount of local spiking, and it’s very tempting to say, “Well, the largest spike in magnetic field that we saw anywhere on the Earth that was measured is X. So now let’s assume that a storm could cause X everywhere at once.” And that’s not an appropriate extrapolation. It’s like assuming … imagining that every place is as high as Mount Everest. But that kind of fallacy was embedded in some of the most scary analysis.
My overall take was if we extrapolate from the historical record, which is short and shouldn’t be over-relied upon, that the chance of an event as big as the Carrington Event of 1859 recurring was about 0% to 4% per decade.
One caveat I would … and as I say, that event in itself didn’t seem … doesn’t sound to be so scary because it’s twice as big as events that civilization shrugged off very easily. I think the biggest caveat is that I learned about some research just as I was finishing up that looked at tree rings from trees in Japan, I think, and found that there was a very sharp jump … I forget. I think it was probably in an isotope of carbon dioxide, but I’m not 100% sure, in the tree rings between a couple years in the 700s … going by the Western calendar. And again in the 900s. It’s more than a thousand years ago.
And the best explanation for those giant jumps is apparently extraterrestrial radiation, conceivably from another galaxy or another star, but probably more likely our own Sun. And this would imply a solar flare, I don’t know, ten or twenty times bigger than anything we’ve witnessed in modern history. But solar flares are not the same thing as geomagnetic storms. Solar flares are huge outputs of pure radiation. They may be associated with ejections of actual coronal matter, which is what we’re concerned about. But the association is not well understood, and in particular it’s not clear whether a solar flare ten times as big as anything we’ve witnessed in modernity would lead to geomagnetic storms ten times as big.
Robert Wiblin: I just want to understand the engineering aspect. Is it that the transmission lines are very long, so that they kind of pick up a lot of the magnetic change? And so the longer the cable, the worse it is?
David Roodman: That’s right. Changing magnetic fields. If you’re in a particular spot and the magnetic field around you is varying, that induces a voltage where you are. This is one of the principles that makes motors work, and generators. We measure electric fields in volts per meter, or volts per kilometer. So if you have a wire that is running many, many kilometers and it’s immersed in this very strong electric field, then yes, that will multiply the effect and induce a larger current.
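As a minimal illustration of the volts-per-kilometer point, assuming a uniform electric field aligned with a straight line: the voltage along the line is the field strength times the length. The field strength and line lengths below are hypothetical, and the sketch deliberately stops at voltage, since the resulting current also depends on line, transformer, and grounding resistances that are not discussed here.

```python
def induced_voltage(field_v_per_km, length_km):
    """Voltage along a straight line in a uniform field aligned with it."""
    return field_v_per_km * length_km


FIELD_V_PER_KM = 2.0  # hypothetical storm-time field strength

for length_km in (10.0, 100.0, 500.0):  # hypothetical line lengths
    volts = induced_voltage(FIELD_V_PER_KM, length_km)
    print(f"{length_km:6.0f} km line -> {volts:7.0f} V end to end")
```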
Your question about the engineering, though, reminds me of another interesting theme in this. One voice in this discussion is some engineers at ABB. ASEA Brown Boveri, I think it is, which makes a lot of these big transformers, and I think because of mergers, is now sort of retroactively the maker of the majority of transformers in the United States. So engineers there have put out papers saying there’s really nothing to worry about. And we have to take what they say with a grain of salt, because they’re basically saying, “Our products are great.”
Robert Wiblin: Wouldn’t they want to say, “Our products will break. You’ll need spares.”
David Roodman: Yeah, you’d think that. But I guess it cuts both ways. I don’t know. I remember testing that idea. I don’t know, I think maybe if they say that-
Robert Wiblin: Everyone’s transformers are bad.
David Roodman: Yeah. If you admit your products are bad, then obviously then that may push people to go to the competitor. [inaudible 00:51:20] competitor keeps its lips tight.
So anyway, we need to take what they say with a grain of salt, especially because they’re using models that they won’t share. So they make claims that, “We’ve done simulations and everything’s fine.” But they won’t really let anybody else check that.
Nevertheless, I found there was an interesting argument in what they said, that I couldn’t dismiss on principle. Electrical power grids have all sorts of components that are designed to regulate the waveform of the power, the exact frequency, keep the waveforms from different generators in sync. All sorts of machinery to keep this very complicated thing working just right. It’s really extraordinary precision over large areas. Or, if they can’t do that, to shut the system down.
So what they’re saying is that if there’s a big geomagnetic storm, it has two main effects. One is that it almost instantaneously starts disrupting the flow of current, in the waveform of the current. The other is that, over the scale of say, 20 minutes, it will start to pour energy into transformers and heat them up, causing damage. But those are two different time scales. You know, milliseconds and minutes. And they’re saying that there’s a lot of safety equipment in place that will automatically shut the grid down if things get too disrupted.
And so the result could be a very large and quick blackout. But it actually protects the system. So, with short-term fragility comes long-term resilience. So what we may have to actually worry about more is not the really massive events, but smaller events that damage transformers but don’t disrupt the power flows enough to trigger the safety mechanisms. And there’s some actual evidence that that’s happened, for example, in South Africa, which is closer to the equator and therefore doesn’t experience the storms as strongly.
So that, if there were a big storm, maybe the longest-term damage would actually be in places that are closer to the equator, less affected by it. And what would happen is, over, say, the next year or so, a lot of their transformers would shut down. Which, to me doesn’t sound like an existential risk.
Robert Wiblin: So is it the entire transformer that gets broken? Or is it just some piece that we can stockpile and then replace later on?
David Roodman: Well, now you’re pushing … I don’t really know that well. My intuition is, the damages can be pretty severe.
Robert Wiblin: Is it an explosion? It’s getting hot and breaking?
David Roodman: Apparently, there have been explosions. Yeah. You know, the key components of a transformer are a magnetic core, and then lots and lots of wire wrapped around it. And then it’s immersed, usually in oil, for cooling purposes. And so when these things overheat, you can … I suppose the cores, the actual magnetic cores are not that harmed. But all the wires, their insulation could get burned, the wires could melt together, the oil can catch fire or absorb impurities. Sounds like a pretty big repair job.
Robert Wiblin: So what was the biggest challenge there with this research project?
David Roodman: The interdisciplinary nature of it. I studied a bit of … about electronics when I was a kid, so I had some background and some understanding of how electricity and magnetism are connected together. But I was certainly out of my depth in going beyond those basics, or trying to understand solar physics, or what have you.
And so it was trying to understand enough of the literature that I could say the kinds of … make the kinds of summary statements I’m making to you. Or even engage in conversations with the guys who actually knew this stuff. Which I did. Even that, you need a certain level of understanding.
Robert Wiblin: Did you have to learn or maybe even invent some new methods to reach a good conclusion?
David Roodman: No, I didn’t invent anything new, although it was another case of my wanting to implement a particular method to do the statistical extrapolation that wasn’t easily done in Stata, which is the statistics software that I know. So I ended up writing a program and going beyond just what I needed and writing a general purpose program that is now shared and ultimately getting very immersed in one particular technical question about how to construct confidence intervals, which led to an obscure and separate academic paper.
Robert Wiblin: So, I thought you’d say that the biggest challenge was that there just aren’t many historical events so it’s hard to know what’s the likelihood of an extreme event going forward when we just haven’t … we’re trying to predict the frequency of something that’s never happened, or at least not the last few hundred years. How did you get around that issue?
David Roodman: I would say that I did not get around it. I did the best with the modern observatory data that’s, you know, observed every hour and provides the basis for good statistical analysis. And then I zoomed out and acknowledged that there are longer-term dynamics which remind us that there’s a lot that we don’t understand and things could change more than the brief historical record that we have … would suggest.
You know, the question ultimately is, “What should we do?” That’s the important question. We don’t have to have complete understanding of the underlying … the physical reality in order to come up with a good answer for that question. My answer was that we shouldn’t panic, that there is some exaggeration here, but despite my more reassuring conclusions based on limited data, we can’t rule out some serious tail risk. And there’s a real opportunity here to improve our knowledge.
Robert Wiblin: Would it be expensive?
David Roodman: Compared to the stakes, no. Whether Open Phil would consider it to be expensive, maybe. It’s not cheap. What you really want to do to understand better how transformers are affected by these storms, is you want to have an actual, full-size transformer supplying, shall we say, a small city, and then being inundated with these additional large and volatile currents. But there aren’t a lot of spare small cities around, you know. It can easily, you know, I would imagine, run into tens of millions of dollars to run realistic field experiments.
Now, if you’re concerned about the fate of the global economy, that’s nothing. The question [crosstalk 00:57:03] might be daunting even for Open Philanthropy.
Robert Wiblin: But who’s going to pay for it?
So you mentioned, kind of, the fat tail-ness of the distribution. I guess we have a reasonable sense of the frequency of common, probably small geomagnetic storms. Can we then kind of extrapolate? Just say, “Well, it’s not going to be a normal distribution but it’ll be a power law or something like that,” and from that we can figure out the frequency of something that’s never happened before?
David Roodman: Yeah, that’s an idea that I develop in my report. Probably a lot of your listeners are familiar with at least the rough idea of the central limit theorem in statistics. This is a really key result. It says that, for example, if you were to conduct the same presidential poll, the same moment in time. Maybe you did that poll a thousand times. You would get a slightly different answer each time, each run of the poll, but your answers would cluster around the true value and they would do so in a pattern that follows a bell curve. That’s also called the normal curve.
And that’s true regardless of the actual underlying distribution of views in the world. Almost every case that we can imagine, you get a bell curve when you repeatedly sample. And that’s a really powerful result because it means you can start to construct confidence intervals while remaining ignorant of the underlying distributions of the things you’re studying.
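The repeated-poll thought experiment can be simulated directly. This is a minimal sketch in which the true level of support, the poll size, and the number of polls are all hypothetical. It checks that the poll results cluster around the true value with the spread the normal approximation predicts.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

TRUE_SUPPORT = 0.52  # hypothetical true share backing a candidate
POLL_SIZE = 1000     # respondents per poll (assumed)
NUM_POLLS = 1000     # number of repeated polls


def run_poll(true_support, poll_size):
    """Return the share of sampled respondents who back the candidate."""
    yes = sum(random.random() < true_support for _ in range(poll_size))
    return yes / poll_size


results = [run_poll(TRUE_SUPPORT, POLL_SIZE) for _ in range(NUM_POLLS)]
mean = statistics.mean(results)
spread = statistics.stdev(results)

# Standard error predicted by the normal approximation.
theory = (TRUE_SUPPORT * (1 - TRUE_SUPPORT) / POLL_SIZE) ** 0.5

# A bell curve puts roughly 95% of draws within two standard deviations.
within_2sd = sum(abs(r - mean) <= 2 * spread for r in results) / NUM_POLLS

print(f"mean of polls: {mean:.4f} (true value {TRUE_SUPPORT})")
print(f"observed spread: {spread:.4f}, predicted: {theory:.4f}")
print(f"share within 2 sd: {within_2sd:.3f}")
```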
We can do something similar when we’re looking at extreme events. It turns out that, you know, we don’t know what the true statistical distribution of geomagnetic storms is. Some people have argued that it’s kind of a power law, something else. Turns out that, when you look at the tail of the distribution, you know, the way it’s sort of gradually coming down to zero and flattening out. Most tails are the same. That is to say, they fall within a single family of distributions. It’s called the generalized Pareto family. They vary in, you know, whether they actually hit zero or not, and how fast they decay towards zero. But they kind of look the same, and regardless of what the rest of the distribution looks like.
So what you can do is you can take a data set like, all geomagnetic disturbances since 1957, and then look at the [inaudible 00:59:09] say, 300 biggest ones. What’s the right tail of the distribution? And then ask which member of the generalized Pareto family fits that data the best? And then once you’ve got a curve that you know … you know for theoretical reasons is a good choice, you can extrapolate it farther to the right and say, “What’s a million year storm look like?”
And one also has to be careful about out-of-sample extrapolations. But I think it’s more grounded in theory to use the generalized Pareto family, because it is analogous to using the normal family when constructing usual standard errors, than to, for example, assume that geomagnetic storms follow a power law, which was done in one of the papers that reached the popular press. So there was a Washington Post story some years ago that said the chance of a Carrington-size storm was like 12% per decade. But that was assuming a power law, which has a very fat tail. When I looked at the data and allowed the data to choose within a larger and theoretically motivated family, the model fit did not gravitate towards the power law.
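The tail-fitting procedure described above can be sketched roughly as follows. Everything here is an assumption made for illustration: the data are synthetic exponential draws rather than real Dst readings, and the fit uses a simple method-of-moments estimator, which is not necessarily the estimator used in the report or in the Stata program mentioned in this interview. The steps are: keep the 300 largest values, fit the generalized Pareto shape `xi` and scale `sigma` to their excesses over a threshold, then extrapolate the fitted tail beyond the largest observation.

```python
import math
import random


def fit_gpd_moments(excesses):
    """Method-of-moments fit of a generalized Pareto distribution.

    Returns (xi, sigma), the shape and scale. Only valid when xi < 0.5,
    because the estimator relies on the variance being finite.
    """
    n = len(excesses)
    mean = sum(excesses) / n
    var = sum((x - mean) ** 2 for x in excesses) / (n - 1)
    ratio = mean ** 2 / var          # equals 1 - 2*xi for a true GPD
    xi = 0.5 * (1.0 - ratio)
    sigma = 0.5 * mean * (1.0 + ratio)
    return xi, sigma


def exceedance_prob(level, threshold, xi, sigma, tail_fraction):
    """P(X > level) for level >= threshold under the fitted tail."""
    y = level - threshold
    if abs(xi) < 1e-9:               # xi = 0 is the exponential tail
        survival = math.exp(-y / sigma)
    else:
        base = 1.0 + xi * y / sigma
        survival = base ** (-1.0 / xi) if base > 0 else 0.0
    return tail_fraction * survival


random.seed(1)
# Synthetic stand-in for storm magnitudes, NOT real observatory data.
data = sorted((random.expovariate(1.0) for _ in range(20000)), reverse=True)

TAIL_COUNT = 300                     # "the 300 biggest ones"
threshold = data[TAIL_COUNT]         # largest value outside the tail sample
excesses = [x - threshold for x in data[:TAIL_COUNT]]
tail_fraction = TAIL_COUNT / len(data)

xi, sigma = fit_gpd_moments(excesses)
extreme_level = 2.0 * data[0]        # twice the largest observed value
p_extreme = exceedance_prob(extreme_level, threshold, xi, sigma, tail_fraction)

print(f"shape xi = {xi:.3f}, scale sigma = {sigma:.3f}")
print(f"P(single draw exceeds {extreme_level:.2f}) = {p_extreme:.2e}")
```

A fitted `xi` well above zero would indicate a fat, power-law-like tail. A value near zero indicates an exponential-like tail, and a negative value a tail with a finite upper bound. Letting the data pick `xi` is what distinguishes this approach from assuming a power law up front.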
Robert Wiblin: This kind of log-normal or normal curve, or power law, are they all special cases of this generalized family?
David Roodman: Their tails are.
Robert Wiblin: Okay. Tails.
David Roodman: Like if you were, you know, if you take the right-most 1% or the right-most tenth of a percent, they will more and more closely approximate a member of this one particular family.
Robert Wiblin: How much do you rely on interviews with experts in your research in general?
David Roodman: I don’t rely on them as much as, I would say my colleagues do when doing other work that is published at Open Phil. There’s a lot of interviews that are done here, and the notes are printed up and so on. But I do very much value when I get to a certain point and I think I’ve got a new understanding of some question, but I’m not confident in it yet because I’m new to the field and the ideas are new to me. I love being able, at that point, to call up an author and test my understanding. And very often, you know, my understanding gets reversed or I get pointed in new directions.
Robert Wiblin: Why don’t you think Open Phil has given many grants to deal with geomagnetic storms?
David Roodman: I should know the answer to that. I think the decision was made when I was still on a consulting basis here, and so I was on the outside. And so I’m not sure. We have made one grant to a researcher whose work I mentioned before, in South Africa. But that was not part of a systematic effort to take on this area.
I think probably people became convinced that other existential risks looked bigger. We’re doing a lot of work on, you know, pandemic preparedness and bioterrorism preparedness, and also we’re looking at AI safety, a couple other areas.
Robert Wiblin: I guess one thing is that a geomagnetic storm wouldn’t affect the whole globe all at once, right? It would just affect some part of [inaudible 01:01:51].
David Roodman: That’s a good point, yeah. It doesn’t literally seem to represent an existential risk. Certainly a catastrophic risk, but.
Robert Wiblin: Are there any careers you’d like to encourage anyone to go and work on in this area?
David Roodman: Well, if you have an aptitude for engineering, yeah, we definitely need more research because I just think there isn’t much attention being paid to this, and the stakes are potentially quite large.
Robert Wiblin: All right. Let’s move on from the geomagnetic storms to talking about research you did on the impact of deworming. Trying to figure out whether it really does improve child health and test scores and things like that. That’s been quite a source of controversy, informally known as the Worm Wars. What did your analysis add to all that?
David Roodman: Over the course of, I think, well, a year and a half or so, I replicated and reconstructed most of the studies that look at the long-term impacts of deworming. So we’re talking about distributing pills, primarily in schools, to all kids, whether or not they have worms, because it’s just cheaper to give them and we believe the side effects are essentially zero, without actually testing whether they’ve got worms, and doing that, say, twice a year in areas where worms are endemic.
There are lots of studies of the short-term impacts on body weight, height, these kinds of things, within, say, six to twelve months. But many fewer of the long-term impacts. But for our cost-benefit analysis and thinking about whether to recommend deworming charities, of course, the long-term matters a lot. Effects over ten years are ten times as important as effects over one. And there, we’ve only got four or five studies.
So I’ve looked at most of those. Not all, yet. The big story is that I have undercut a couple of studies, but not the key one that has brought the most attention to this intervention and that we have been using in our cost-effectiveness analysis.
So 20 years ago, Ed Miguel and Michael Kremer co-authored a paper called Worms. They were economists, so it was in the economic literature, which was based on an experiment run in Western Kenya, of deworming. And they looked, initially, at short-term impacts, as you might expect, and they found that the dewormed kids went to school more. I don’t think their test scores improved, but school attendance jumped. And then they got more funding to follow up longer-term on the same kids, and they’re continuing to do that. I think we’re providing some funding for that.
And that’s really fantastic, just to be able to see the effects 10, 15 years out. And so one of the longer-term studies, I think, it goes out about maybe not quite 10 years, has been the key one in our cost-effectiveness analysis that you can find on our website. So it’s follow-up on the original experiment and the key question is, can we trust the original experiment? Was it a proper, clean experiment? And I tried really hard to take it apart, to attack it. But in the end I had to concede that the study won. One concern was that it wasn’t actually a randomized study. There were 75 schools in the study and they sorted them by, I think, province and then district, and then by the number of kids in the school.
So they had this sorted spreadsheet in Excel. And then having sorted them, they numbered them. We were actually three … it wasn’t a two-way experiment, it was a three-way experiment. Some kids immediately got deworming, some kids it was delayed a year and some kids it was delayed until the experiment was over. So then they numbered the list: one, two, three, one, two, three, right down. And that was how they assigned the groups. So they were assigning in part on how many kids were in school. Which is a little bit worrisome, and it wasn’t randomization.
So I tried to look at, is there any kind of what’s called statistical imbalance. Do these groups look statistically different? And I even went so far as to figure out exactly where these schools were, using some data that they accidentally made viewable on the internet and didn’t want me to see, and which I’ve kept confidential. Figured out their exact locations, and then used that along with Google Maps to figure out their elevations, which is actually important because elevation has a lot to do with how much … how bad the worm problem is where you are. Higher elevations are going to have less of it because they don’t flood as much, basically.
So I had a new variable that was external to the study and that I could look at, whether these three groups were statistically the same on this variable that was not something that the authors could have manipulated their results with awareness of. And I had to concede that even here, there really just wasn’t much sign of imbalance. So I came away, more or less saying I had to believe in the worms study and in the follow-ups that we use.
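The list-based assignment and the balance check on an outside variable can be sketched as follows. This is an illustration under stated assumptions, not the study's data or the test actually used in the replication: the school enrollments and elevations are randomly generated, the sort uses enrollment only (the province and district steps are omitted), and balance is judged with a simple permutation test on the gap between group means.

```python
import random
import statistics

random.seed(0)

# Hypothetical schools as (enrollment, elevation_m). Not the study's data.
schools = [(random.randint(100, 900), random.gauss(1250, 60)) for _ in range(75)]

# Sort the list, then number it 1, 2, 3, 1, 2, 3, ... down the page.
ordered = sorted(schools, key=lambda s: s[0])
groups = [(i % 3) + 1 for i in range(len(ordered))]


def spread_of_group_means(values, labels):
    """Largest minus smallest group mean of `values` across the labels."""
    means = []
    for k in sorted(set(labels)):
        members = [v for v, g in zip(values, labels) if g == k]
        means.append(statistics.mean(members))
    return max(means) - min(means)


elevations = [s[1] for s in ordered]
observed = spread_of_group_means(elevations, groups)

# Permutation test: how often does a random relabelling of the schools
# produce a gap between group means at least as large as the observed one?
NUM_SHUFFLES = 2000
shuffled = groups[:]
count = 0
for _ in range(NUM_SHUFFLES):
    random.shuffle(shuffled)
    if spread_of_group_means(elevations, shuffled) >= observed:
        count += 1
p_value = count / NUM_SHUFFLES

print(f"observed gap in mean elevation: {observed:.1f} m")
print(f"permutation p-value: {p_value:.3f}")
```

A small p-value would suggest the three groups differ on elevation more than chance relabelling would explain. A large one is consistent with balance on that variable.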
However, there’s been a few other studies … well, I’ll talk about one other study that was also reinforcing our faith in deworming, which I have now come to strongly question. It was not randomized. I think I mentioned it already, in fact. It’s by Hoyt Bleakley, of deworming in the American South. The reason it was compelling was that he seemed to show some very sharp jumps over time in schooling rates of kids after the campaign, and then also when they reached adulthood, he seemed to show some nicely and sharply-timed increases in their earnings.
That lined up really well with the research from Kenya, where we saw the same thing. Higher school attendance quick, right away, and higher earnings long-term. And the particular way it was done with this kind of seeming sharp jumps also made it pretty convincing even though it wasn’t a randomized experiment.
But with the help of research assistants, I rebuilt the original census data, and actually there’s data from other sources as well. Big project because he took a lot of data from hundred-year-old books, and had to be typed in manually, and there was one data point for each county in the South. Like a thousand counties. Lot of work to pull the data together.
And I, in my closest replication, I just didn’t see a clear sign of a sharp jump that appeared consistently through the different runs. And then when we expanded the data set, because more and more census data is being digitized, so maybe he had only a, I don’t know, 1% sample for 1910. Now we have a 100% sample.
When I expanded the data, any suggestion of a sharp jump where we would expect it if the campaign was the cause, was further smoothed out. And so what it looks like is that there was long-term convergence both within the South and between the South and the rest of the country on outcomes such as amount of time spent in school and adult earnings. But nothing, no sudden jumps that would be easily attributable to the deworming campaign.
Robert Wiblin: So how did you get the data you needed to do these replications?
David Roodman: In that case, it was hard work of … I did some of it, and research assistants did other parts of it, of hunting down old books. Some of them were in Google Books. Some of them were in my neighborhood library, which is the Library of Congress. Which is very fortunate. And we just had to scan, photograph, type in, do error checking where we could.
And then the census data comes from a fantastic project called IPUMS. I-P-U-M-S, which, I won’t try to figure out what it stands for, where they are digitizing more and more census data, not only from the United States but from other countries, and providing a really great interface that allows you to choose the years you want and the variables you want and download it.
And that’s all built … brought together in a Microsoft SQL Server Database. And then from there, once the actual data tables that we need for analysis are synthesized, that is then exported to Stata for analysis.
Robert Wiblin: So it sounds like GiveWell’s support of deworming then falls mostly on just one paper from the ’90s? That sounds concerning, right?
David Roodman: Yeah, I worry about it. What’s happened as a result of my scrutiny is that our research base, which seemed kind of reassuring (I haven’t talked about all the studies; there are two or three others), has thinned. And so we’re basically relying on this one experience in Western Kenya. And the question is, “What do you do with that?” Right?
My impression is that in the public health world, world of medicine, you erect certain threshold tests, and so you say one study is not enough, or p-value on this study is not below .05, therefore we reject it completely. So we’ve gotten into some debates with people who come out of public health, especially in the UK, who just think we’re crazy. “How can you recommend this intervention if you’ve got one study that’s saying it’s got positive effects and others that say it’s indistinguishable from zero?”
I think a correct formal answer, I’m not sure if it’s ultimately practical, is that we need to be Bayesian about it. We’re in a situation where indisputably, the evidence is weak. I think the definition of weak evidence is that your priors matter, all right? If the evidence were compelling, it almost wouldn’t matter what we thought before we came into the experiment, and we’re not in that situation.
But suppose we draw some bell curves representing our general understanding of the impact of deworming. For the worm study in Western Kenya, that bell curve would be to the right of zero. A little bit of the tail would be on the left of zero, which would mean it’s probably got a positive impact. And then we could combine that with the other studies that are producing bell curves that are centered around zero. Then we might bring in our own prior, based on what we know about the benefits of childhood interventions, generally, from other research that’s not about deworming, and then we could fuse that together. We might get an overall estimate, represented by a bell curve, with some spread to represent our uncertainty, which, who knows? Might have 20% or 30% of its weight to the left of zero, depending how you do it. There’s [inaudible 01:11:58] ways to do it.
And so we would say, our best central estimate is that this is doing good, but we are not hyper-confident of it. And that’s an uncomfortable position to be in, but if we’re true expectation maximizers, if we’re being rational about this, then we should still favor the intervention.
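One simple way to fuse bell curves like this is a precision-weighted average of normal distributions. The sketch below assumes the three sources are independent and normally distributed, and every number in it is made up for illustration. None is an estimate from the studies or from GiveWell's analysis.

```python
from statistics import NormalDist


def combine_normals(estimates):
    """Precision-weighted combination of independent normal estimates.

    `estimates` is a list of (mean, standard_deviation) pairs.
    """
    precisions = [1.0 / sd ** 2 for _, sd in estimates]
    total = sum(precisions)
    mean = sum(m * p for (m, _), p in zip(estimates, precisions)) / total
    return NormalDist(mean, total ** -0.5)


# Hypothetical effect sizes in arbitrary units, as (mean, standard deviation).
kenya_study = (0.12, 0.10)    # bell curve mostly to the right of zero
other_studies = (0.00, 0.10)  # bell curve centered on zero
prior = (0.05, 0.20)          # weak prior from other childhood interventions

posterior = combine_normals([kenya_study, other_studies, prior])
prob_negative = posterior.cdf(0.0)  # weight to the left of zero

print(f"combined estimate: {posterior.mean:.3f} +/- {posterior.stdev:.3f}")
print(f"share of weight below zero: {prob_negative:.1%}")
```

The combined mean stays positive while a meaningful share of the weight sits below zero, which is the "best central estimate is positive, but not hyper-confident" position described above. Different inputs or a different combination rule would move that share.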
Robert Wiblin: Have you scrutinized the experiments that find that there’s not much impact and come up with any possible explanations for why that’s the outcome?
David Roodman: I’m not aware of long-term studies that didn’t find much impact as originally published. Now, maybe there’s some publication bias there. A couple of them, because of my scrutiny, I now read to be saying that. There are lots of short-term studies that find little impact, and no, I have not looked at those.
Robert Wiblin: Did you get a lot of people to check your work, given how contentious this question has been?
David Roodman: Not a lot. However, I always send it to the original authors and all the data and code are posted online. So I hope that people are getting into it. At least, have the opportunity to do so.
Robert Wiblin: If your conclusion about deworming is wrong, what do you think will be the most likely reason? And I guess in this case, wrong would be that it’s clearly good or clearly bad.
David Roodman: I think the thing that I worry about most is that there’s some kind of selection bias. Maybe there are 10 plausibly good interventions out there, that are like deworming, and all have been the focus of similar research. And just by chance, this is the one that got the nice p-value in the original study. Or, to be more precise about it, maybe just by chance there was some true imbalance in the original experiment. Which can happen. You know, that’s generating all these results, and so then we’re gravitating to the one thing that just by chance, is looking good. That’s what I worry about.
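The worry can be put in numbers with a small Python sketch. It assumes, purely for illustration, ten independent interventions that all truly have zero effect and a conventional 0.05 significance threshold, then computes the chance that at least one of them produces a "nice p-value" anyway. The function name is made up for this example.

```python
def prob_at_least_one_false_positive(n_interventions, alpha=0.05):
    """Chance that at least one of n independent null effects passes the alpha threshold."""
    return 1.0 - (1.0 - alpha) ** n_interventions

# Ten hypothetical interventions, none of which truly works.
chance = prob_at_least_one_false_positive(10)
print(f"{chance:.3f}")  # about 0.401
```

Under those assumptions there is roughly a 40% chance that some intervention in the group looks good by luck alone.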
Robert Wiblin: Why do you think this question has been so contentious and … why don’t people mostly agree with you that kind of … with low confidence, we can say it’s maybe positive?
David Roodman: Yeah. That’s a good question. I think part of it’s because it’s been studied by people in two different tribes: health and economics. And they bring to it different priors about standards of evidence, which maybe can be viewed as different Bayesian priors, and those priors seem so right to them that they can’t understand the other side.
Reminds me of … I’ve been reading recently on moral psychology, you know, the book by Joshua Greene where he talks about how we’re evolved for cooperation within tribes in order to compete with other tribes. One of the problems is that different tribes have different concepts of what is right and wrong, and cannot see eye to eye no matter how hard they try.
Robert Wiblin: So is the issue here that the medical tribe is just more skeptical that any treatments work?
David Roodman: Maybe so. I try to give them the benefit of the doubt and imagine that their norms are evolved for a world in which powerful medicines typically do have side effects. In which research may be funded by drug companies, and therefore needs added skepticism. It may be part of that.
Robert Wiblin: So do you think … were people convinced by what you wrote? Are we getting close to the last word, at least, on these old papers?
David Roodman: I know for a fact that the leading public health skeptics of deworming were not convinced by what I wrote. I really would like to try to do some … I feel the impulse to do some kind of formal Bayesian analysis like I describe. Let’s state a prior, let’s synthesize the evidence that we have from different sources, including the stuff that says it’s indistinguishable from zero. Let’s come up with our best estimate for distribution. Let’s not impose a senseless .05 test, since that’s arbitrary. And let’s ask ourselves, “What looks more likely? That it’s helpful or not?”
Robert Wiblin: How hard is that to do?
David Roodman: That’s a good question. I had a conversation just a couple days ago with Ozzie Gooen, if I’m saying his name right. Who created Guesstimate which is a wonderful tool for trying to do computations and explicitly represent your uncertainty. And it’s motivated by exactly this. Can we use his tool or add to it or create some other kind of tool so that we could articulate some of these ideas better than we’re currently doing with our cost-effectiveness analysis in Google Sheets.
It’s tough because it can require a lot of number crunching power just to do simulations, Monte Carlo simulations, and capture the uncertainty. And at the same time, we don’t want to create a black box, because you’re … [inaudible 01:16:14] about Google Sheets is that everybody can see what we’re doing. And good coders are expensive. So, it’s not clear to me exactly what we’ll pull off. I’m hoping we can do something.
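For readers unfamiliar with the idea, here is a minimal Python sketch of a Monte Carlo simulation that carries uncertainty through a cost-effectiveness calculation. It is not GiveWell's model or Guesstimate's implementation. The formula, the input distributions, and every number are assumptions invented for illustration.

```python
import math
import random
import statistics

def simulate_benefit_per_dollar(n_draws=20000, seed=0):
    """Draw uncertain inputs many times and compute the output for each draw."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        income_effect = rng.gauss(0.06, 0.08)         # hypothetical proportional gain, can be negative
        baseline_income = rng.uniform(300.0, 700.0)   # hypothetical annual income in dollars
        cost_per_child = rng.lognormvariate(math.log(0.5), 0.3)  # hypothetical cost in dollars
        draws.append(income_effect * baseline_income / cost_per_child)
    return draws

draws = simulate_benefit_per_dollar()
cuts = statistics.quantiles(draws, n=20)  # 19 cut points: 5th, 10th, ..., 95th percentiles
low, high = cuts[0], cuts[-1]
share_negative = sum(d < 0 for d in draws) / len(draws)
print(f"median={statistics.median(draws):.1f}, 90% interval=({low:.1f}, {high:.1f})")
print(f"share of draws below zero={share_negative:.2f}")
```

Instead of a single point estimate in a spreadsheet cell, the output is a whole distribution, so the spread and the share of draws below zero are visible directly.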
Robert Wiblin: Let’s zoom out and talk, perhaps, about academia and research reliability in general. So, most listeners will be well-aware of criticisms of research that’s done in the social sciences and universities and publication bias, p-hacking and so on. I guess you’ve got a lot of experience trying to get reliable conclusions in spite of all of these issues. Do you think these problems are kind of overrated, appropriately rated, or underrated, in terms of their magnitude?
David Roodman: I think as a group, they are a big concern. As I think I said earlier, I have found that pure replicability has not been a big issue. Which is something that’s sometimes emphasized. But when I look more closely than a normal peer reviewer looks, I often find problems, and about 50% of the time, in a very small sample in my history, I’ve ended up questioning the conclusions.
I’ve wondered what should be done about this, because I think a way of putting it is that the level of review that a published paper has to go through in order to get published is not optimal from a social point … a societal point of view.
Robert Wiblin: Think it’s too easy to get published?
David Roodman: Yeah. You know, if you think about, “What is the value of the reviewer’s time that goes into looking at a paper at a top journal?” Maybe three reviewers look at it for a couple hours, whatever. So that’s going to be measured as a four-figure sum. Maybe $1,000, $5,000.
But a paper in principle could have implications measured in billions or trillions of dollars, and that doesn’t seem optimal, given what we know about the potential unreliability. But it also doesn’t seem realistic to expect publishers of journals to be investing ten times more in review. I mean, they’re, I’m assuming their margins are pretty tight and they just can’t do that.
So the question is, if this is societally valuable, and they can’t afford to do it, who’s going to do it? And it might mean that we need new institutions or new kinds of processes outside of academia, or inside, that are funded by major decision-makers. Maybe GiveWell, or Open Phil, or other philanthropies, or government agencies need to be more routinely funding this kind of scrutiny of published research before using it.
Robert Wiblin: Seems like even a bigger problem than the money is the fact that from an academic’s point of view, if they’re peer reviewing papers, they’re not publishing and their career’s not moving ahead. They run the risk of getting fired, basically. You can’t build a career out of doing peer review, I can’t imagine.
David Roodman: That’s exactly right. So that’s a disincentive within academia to invest much in review, essentially.
Robert Wiblin: I actually don’t really understand why they spend so much time doing it given that there seems like there’s very little reward.
David Roodman: If a particular paper overlaps well enough with a person’s interest, they’re still going to be … probably going to be passionate about supporting or disagreeing with it or what have you.
Robert Wiblin: Yeah.
David Roodman: Most academics are passionate about what they do and that carries over.
Robert Wiblin: Do you think research reliability’s getting better as people become more concerned about these issues? Or at least, I mean. I don’t know whether they actually are becoming more concerned about these issues, but I perceive that there’s more talk about it than there used to be.
David Roodman: Well, I guess my empirical answer would be that I don’t know. I don’t have enough data with a time trend. There absolutely is a trend towards requiring data and code for studies to be posted publicly.
Robert Wiblin: And pre-registration, at least, in medicine.
David Roodman: Yeah, and pre-registration is starting to happen. That’s influenced me. And I think that those are good trends. They’re far from universal. Something that I’ve recently come to appreciate, actually, doing the reconstruction of the Bleakley hookworm study that we talked about before, is that it’s one thing to post the data that you fed into your statistical analysis. It’s quite another to post the raw data that you collected from many sources, because that data gets processed and rearranged, aggregated, what have you. And that is … it’s much more rare to see that be made available.
But, the processing that you go … you apply to go from the raw data to the data that you feed into the analysis actually is analysis, and can contain problems, or, you know, hide issues, what have you. And so it also should be scrutinized.
Robert Wiblin: Do you have any other ideas outside of peer review for making research more reliable? Seems like, especially the incentives to get [inaudible 01:34:42] to get right here.
David Roodman: There may be things that a funder could do to create fellowships and awards for the kind of work that we would like to see. Small things like that that help … that generate momentum in a field.
Robert Wiblin: What about people doing careers and kind of being data vigilantes, where they do what you do whenever they see a paper that they think is getting media attention but they don’t really believe, and then kind of make a splash by saying, “No, actually, this is wrong,” and perhaps putting a bit of fear into paper publishers that if they write something too dodgy they’re going to get caught out.
David Roodman: Yeah. I share the spirit. I hope that they will pre-register and publish whether or not they have a gotcha finding.
Robert Wiblin: Oh, okay.
David Roodman: And, if they find they can make a career of it, that’s good. I argued before, my impression is it’s hard to do that because you alienate … you can alienate a lot of people whose friendship and support you may need at some point. And if there’s anybody really good at that, I’d like to meet them, because I could use some help here.
Robert Wiblin: I guess that’s one reason why these people often seem to be outside of the fields that they’re criticizing.
David Roodman: Yeah.
Robert Wiblin: The people who have got the statistical training elsewhere and then they … “Oh, this psychology thing, I don’t believe that.”
David Roodman: I mean, I should say, I think there are ways of being a critic that are … you can minimize the degree to which you piss people off. It depends on how you write, how you handle yourself.
Robert Wiblin: So, what are you researching at Open Phil these days? Are you working on a new replication or something else?
David Roodman: No. I should mention, I just feel like, for completeness. After I did the Bleakley hookworm paper, I also did the Bleakley malaria eradication paper. He used similar methods, similar data sets. He found that after malaria eradication campaigns in the United States in the 1920s, in Brazil and Colombia and Mexico in the 1950s, there were significant gains, long-term, for people’s earnings. For people who came from malaria-endemic regions.
And those results mostly stood up under my replication. I think it’s appropriate for me to mention that not only did I undercut one study but I mostly validated another. So that’s my most recent replication. We’ve submitted versions of those to journals and there’s versions in the public domain.
Now, I’ve switched to something quite different which I’m really enjoying, and it’s all very preliminary, so I don’t feel like I can speak about it very wisely. We are having some good discussions internally, just trying to really think through the key logic behind our choices of what to fund. What we call cause prioritization. I think we’re up at about the $200 million per year level now, and could very well go higher, you know, get on the scale of Dustin Moskovitz [inaudible 01:37:23] his fortune.
But that’s … if and when we do scale up, that really means that we’re no longer in pilot mode. We’ve really got to make some firm decisions about how much we want to put into catastrophic risks versus GiveDirectly versus whatever, versus animal welfare. So we’re having some nice internal discussions about that, which I had never been very involved with, but I am now trying to participate in more, which is leading me to listen to a lot of your podcasts. I’m a big fan.
Robert Wiblin: Excellent.
David Roodman: I’ve been thinking more about moral philosophy than I ever have before. I listened to Nick Beckstead and Will MacAskill and Toby Ord. So, I’m thinking about these kinds of philosophical questions of how much you weigh animals versus people, and the far future versus the present. And then I think we’ll shift pretty soon to trying to do a critical review of the AI safety question. That’s a big potential focus for us is trying to minimize the harm from artificial intelligence.
Robert Wiblin: Do you think your training in, I guess, more concrete empirical work helps you very much with this question?
David Roodman: Not a lot, no. It’s a general … I mean, of course there are stylistic similarities in whatever I do. I want to dig down deep. If somebody cites Hume, then I want to go read a bit of Hume. I want to hear different views and try to synthesize. And I’m also bringing in Occam’s razor where I can, to try to find the simplest way of reconciling the different things, ideas.
Robert Wiblin: So, do you think Open Phil and the effective altruism community have, in general, been focusing on the right areas? Or, I guess, if you were running the show, might you have done things differently?
David Roodman: I don’t have an answer to that yet. It’s too important an area.
Robert Wiblin: Too early.
David Roodman: Yeah.
Robert Wiblin: Yeah. I guess, what do you make of effective altruism in general?
David Roodman: I’m a bit guarded by the label. About the label and the phenomenon. I don’t consider myself an EA. I think, maybe unfairly, I’m put off by a literal reading of the term. Effective altruism seems presumptuous to assert that what you do is effective, and implies that what other people are doing is not effective. And that creates a lot of antibodies. Especially given how young most EAs are, it comes off as a kind of a Young Turk thing. You know, can feel a bit arrogant. “We figured it out.”
I’m very much in favor of empiricism. That might be the one -ism that I subscribe to, and I think it’s fantastic that there’s a whole movement, mostly of young people, who are thinking really hard, rigorously, and in a self-challenging way, about what they can do with their lives and how they can do the most good. That’s absolutely fantastic, labels aside.
I do sometimes worry that there’s more intelligence than wisdom in the movement, you know? I’m not even sure if I could articulate that for … maybe you know what I mean?
Robert Wiblin: It’s, what, a lot of potential but maybe not a lot of experience?
David Roodman: Yeah, it’s … you know, whatever wisdom is, it comes from exposing yourself to lots of different thought domains and ways of thinking and problems, and … with openness and humility, you know, ready to question whether the particular kinds of … particular analytical tools that you’ve taken on will still work when you move to a new area.
Robert Wiblin: Yeah, it’s interesting that you bring up the presumptuousness or arrogance of the term effective altruist. People have worried about that from the very beginning, I guess, in 2012. I’ve tried to push “aspiring effective altruist” or “member of the effective altruism community” to be … to not presume that, in fact, you are being effective. But they have more syllables, so they tend to just never really take off. The shortest term usually wins, even if it’s less accurate or more annoying.
David Roodman: Yeah.
Robert Wiblin: So, how can other people develop research skills like yours? Particularly, I guess, if they’re working on their own. Perhaps they’ve already graduated and they’re trying to pick these skills up later?
David Roodman: Well, I would say that just because I did it without going through a degree program, that doesn’t mean you have to do it that way. If you know in advance that’s what you want to do, great. Maybe I already said that.
You need to have a foundation in statistics, and if not that, mathematics. I have found that replicating existing work is a great way to learn stuff, and there are more and more studies that you can download and … whose data and code you can download and start playing with.
And then you need to be ready to read the help manual, and when you hit something that you don’t understand, recognize that and stop, and then figure out how you’re going to educate yourself about a key idea when it seems essential. And then, Wikipedia can help, and you can follow the notes on Wikipedia to sources, and … it’s an iterative process. You can also take classes. I’m sure there are lots of classes on this stuff.
I mean, more broadly, this just reminds me of another fundamental thought. Doesn’t really answer your question, which is … you know, I do sometimes get asked, “What should I study in school?” And I have come to the view that, to the extent that you can manage it, it feels true to who you are, it is better to study technical subjects. Maybe that contradicts what I said before about acquiring wisdom.
But school, at least in the United States, can be terribly expensive, and it’s not as clear what you’re getting from it, but sometimes you can get that … it can be hard to duplicate without being in a formal program. It’s just the discipline that’s imposed upon you to learn to pick up a technical skill, because you’re going to be tested. Whereas I feel like you can always read history and literature on your own. Not to mention historical analysis and literary criticism.
Robert Wiblin: Yeah, that’s interesting. Why is it that that’s easier to do later on? [inaudible 01:42:39] is it just more engaging to most people to read history, than to do maths themselves?
David Roodman: Well, it might get to the slow brain, fast brain distinction. We’re pretty well wired to hear stories. Comes very naturally to us. And history and literature are a little bit like that. We’re not wired, most of us, to code or do mathematics, and so that’s our slow brains and it’s hard work, and sometimes we need the discipline of the structured program to push us along, start to automate some of those skills.
Robert Wiblin: Yeah. Do you think you can learn most of what you need to know by doing this work on your own? Or does that add a lot to be a part of an institution other than university? I more mean like, working at Open Phil, obviously, you have lots of smart people around you. You can run things past them and potentially learn a lot faster.
David Roodman: I work pretty solitarily, so I don’t learn much about the technical stuff from my coworkers. But I do get inspiration and motivation, which is probably essential. It’s one thing for me to say, “I’ll go replicate a study.” It’s another thing for me to be in the privileged position of feeling useful and knowing that what I come up with is wanted, and that really helps me push ahead.
Robert Wiblin: What kind of sub-fields of statistics do you think are most useful for people to learn, and perhaps underrated? Not enough people dive into them?
David Roodman: You know, the field is changing, so it seems like in the last five or ten years there’s been a lot of influence coming in from big data and machine learning. So there’s a new set of techniques … the kinds of things that are used to analyze what makes you click on an email or not are now being transferred to economics to analyze very large data sets.
So probably that’s part of the cutting edge that you’d want to be on top of? My self-education in Bayesian analysis has been pretty ad hoc and ragtag and I probably would have benefited from a more systematic introduction to that. That seems like a good thing to get on top of, partly because those techniques are increasingly useful and practical. Partly because it gives you a broader understanding of what you’re actually doing whenever you do statistics.
Robert Wiblin: Is that kind of big data science used that much in social science? Or is that still kind of classical statistics a lot of the time?
David Roodman: Classically, you can never have more parameters that you’re trying to estimate in your model than you have data points, because then there’s an infinite number of possible imperfect solutions of your … perfect matches between your model and the data. But the modern techniques will try to push all your parameters toward zero except the ones that seem to be most relevant. So you can actually have more potential [explanators 01:45:47] than you have data points. You can throw lots and lots of things in. And then ask it to find a model that balances explanatory power with parsimony [crosstalk 01:45:56].
Robert Wiblin: To avoid overfitting.
David Roodman: Yeah. That comes out of big data, as far as I understand.
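One widely used technique of this kind is the lasso, which adds a penalty on the absolute size of each coefficient. The interview does not name a specific method, so treating the lasso as the example is an assumption. Below is a minimal pure-Python sketch using coordinate descent on simulated data with 50 candidate predictors but only 20 observations. The data, the penalty level, and all function names are hypothetical.

```python
import random

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 if it would cross zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_objective(X, y, beta, lam):
    """Mean squared error over 2, plus lam times the sum of absolute coefficients."""
    n = len(y)
    rss = sum((t - sum(x * b for x, b in zip(row, beta))) ** 2 for row, t in zip(X, y))
    return rss / (2 * n) + lam * sum(abs(b) for b in beta)

def lasso_coordinate_descent(X, y, lam, n_sweeps=200):
    """Minimize the lasso objective by updating one coefficient at a time."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_scale = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_scale[j] == 0.0:
                continue
            # Covariance of column j with the residual, adding back j's own contribution.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / col_scale[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta

def make_data(n=20, p=50, seed=1):
    """Simulated data: only the first two of p predictors actually matter."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [3.0 * row[0] - 2.0 * row[1] + rng.gauss(0, 0.1) for row in X]
    return X, y

X, y = make_data()
n, p = len(X), len(X[0])
# Smallest penalty at which every coefficient is pushed to exactly zero.
lam_max = max(abs(sum(X[i][j] * y[i] for i in range(n))) / n for j in range(p))
lam = 0.3 * lam_max
beta = lasso_coordinate_descent(X, y, lam)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print(f"{len(selected)} of {p} coefficients are nonzero:", selected)
```

The penalty is what makes the problem well posed even with more predictors than observations. Choosing the penalty level, usually by cross-validation, is the step that guards against overfitting, and it is left out of this sketch.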
Robert Wiblin: Are they good enough at avoiding overfitting? I mean, I learned the old methods so hearing about that just sends shivers up my spine.
David Roodman: I don’t have personal experience with them so I don’t know. I’ve looked … I’ve read a bit about the methods and they seem appealing to me. They’re fairly elegant.
Robert Wiblin: Yeah. I mean, I guess if they’re using it to design websites and stuff, you’d think that it had a pretty good idea. And they’d be able to even test, then, whether the methods are working.
David Roodman: That’s right.
Robert Wiblin: What about learning to use Bayesian methods? Are they kind of taking off? And have you had to learn any of those?
David Roodman: I think Bayesian methods are becoming more important in econometrics. Partly because applying them requires a lot of computational power, and computers are getting more powerful. And partly it’s the influence of, I think, big data applications which also use them, and so they’re … that’s starting to influence other areas of practice.
I’ve come to understand pretty late some of the key Bayesian ideas, and I’m still figuring it out. What I’ve come to understand is that a classical statistical study doesn’t actually do what you want it to do. What we want a study to do is say, “Given what we observe in the real world, like in our experiment, here is the probability distribution for the impact of deworming.” But actually research doesn’t tell you that. It does not go from the data that’s out there to the distribution for the impact. It’s actually the other way around. It says, what the research can tell us is, “If the impact is zero, here is the probability that we observe the data that we actually do. If the impact is that it increases income 10%, here is the probability that we observe the data that we actually do.”
So they call it the arrow … the logical arrow there is going from possible values of the true impact to a distribution for the data. It’s backwards from what we want.
Robert Wiblin: One of my colleagues asked me recently, “If I speak to this many people and I get this number of people who report that they could change their career plans, what will be the distribution of the probability of changing someone’s careers in each instance?” I was like, “Well, I can tell you this other useless thing, but I can’t tell you the obvious thing that you’re asking me.” Or, I could, but it would be a lot of work.
David Roodman: Yes, yes. So we get likelihood functions, as they’re called. And to go from a likelihood function to what we actually want, which is the other way, what Bayes’ law shows is that you need a prior. You need to make some assumption a priori, which could come from other evidence, about the distribution of the thing that you’re actually interested in, and that’s an important understanding of the nature of reasoning with statistics. And it matters more when the evidence is weak.
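The survey question Rob mentions makes a compact worked example. The Python sketch below takes a hypothetical result (3 of 20 people report changing their plans), writes down the binomial likelihood, multiplies it by a prior, and normalizes on a grid to get a distribution for the underlying probability. Both priors and the survey counts are assumptions made up for illustration.

```python
import math

def binomial_likelihood(k, n, p):
    """Probability of seeing k successes in n trials if the true rate is p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def grid_posterior(k, n, prior_pdf, grid_size=1001):
    """Bayes' law on a grid: posterior is proportional to likelihood times prior."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    unnormalized = [binomial_likelihood(k, n, p) * prior_pdf(p) for p in grid]
    total = sum(unnormalized)
    return grid, [w / total for w in unnormalized]

def posterior_mean(grid, weights):
    return sum(p * w for p, w in zip(grid, weights))

def flat_prior(p):
    return 1.0             # every rate equally plausible beforehand

def skeptical_prior(p):
    return (1 - p) ** 9    # proportional to Beta(1, 10): low rates favored

k, n = 3, 20  # hypothetical survey: 3 of 20 people report a change
mean_flat = posterior_mean(*grid_posterior(k, n, flat_prior))
mean_skeptical = posterior_mean(*grid_posterior(k, n, skeptical_prior))
print(f"flat prior: {mean_flat:.3f}, skeptical prior: {mean_skeptical:.3f}")
```

The same data give different answers under the two priors, which is the sense in which priors matter when the evidence is weak. With a much larger sample the two posterior means would move closer together.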
Robert Wiblin: Are there any specific textbooks or resources that you particularly recommend to young people who are trying to get into similar work?
David Roodman: All I can say is I’ve never read a textbook from front to back. There’s some really excellent rigorous ones, beautiful elegant statements of all the theory. But I’ve learned by taking specific papers, struggling to understand them, going to the textbook to read just about whatever technique is being used. Going back, trying to implement, et cetera. It’s an iterative, exploratory process. There’s a lot to be said for learning stuff when you need to know it.
Robert Wiblin: What about the non-statistical and non-mathematical issues? When you’re replicating a project [inaudible 01:49:00], you could do a different statistical approach, or maybe there’s been a conceptual error in how things are being approached. With the case of the recidivism of early parolees. That’s not a mathematical finding. That’s because you understand the actual real world case that these numbers are describing. How did you learn to get good at that kind of work?
David Roodman: I think most of those insights emerged out of skeptical probing of the data. So, if I get a study that’s showing a really statistically significant result, and I’m able to reconstruct that result, what I then like to do is strip it down. So if there are ten control variables, I see what happens if I drop the controls. If data come from 100 countries, I see what happens if I can narrow it down just to 30. Try to isolate where it’s coming from in the most simple way possible, and then I graph it. And that allows me to get down to the bedrock of what the actual statistical pattern is, and visualize it.
And I found that sometimes when I try to really drill down, I discover things that lead to alternative explanations. It’s hard for me to provide the details, but that’s kind of what happened with the parole bias story. I got a core graph showing that there was just this really strong negative association over time between how long people were staying in prison, on average, and how quickly they got returned to prison. And it just seemed almost mechanical. I think that was the word. It was like, a law of nature, almost, in this particular data set.
Robert Wiblin: Something else is going on here.
David Roodman: It just felt like it was too good to be true. Too strong. And then I started trying to think about the mechanics of the situation. So that is an important idea. And I think good coders will recognize that too, because it’s about when you’ve got a bug, trying to drill down and make … get the most simple reproduction you can of the problem. Figure out what’s not relevant.
Robert Wiblin: So I’ve kind of developed this rule of thumb when I’m looking at papers. Particularly if I’m just looking at the abstract and the conclusion, to trust them more if the method seems really simple. And the more they’re using some complex statistical method, I’m just like … the more cutting edge it is, the more I’m like, “I don’t really believe this.” Do you endorse that?
David Roodman: I do. Yes. One reason that randomized studies can be so powerful is not just that they’re randomized, but that they can be … they’re so conceptually simple as a result. In the simplest case, you’ve got two groups. One got treated, one didn’t. You look at the average for each, and you see which average was higher, roughly speaking.
And my experience has been the black boxes often hide a lot of problems. And even cause problems.
Robert Wiblin: There’s other issues as well. It’s like, if it’s a cutting-edge technique, maybe people don’t know how to do it very well. Maybe there was a coding error because there’s lots of different steps. That kind of thing. And it also, you wonder like, “Why did you have to use such a complex method to get your result?” It’s like, “Did you just keep fishing? Using an ever more complicated method until you got something below 0.05?”
David Roodman: Yeah, I share your concern about what you might call specification mining. I’m less concerned about bugs, and such, because as I say, in general when I’ve managed to reconstruct, I get about the same answer. But I have found problems. Sometimes from complicated methods where the authors don’t fully understand what’s going on, because [inaudible 01:51:59] sense, nobody can. It’s too complicated. That the results will be less stable than they appear.
Robert Wiblin: And less scrutable.
David Roodman: Well, yes. I mean, a big example on my mind is, it used to be, before there were randomized studies of the impact of microcredit, there was a non-randomized study done of the Grameen Bank and some other microcredit programs in Bangladesh. It was extremely clever, lots of equations. Seemed to be very strong, was published in a top journal. And, with a lot of effort, I was able to write a program in Stata that would re-do it, and initially I made one key error that was causing me to get the exact opposite result. They were showing that microcredit reduced poverty. I was showing that it increased poverty. I couldn’t figure it out.
And there was an error in what I was doing, but the error shouldn’t have mattered that much, except that the actual estimation process was unstable, which I was actually able to prove eventually once I got a good match. There was actually a bimodal distribution: the data were compatible with microcredit reducing poverty and almost as compatible with the theory that microcredit was increasing poverty.
Robert Wiblin: I guess an exception to this rule that simple is good is just an observational epidemiological study of people who eat chocolate, and don’t eat chocolate, and then look at their health [inaudible 01:53:07]. In a sense that’s very simple. Just look at the population averages. It’s the simplest thing you could do. But it’s garbage. But I guess because you just have to look … a balance of complexity versus understanding what methods work in principle and which ones don’t.
David Roodman: Yeah. You know, oftentimes the complexity is motivated by the fact that you don’t have an experiment. So you know that there’s all sorts of potential biases and you’re trying to do smart things to remove those sources of bias, but then that creates opacity.
Robert Wiblin: Do you wish you’d done formal study beyond your undergraduate degree? Or maybe even gone into academia? Or do you feel like you’ve done pretty well figuring things out on your own?
David Roodman: I can see it both ways. There was a long period in my life, from my crisis at Cambridge for more than ten years after, where I knew I had something to offer. I had this Harvard degree, but also felt a lot of self-doubt. That began when I failed all my exams and, you know, worried about how that would look and didn’t stay within an established track.
I think if I had known where I was headed, I could have gotten there faster. You know, I could have gotten to doing this kind of work faster. That said, if I had gone a conventional track, it might have just literally taken me on a different track, and what I’ve stumbled into is being able to contribute because I’m different. And I feel very grateful for that, and if I could live my life again, I would be very reluctant to risk losing that.
Robert Wiblin: Yeah, how much do you think that you’re adding value because you’re specifically outside of academia? That you don’t face the same incentives. You have different incentives, and you don’t have to do your own original research. You can, instead, focus on this replication and also getting actionable advice that a foundation can use rather than just having elegant methods?
David Roodman: Oh, I think that’s a big part of it. Yeah. If I were in academia, presumably I’d be responding. I’d be facing and responding to different incentives. And would have absorbed a different culture.
Robert Wiblin: Do you think more people should be doing what you’re doing? Is the balance out of whack?
David Roodman: Absolutely. I don’t know of many people who are doing what I do. But you just told me about these … data vigilantes.
Robert Wiblin: Data vigilantes.
David Roodman: I’d love to meet them.
Robert Wiblin: It’s a handful of people, so.
David Roodman: Okay. Yeah. I think we need more of it. I mean, that’s, I’m obviously biased, but it’s just striking, as I say, to overturn about half the things I look at and realize that we need more serious review if we’re going to rely on this stuff for making big decisions.
Robert Wiblin: How can someone get paid to do this, other than working at Open Phil? Are there other funders?
David Roodman: I think it may be hard to establish oneself, but I think that if one is working on policy-relevant stuff, increasingly there ought to be space at think tanks that are appropriate for whatever area you’re interested in.
I know that there are funders who are interested in the state of research and trying to improve it. And there are more and more journals publishing replications.
Robert Wiblin: So you mentioned think tanks and I’ll come back to that in just a minute. But we’ve talked here about, kind of, statistical training. But do you know of any other ways to develop just good, holistic judgment?
David Roodman: Well, it’s a skill that you learn through practice, and so the question is, “Where can you get a job where you’re asked to do it?” I think doing policy analysis, whether you’re doing it for a government agency or a think tank or a member of Congress, can be really good for that. Because any practical policy question has a dozen dimensions. Political, matters of equity, administrative realities. What’s important is to be able to hear out each of those perspectives and then in some way that’s unique to each situation, figure out how to move forward.
Robert Wiblin: I guess one question is, “How do you get feedback?” Such that if you’ve had bad judgment in a case, you know about it. In policy, feedback loops can be kind of weak. So that’s one reason that doing this at a school … or, doing anything where you actually get an answer about whether you were right or whether your predictions were correct is very good for developing judgment, but the situations can sometimes be rare.
David Roodman: That’s right. It depends, I think, a bit on what you’re try … what kind of policy you’re looking at. If you’re working on tax policy, you know, it can go out … and you actually succeed in making a change, then it may be ten years before there’s any opportunity to change it and the effects may be unclear.
If you are … I’m thinking of my wife, who spent many years working for Medicare and trying to come up with new contractual arrangements with doctors and hospitals to try to change the incentives in the healthcare system. She could get certain kinds of negative feedback very quickly. For example, if nobody signed up for a program. Or if, you know, doctors and hospitals just argued vehemently that something was impractical or what have you.
So when you’re trying to get other people to do something as part of your overall program for change, you can find out pretty quickly whether you’re succeeding then or not.
Robert Wiblin: Yeah. I’m just thinking, each week I get an email with new papers from NBER, which I think is The National Bureau of Economic Research. Suppose I could get someone to take those papers, hide the results and then get me to read the rest of it and try to guess what the actual outcome was. That would be one way of training your intuitions a lot faster than you otherwise would. Or, read the introduction and try to figure out what method they use or something like that.
David Roodman: I like that.
Robert Wiblin: Okay, let’s get back to think tank [inaudible 01:58:11]. So, you started your career with nine years at the Worldwatch Institute. Then I guess you went to Center for Global Development. What are the pros and cons of starting your career in a think tank?
David Roodman: The advantage is that you get pretty quickly exposed to an interdisciplinary world. Whatever problem you solve, you’re thinking about … you’re going to learn that there are many dimensions and good ways of thinking about historical, in terms of justice, in terms of administrative issues, et cetera.
And I can imagine that in academia you might come to that less naturally because the tendency in academia is to narrow your center of focus.
And, at least in the United States, I don’t know about in Europe, the think tank world is pretty fluid, so there are lots of [inaudible 01:58:53] organizations and over time you can make connections and move and rise and grow in lots of ways.
I guess a disadvantage of starting a career there is what I mentioned for myself before. Well, maybe I shouldn’t generalize for that. What I was thinking was just that, I know when I started in think tanks, I stayed there a very long time, and so I was removed from practical decision-making by being in government or philanthropy or business. I think at some point that began to cost me because I was less inspired about new areas to work in and perhaps less able to contribute for lack of understanding of practical realities.
Robert Wiblin: Yeah, so nine years is quite a bit longer than most people stay in their first jobs. What motivated you to stay at just one organization for that long?
David Roodman: Well, technically it was my second job. The first one was when I followed my girlfriend to Philadelphia. That was a year.
It was the sense that I was continuing to grow. The pattern at Worldwatch was that they didn’t hire PhDs, so that was a bit unusual for a think tank. So there was lots of opportunity to rise and gain … without a degree. Typically you would choose one topic and work on it for a year. And that suited me, so I worked on environmental taxation, [inaudible 02:00:01] the environment. I worked on Third World debt, and those led to monographs and one book.
So it was a long time before I felt like I wasn’t growing anymore. Because I was able to move to new topics, and because I was writing in different forms that were increasingly challenging.
Robert Wiblin: Yeah, I very often hear the career advice that early on you should move every couple of years. And I think if the job that you’re in early on is not a good fit, that’s definitely the right advice. But it does kind of surprise me that there isn’t more of a premium placed on expertise. Like, someone’s developed expertise in a particular area, and also the fact that they would have institutional knowledge and things like that. Seems a bit crazy to have organizations turning over almost all their staff every few years. Am I crazy? Because I feel like, if I was leaving a job every two years, I’d feel like … I would just be getting started, almost, by that point, because I’ve learned so much in the meantime.
David Roodman: I certainly would never recommend it as a blanket policy. But other organizations I have been in, it’s just clear from how things work that after a couple years, if you start as a junior person, you don’t get as much opportunity to grow, typically. That was the case at the Center for Global Development. That was a little bit more like an academic department in the sense that there were … almost all the fellows were PhDs, basically, except for me. And then there were research assistants, and they didn’t have much prospect of becoming full-time fellows without a lot more research or job experience.
And that might be a bit the case at GiveWell. There’s a fair amount of turnover at the junior level, and … [inaudible 02:01:22] obviously we have some long-time people. And I think that’s because organizationally, it’s kind of a broad pyramid. There’s a lot of work at the … I don’t like lowest, sounds like a pejorative level, but the level of where we’re checking in on organizations, researching new organizations, writing up conversations and so on. Which might start to get old after a couple years. It’s vital work, but moving to the next rung in responsibility, there’s many fewer positions. Just less needed, because it’s such a broad, flat pyramid.
So, it’s normal in this case, I think, for people to leave in order to grow.
Robert Wiblin: Yeah. Perhaps what’s more surprising is that it seems like the advice of people in the middle of their career is that, kind of, each time they change organization they can get a pay rise, and potentially quite a significant one. And, it seems odd to me that organizations aren’t willing to pay more to retain people who have experience. I suppose … an alternative thing would be that they’re learning things by hiring people from outside, so that’s creating an information transfer, and perhaps that’s what’s getting valued a lot. Do you have a view on that?
David Roodman: It’s a funny thing. It’s not something I’ve thought about much, nor experienced that much. My intuition is that maybe the search cost for a new employee is so high that when you’re making an offer, you want to reduce the probability they’re going to walk away.
Robert Wiblin: I see.
David Roodman: Because of, you know, a particular increment in salary. [inaudible 02:02:44] would be my guess how that works.
Robert Wiblin: So, coming back to the think tanks. The Worldwatch Institute and the Center for Global Development. Do you think you had much impact there, and do you think people in general can have significant policy impact in think tanks?
David Roodman: I don’t know if I’ve had much impact. You know, as we’ve discussed, my … I’m less practically-oriented, in a sense. I don’t have the concrete decision-making experience, so I’ve always … interested in broad ideas like whether environmental … taxing pollution is a good idea and how it should be done. Hopefully I have educated people, a lot of Worldwatch materials were used in classrooms. So I introduced people to new ideas, and I always see myself as a teacher. A pedagogue. But that’s hard to measure. But I’ve absolutely observed colleagues who had much more focused policy proposals have impact.
Michael Clemens, my former colleague at the Center for Global Development, played a really key role, I think, in opening up a particular work visa program to Haitians, after the earthquake there. Getting them another way to make money. Another colleague, Todd Moss, played a key role in bringing about a deal to write down about $30 billion of debt owed by Nigeria, which helped it get beyond the debt crisis.
Robert Wiblin: That’s a lot more money than you could have earned and given away.
David Roodman: That’s right.
Robert Wiblin: What fraction of responsibility do you think he deserves for that policy shift? 1%? 10%? .1%?
David Roodman: I should be careful what I say, I mean, if I might offend somebody who feels like I’m underrating their contribution. And in a sense, it’s not decomposable into a fraction. It’s more like several things had to happen, and if any one of them didn’t happen, then the whole thing might not have happened. But I believe that CGD was approached by the people in the Nigerian government for help to do some outside analysis, and he did a great job of it in that case.
I think we actually have a write-up on this. I forget if it’s public, but we did a pretty deep review of The Center for Global Development and its impacts.
Robert Wiblin: Yeah, I’ll stick up a link to that. Do you have a sense of how competitive it is to work at think tanks if you’re a recent graduate?
David Roodman: I think it’s … there’s supply and demand, right? I think a lot of recently-graduated PhDs are pretty leery of going into the think tank world, because it basically represents a permanent decision not to pursue an academic career. Because if you are going to spend, say, three or five years at a think tank or do some real work there, and then you decide, actually, you want an academic tenure-track position, then you’re three to five years behind in your life, in a sense.
So that narrows the demand for such jobs. The number of slots is probably not huge. The Center for Global Development, in the 12 years I was there, I think hired two newly-minted PhDs. Michael Clemens and Justin Sandefur. Now, I guess there are a few hundred think tanks, so maybe that’s a few hundred positions over a decade. So it doesn’t seem like a huge number, but I don’t know quite how the supply and demand balance out.
I think what distinguishes you in a competition is just your seriousness. If you have demonstrated you’re good and you demonstrate interest in the relevant subject areas, and you show that you’re willing to leave academia and do this, I think that right there, that puts you in a pretty small pool.
Robert Wiblin: For the last few years, 80,000 Hours has had working in a think tank as a pretty promising option for recent undergraduates [crosstalk 02:05:58] left their undergraduate degree. But I don’t know many people who’ve done it. Do you think that might be because it’s hard to get in, or maybe there’s some nearby options that are more appealing?
David Roodman: I don’t know. I experienced the side of the process after people had been hired. So I’m on the other side of the filter. Most of the research assistants that I saw come and go at CGD, I think had a pretty good couple years. That’s typically how long they would stay. And typically, then, went to graduate school. Some would go into US government, say, work for USAID.
Robert Wiblin: Yeah. You might not know much about these options, but I guess some nearby alternatives are kind of going into civil service directly, or working in Congress as a staffer. Do you have any view on whether that’s kind of a better or worse option than working in a think tank, or … I suppose, it might appeal to different kinds of people.
David Roodman: Sure, I think you can learn a huge amount. The thing with government is it has so much variability there. There are places that are completely stultified or maybe under attack from people at the top in the Trump Administration, and then there are other places … I believe where my wife worked, in Medicare, where there was a lot of innovation going on, and the work is almost automatically at a different scale.
You know, for a think tank, maybe 20 people is big. For a government agency, 20 people is nothing. And so you can quickly become part of something that’s really consequential and got a lot of interesting people and a lot of moving parts. And even a lot of flexibility in what role you can play, depending on what you have to offer.
Robert Wiblin: Yeah. There’s someone I know in the UK who managed to end up leading a think tank in their early 30s and was just very big in the media. Would be in newspapers on a weekly basis. So, I mean it was a fairly small organization, but most think tanks, as you say, are really quite small, so if you have star potential, then I think you can rise quite fast.
David Roodman: That’s right. And that might also go for activist NGO groups, you know, non-profit groups that are focusing on policy.
Robert Wiblin: In a normal think tank that’s not the Worldwatch Institute, how much can people advance without getting a PhD, and what exit opportunities do they have?
David Roodman: My experience is limited but my impression is that CGD is typical in that it’s very hard to advance without a PhD. I came into CGD at a middle rung between research assistant and full fellow, having been a so-called senior researcher in my previous job. So I just barely jumped over that stream. And that was unusual, and I think that that’s generally what you’ll find at other think tanks. I mean, there’ll be PhDs who are leading most of the research, and are, in a sense, at the top. There’ll also be people who are formerly-prominent officials in government and … but, depending on who you work for, and that’s something to figure out, you can still be given a major co-authoring role in whatever’s done if you demonstrate the capacity.
Robert Wiblin: Okay. So you mentioned writing a book. How did that go?
David Roodman: Well, I’ve written two books. One at Worldwatch, and one at the Center for Global Development. The second one, which was on microfinance, I think was a much more successful project. That was great. Took a long time. Early on, the communications director, Lawrence MacDonald and I, hit upon the idea of my writing the book in public. This is the microfinance book. So-
Robert Wiblin: Publishing drafts of it, you mean?
David Roodman: Yeah, the idea is … when I’ve got a draft chapter, put it online. This was in 2009. And it emerged out of a particular chapter where I kept feeling like I was making more discoveries about the history of microfinance, but, “I must be missing things. Why don’t I share this and people can fill in blanks?”
So we turned that into a blog. So I stumbled into blogging, and I quickly discovered it was way more fun than book writing. I have a natural letter writing voice when I blog, and it just … whereas when I write a book, I feel much more like I’m clearing my throat and standing at a podium before an audience, wearing a suit.
Robert Wiblin: I guess you get a lot more engagement as well, quicker. It’s a more immediately rewarding-
David Roodman: That’s right. So I found many days it was much easier to blog than write the book. So it slowed the book down, but it brought in a lot of attention to the project and I became much better known. And I got some good comments on the book, but … and it ultimately served the purpose of the book, which was to communicate my thinking to the community and beyond, the microfinance community and beyond it. So it was a very useful thing.
It happened that I started just as … a very eventful few years in the world of microfinance. There were some credit bubbles that exploded. The prime minister of Bangladesh went after Muhammad Yunus, the creator of microcredit in Bangladesh. Went after him personally. There was more. And so I became kind of a leading figure tracking these developments and interpreting them.
And then ultimately that fed into the book and made the book better, and the book included about 12 of my favorite blog posts. So.
It was a pretty unusual thing then, and still fairly unusual now, to have … to be writing a book in public. I’m tempted to say I was the first one to do it, properly defined, but that’s … probably there’s an exception out there.
Robert Wiblin: Did that book have policy impact? It seems like if a lot of people were reading it, it could have affected funding levels.
David Roodman: I think it was part of a process within the microfinance world of developing and communicating more realistic expectations of what microfinance could do. I wouldn’t say it did it alone.
Robert Wiblin: Do you want to give people a sneak peek of what the conclusion was on that one?
David Roodman: Oh, yeah. Well. So, microcredit had a reputation, partly because of this very complicated study that we talked about earlier that was non-randomized, of being a silver bullet against poverty. Especially when given to women. Very powerful set of messages. Actually, that was another important set of events that occurred while I was blogging, which was the first randomized trials came along and showed essentially zero impact on poverty.
But I came to appreciate, through one fantastic book, Portfolios of the Poor, that financial services, although they’re invisible, are really important. Being poor in a poor country means that your income is not only low but unpredictable. You have all sorts of risks that are uninsured. You don’t have health insurance, life insurance, whatever.
And so your financial life is much more about uncertainty than it is for people who have salaries. And one of the ways you manage that is by finding mechanisms, formal or informal, to turn your small and often unpredictable income streams, your income increments, into pots of money that are there in a case of emergency, which you can then use for getting your husband’s broken leg treated, what have you.
And most financial services can play that role. You can save for your kids’ education, because in most developing countries, the reality is you’ve got to pay for school. You can borrow, and then the need to make those monthly payments or weekly payments is helpful discipline for you, in effect. Sort of a retroactive way of making you save for schooling. In some cases you can get informal insurance. You can also get money transferred to you from your son in the United States or in the big city.
So this credit, insurance, savings and transfers are all financial services, but in a sense, they all help people solve the same problem, and what’s funny is that poor people have less money, and as a result, they actually need financial services more. So the project of creating self-sustaining institutions that can deliver services like that to poor people, to me, seems a fundamentally good one.
What I worry about is that the enthusiasm from donors and investors to support microfinance has created a strong tilt towards microcredit, and as we know, loans … credit is very dangerous when it’s given too enthusiastically. It can get the borrowers in trouble. It can create bubbles, and so on.
And so I came to favor microfinance as a big project but actually advocate for people putting less money into it, which would shift microfinance institutions more towards … away from taking money from investors in the United States and other places, and more towards taking savings from local people, which is both a useful service and an alternative source of finance.
Robert Wiblin: I think that’s insurance as well. Microinsurance?
David Roodman: Yes. To the extent it’s practical. There are a lot of difficulties in providing microinsurance, but there are also some nice encouraging examples of it.
So I came to feel that it was, by no means a silver bullet, but could do a lot of good for … in the long run, for fairly modest investments, if you can build self-sufficient institutions.
Robert Wiblin: Okay, I just wanted to flag for listeners that there was a bunch of other research that you’ve done that we could have talked about including changing murder rates in the US over the last couple of years, the effect of alcohol taxes on health, the effect of immigrants on wages, the Commitment to Development Index, which is … attempt to look at how policies other than just aid policy affect the developing world. And effects of development and in particular reducing infant mortality on fertility.
So I’ll stick up links. I’ll try to be pretty comprehensive and stick up links to all of these different projects if one of those topics grabs you.
So, are there any high impact job opportunities available at the moment that you think listeners might be able to fill? I suppose there’s some vacancies at the Open Philanthropy Project that I think … it’s possible that those will have closed by the time this episode goes up, although I think they’re planning on continuing to take applications on a kind of rolling basis.
David Roodman: Yeah, I don’t know the time of it. I know that we’re doing a big hiring push for research analysts on the Open Philanthropy side.
Robert Wiblin: So this, as a last question, are there any books or papers or podcasts you’ve listened to recently that were particularly memorable that you want to recommend to people?
David Roodman: Well, I listened to some great podcasts in this series. I encourage people to look at the earlier episodes. And on the suggestion of my colleague Luke Muehlhauser, I looked at … read a couple books recently on moral psychology, which is the scientific study of moral reasoning in humans.
And, you know, occasionally you read a book that changes how you see the world a bit. And these books were like that. I especially enjoyed … well, the one that’s by Jonathan Haidt. I’m blanking on the title. It’s not-
Robert Wiblin: Righteous Mind.
David Roodman: Righteous Mind. Yes. And that’s a great book. It’s very clear messaging, and then there’s one by Joshua Greene called Tribal Mind, I think. And I especially found Joshua Greene’s book exciting because he first talks about some of what we understand about how people think when it comes to moral questions. He emphasizes that we are evolved for cooperation within groups in order to compete with other groups. And a lot of what you might call social emotions. Embarrassment, vengefulness, so on.
Robert Wiblin: Loyalty.
David Roodman: Loyalty. Yes. Can be explained, you know, through evolutionary, maybe just-so stories, as fast brain devices that mold our behavior to improve our ability to cooperate within groups.
But the problem we face today is that many of our biggest challenges require cooperation between groups in order to solve them. And we don’t have these automatic mechanisms in place that just automatically make us want to cooperate across groups.
And he argues … then he moves from the empirical analysis of how people think morally to proper moral philosophy, and he argues for utilitarianism on pragmatic grounds as a common currency across tribes, across moral tribes. You may believe that there’s a right to life, I may believe that there’s a right to choice when it comes to abortion. That’s not an argument that ever can be resolved through reason, but we can both agree with the basic ideas of utilitarianism, that everybody’s happiness should count about the same, and that happiness itself is a really important thing, a good thing to maximize. And then he tries to analyze, for example, the abortion problem through a utilitarian lens. But the big point is, he’s not saying that, as I read it, that utilitarianism is the true morality, and he’s not saying that it’s a complete morality for any one person or any one tribe, but that it’s a way for tribes to work together to solve real problems, and I just find the whole set of ideas to be really intriguing and exciting.
Robert Wiblin: My guest today has been David Roodman. Thanks for coming on the show, David.
David Roodman: It’s been a pleasure.
Robert Wiblin: Just a reminder that you can get updates on all the articles we publish by joining our research newsletter at 80000hours.org/newsletter.
The 80,000 Hours Podcast is produced by Keiran Harris.
Thanks for joining – talk to you next week.