#103 – Max Roser on building the world’s first great source of COVID-19 data at Our World in Data

History is filled with stories of great people stepping up in times of crisis. Presidents averting wars; soldiers leading troops away from certain death; data scientists sleeping on the office floor to launch a new webpage a few days sooner.

That last one is barely a joke — by our lights, people like today’s guest Max Roser should be viewed with similar admiration by COVID-19 historians.

Max runs Our World in Data, a small education nonprofit which began the pandemic with just six staff. But since last February his team has supplied essential COVID statistics to over 130 million users — among them BBC, the Financial Times, The New York Times, the OECD, the World Bank, the IMF, Donald Trump, Tedros Adhanom, and Dr. Anthony Fauci, just to name a few.

An economist at Oxford University, Max Roser founded Our World in Data as a small side project in 2011 and has led it since, including through the wild ride of 2020. In today’s interview, Max explains how he and his team realized that if they didn’t start making COVID data accessible and easy to make sense of, it wasn’t clear when anyone would.

But Our World in Data wasn’t naturally set up to become the world’s go-to source for COVID updates. Up until then their specialty had been in-depth articles explaining century-length trends in metrics like life expectancy — to the point that their graphing software was only set up to present yearly data.

But the team eventually realized that the World Health Organization was publishing numbers that flatly contradicted themselves, most of the press was embarrassingly out of its depth, and countries were posting case data as images buried deep in their sites, where nobody would find them. Even worse, nobody was reporting or compiling how many tests different countries were doing, rendering all those case figures largely meaningless.

As a result, trying to make sense of the pandemic was a time-consuming nightmare. If you were leading a national COVID response, learning what other countries were doing and whether it was working would take weeks of study — and that meant, with the walls falling in around you, it simply wasn’t going to happen. Ministries of health around the world were flying blind.

Disbelief ultimately turned to determination, and the Our World in Data team committed to do whatever had to be done to fix the situation. Their software was redesigned overnight to handle daily data, and for the next few months Max and colleagues like Edouard Mathieu and Hannah Ritchie did little but sleep and compile COVID data.

In this episode Max explains how Our World in Data went about filling a huge gap that never should have been there in the first place — and how they had to do it all again in December 2020 when, eleven months into the pandemic, there was still nobody else to compile global vaccination statistics.

We also talk about:

  • Our World in Data’s early struggles to get funding
  • Why government agencies are so bad at presenting data
  • Which agencies did a good job during the COVID pandemic (shout out to the European CDC)
  • How much impact Our World in Data has by helping people understand the world
  • How to deal with the unreliability of development statistics
  • Why research shouldn’t be published as a PDF
  • Why academia under-incentivises data collection
  • The history of war
  • And much more

Final note: We also want to acknowledge other groups that did great work collecting and presenting COVID-19 data early on during the pandemic, including the Financial Times, Johns Hopkins University (which produced the first case map), the European CDC (who compiled a lot of the data that Our World in Data relied on), the Human Mortality Database (who compiled figures on excess mortality), and no doubt many others.

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type 80,000 Hours into your podcasting app. Or read the transcript below.

Producer: Keiran Harris
Audio mastering: Ryan Kessler
Transcriptions: Sofia Davis-Fogel

Highlights

Our World In Data

Max Roser: The mission that we have is to present the research and data on the world’s largest problems so that we can find ways to make progress against those problems. That’s what we want to achieve. And then the way that we hope to achieve this is basically twofold: On the one hand, there are already lots of people who have this idea, that work on a big global problem, whether it’s a health issue, or a disease, or whether it’s getting kids into school and improving education. And for all of those people, we just want to serve the information in an accessible and understandable format. That’s really the key of that.

Max Roser: And then we also have this second part of the mission, where we would like to expand that community. Where I think lots of people are just very concerned about large global problems, but are very far away from the research and from the data. And so they have only a poor understanding of what the problems really are, how they compare in size, whether we are making progress or not, etc. And so we have this idea that many people are concerned, but don’t actually know that it’s possible to do something about these problems, and that there are ways forward. And so we try to encourage them and motivate them to see that it’s worth dedicating time, effort, even a career, possibly, to support this kind of work.

How OWID prioritise topics

Max Roser: One difference between our team and much of academia is that we are much more demand-driven. So while a lot of academics have this idea that they want to work on a particular project and hope for the best that someone picks it up, we try to speak a lot to the users that we have and hear what they see as gaps and where they see that something’s missing. And then try to respond to the demand that is there. That could also be journalists that we value, and we hear from a lot of experts what they would want to see. For example, last week I was having dinner with Will MacAskill, and he said there’s an internal document that’s basically his wishlist that’s growing and growing as he wants to see more research and data. And we take this into account, obviously.

Max Roser: Another key consideration is who the people on our team are and what kind of work they can contribute. For example, Saloni Dattani, who joined us very recently, is an expert in health issues, she cares a lot about mental health. I think that’s an aspect that is under-discussed on a global scale. And so she was the perfect fit to take on this project. And then another key consideration always is that we try to fill some niche where others often haven’t already done great work.

COVID-19 gaps that OWID filled

Max Roser: It’s shifted a bit. At the beginning, we had two projects that we were working on. It was very much also just explaining how to think about these numbers. So we wrote these explanations of what the case fatality rate is and how it’s different from the infection fatality rate, and in which ways these two measures might differ. I know that lots of journalists were relying on these very basic explanations of the key metrics, and that became less and less important just because the media became better and better over the pandemic. Last February it wasn’t great, but I think now there have been really amazing journalists in many key outlets that do a great job. There was just a huge improvement. We did less and less of that.

Max Roser: And we did focus on the other job that we had of just cataloging the data, and that was bringing some of the existing international statistics together, the straightforward ones, cases and deaths, but then also increasingly things like excess mortality statistics, and hospitalization figures. Back in March, there were really several strands to the work that we were doing on COVID. One was explaining the key metrics and helping readers to make sense of what the case fatality, the infection fatality rates are, how these two measures differ, how they might change over the course of an outbreak, how the amount of testing is impacting these metrics. And that was helpful for a lot of journalists at the time. We were in touch with many journalists, and then we did less and less of that because the journalism around COVID just hugely improved over time. It wasn’t great back in February, March last year, but it’s pretty awesome right now. There are lots of really great people.
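Editor's illustration: the difference between the two fatality measures, and the way testing moves one of them, can be sketched in a few lines of Python. This is a minimal sketch, not Our World in Data's method. The outbreak figures are hypothetical, and it assumes for simplicity that every death is confirmed and counted, so only the number of detected cases changes with testing.

```python
def case_fatality_rate(confirmed_deaths: int, confirmed_cases: int) -> float:
    """Share of confirmed cases that ended in death."""
    return confirmed_deaths / confirmed_cases


def infection_fatality_rate(deaths: int, total_infections: int) -> float:
    """Share of all infections, detected or not, that ended in death."""
    return deaths / total_infections


# Hypothetical outbreak, for illustration only (not real data).
TOTAL_INFECTIONS = 10_000
DEATHS = 100


def cfr_at_detection_share(share: float) -> float:
    """CFR when testing detects the given share of all infections.

    Assumes every death is counted, so only the denominator moves.
    """
    confirmed_cases = round(TOTAL_INFECTIONS * share)
    return case_fatality_rate(DEATHS, confirmed_cases)


if __name__ == "__main__":
    print(f"IFR: {infection_fatality_rate(DEATHS, TOTAL_INFECTIONS):.2%}")
    for share in (0.2, 0.8):
        print(f"CFR with {share:.0%} of infections detected: "
              f"{cfr_at_detection_share(share):.2%}")
```

With these made-up numbers the infection fatality rate stays at 1% whatever the testing effort, while the case fatality rate falls from 5% to 1.25% as testing picks up more of the infections. That is one reason case figures are hard to read without knowing how much testing a country is doing.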

Max Roser: Then another strand of the work was to compile these aggregate datasets on international statistics. We took the confirmed cases and confirmed deaths from the European CDC, but we then later compiled many more sources. We did the testing database, that was in our hands. We did aggregate the data on excess mortality. We compiled survey information from people’s opinions, and then much more recently, obviously, the vaccination data. And the key job there was, on the one hand, to produce a clean spreadsheet that other people could then rely on, that they could pull into their reporting — so big news organizations could just pull our .csv file every morning and then update all of their statistics on their outlets. And the other one was to build the tools that make it possible to explore the data right there on our site, because that’s something I think even the ECDC was struggling with. They made the data available, but the tools to then actually visualize the data and compare countries and understand the data, that wasn’t great. And it’s also not their job in a way. Right? So I think that’s fair.
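Editor's illustration: the "pull the .csv every morning" workflow might look roughly like the sketch below. The column names (`location`, `date`, `new_cases`) and the sample rows are hypothetical stand-ins, not the real file's schema or data. The download step is left out because the interview gives no URL, so the function simply takes the file's text.

```python
import csv
import io

# Hypothetical sample standing in for the downloaded file's contents.
SAMPLE_CSV = """location,date,new_cases
Exampleland,2021-01-01,120
Exampleland,2021-01-02,95
Samplestan,2021-01-01,40
Samplestan,2021-01-02,55
"""


def latest_by_location(csv_text: str) -> dict:
    """Return the most recent row per location from a tidy CSV."""
    latest = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        location = row["location"]
        # ISO 8601 dates (YYYY-MM-DD) sort correctly as plain strings.
        if location not in latest or row["date"] > latest[location]["date"]:
            latest[location] = {
                "date": row["date"],
                "new_cases": int(row["new_cases"]),
            }
    return latest


if __name__ == "__main__":
    for location, record in sorted(latest_by_location(SAMPLE_CSV).items()):
        print(location, record["date"], record["new_cases"])
```

The point of a single clean file with one row per country and date is that a consumer needs nothing more elaborate than this: read the text, keep the newest row per country, and refresh the published figures.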

Vaccination data set

Max Roser: I think the aspect that we haven’t spoken so much about is the vaccination dataset. And that’s honestly one that I really got wrong. I would have not expected somehow that there would be so much attention being paid to the vaccination dataset, and I would have also not expected that it would be on us to produce this dataset. And I was really wrong on both counts. The vaccinations started in December, and we were all tired. We were all looking forward to Christmas. And then Edouard was suggesting that we should probably compile the vaccination data, since the first person here in the U.K. was vaccinated just then. And I was like, “No, we’re not going to do this. This is just… It can’t be on us.” I was like, “I want to take some time off. I want to see my parents over Christmas. We’re not going to do that. And also surely someone will do it.” And then his point was like, “Well, no one is doing it yet. And also it’s something that’s going to be fine if we just do a weekly update.” That was the point that convinced me: “It’s going to be fine if we do a weekly update.” And then he started by himself. And obviously there was so much attention to it. I think at the beginning, it was just because of this story that Israel was vaccinating so much faster than everyone else, and there was this huge discrepancy. Lots of countries, again, struggle to make their data available. So there was much more focus on it.

Max Roser: And suddenly it became this really full-time job just for him. He was sitting in his apartment in Paris, producing this spreadsheet that everyone from The Economist, The Financial Times, The New York Times, the WHO, the U.S. CDC, everyone is relying on his figures. On the one hand, I think it should, again, not be the situation. On the other hand, I’m really proud of him pushing for that and building this dataset and informing the public about what’s going on with the global vaccination roll out.

Data reliability and availability

Max Roser: It’s a huge issue. And we touched on it earlier in the discussion, where I was mentioning that I think too many resources go into the analysis of often poor data, and too few resources are actually given to improve the data in the first place. And data from poor countries is one of those areas where we know that the data is often of poor quality. But that’s true for data across many sectors, even in rich countries. And it’s one of our key efforts in this work to find this balance, because we always live in a world with imperfect data. There’s no data that’s ever perfectly accurate, but we have to see where the data is actually able to tell us something about the world, and what we should know about the data to make sense of it, and where we should instead stay away from it. So it’s a massive concern. And just really at the heart of our work.

Rob Wiblin: Does that maybe imply that, in your work, less might be more? That perhaps trying to be really comprehensive and presenting lots of different pieces of data about all of the countries could end up replicating this unreliable data? And perhaps people would end up giving too much weight to data that they should mostly be ignoring. And perhaps you should just focus on a smaller range of numbers, the highest quality, most reliable ones.

Max Roser: Yes, that’s a concern. And I think we often decide against working on a project because we don’t have good data to report on. But it’s also the case that there are arguments that push you in the opposite direction. For example, on mental health, all of the research that we have suggests that mental health disorders are just very common in countries around the world. And we want people to take mental health much more seriously as a global health issue. Now, the data that is available on a global scale is of poor quality. And so you’re caught up in this dilemma where, on the one hand, the data is poor and you would rather not publish it. On the other hand, by not making the data available, and presenting no information about it, you leave this massive global health problem without any reporting. And in this case, we decided that the data should be made available and should be discussed. And we’re working now with Saloni, who just joined us in the team, to get a better understanding of global mental health issues.

How to be more like OWID

Max Roser: Our key role and why our work is more interesting is that we have this cross-country international perspective, right? That’s of course something that a particular country wouldn’t do, and so there’s no fault there. On the other hand, it’s an issue also just of software. The tools that are readily available just aren’t that great. The fact that much of our team is actually busy building visualization tools shows that, right? One of the hardest things in getting Our World in Data off the ground was to find funding for that work, because foundations, anyone that gives out grants, wouldn’t quite understand that. They were like, “You want to build visualization tools? Why don’t you just use Excel?”

Max Roser: Or if they’re a bit fancier, like, “Why don’t you use Tableau?” And so the tools don’t exist as easily. If someone is looking for tools, I would give a shout out to Datawrapper. That’s a really nice software solution for publishing data on the web. And we are also trying to do this ourselves. Our advantage is that you can extract data out of a large database and visualize it, and all of our work is open source. And it’s also part of our mission to make those visualization tools more used in government agencies, international organizations.

Rob Wiblin: Yeah. It’s an interesting phenomenon. I’ve heard from quite a few people that it’s very hard to get resources to build platforms and tools that then other people are going to use to put to a lot of purposes. And I wonder whether it’s just that it’s harder to demonstrate at that stage what the concrete output is going to be, and what the value is going to be. It’s sufficiently far away from the final delivery point that it’s hard to prove to a grantmaker that it’s worthwhile.

Max Roser: Yes. And also there is not really a delivery point. I think that’s also a key aspect. You want to build these kinds of tools to be usable over a long period of time, and that’s where many of these tools fall short. There are often great efforts that are one-offs, right? There’s new money at this international organization, now they’ve built this amazing presentation of their data. But two years later, the web has moved on. There are new tools. The databases have changed, and it’s this half-broken tool. And so the key in our work is to keep maintaining this infrastructure and keep developing over a long period of time. And I think that would be something that would help international organizations if they see the presentation tools as part of their core work, that they actually have to have an in-house team that keeps on working with them, and that they don’t outsource it to an agency that does a one-off job that’s good for the big launch, but is broken a year later.

About the show

The 80,000 Hours Podcast features unusually in-depth conversations about the world's most pressing problems and how you can use your career to solve them. We invite guests pursuing a wide range of career paths — from academics and activists to entrepreneurs and policymakers — to analyse the case for and against working on different issues and which approaches are best for solving them.

The 80,000 Hours Podcast is produced and edited by Keiran Harris. Get in touch with feedback or guest suggestions by emailing [email protected].
