Scared Straight was a government programme that received billions of dollars of funding, and was profiled in an award-winning documentary. The idea was to take kids who committed crimes, show them life in jail, and scare them into embracing the straight and narrow.

The only problem? One meta-analysis found the programme made the kids more likely to commit crimes, and another more recent meta-analysis found no effect.1

Causing this much harm is rare, but when social programmes are rigorously tested, a large fraction of them don’t work.2

So, even if you’ve chosen a pressing issue, it would be easy to end up working on a solution to it that has very little impact.

Meanwhile, research finds that among solutions that do have positive effects, the best interventions within an area are far more cost effective than average, often achieving over 10 and sometimes 100 times as much for a given unit of resources.

In this article, we explain what we think the current research implies about how much solutions differ in effectiveness, why this should change how we approach making a difference, and how to find the best solutions within an area of practice.

How much do solutions differ in how well they work?

In recent years there’s been a wave of advocacy to stop the use of plastic bags. However, convincing someone to entirely give up plastic bags for the rest of their life (about 10,000 bags) would avoid about 0.1 tonnes of CO2 emissions. In contrast, convincing someone to take just one fewer transatlantic flight would reduce CO2 emissions by over one tonne — more than 10 times as much.3

And rather than trying to change personal consumption in the first place, we’d argue you could do even more to reduce emissions by advocating for greater funding of neglected green technology.

This pattern doesn’t just hold within climate change. Its significance was first pointed out in the field of global health, by Toby Ord’s article “The Moral Imperative toward Cost-Effectiveness in Global Health.”

He found data that compared different health interventions in poor countries (e.g., malaria nets, vaccines, types of surgery) in terms of how many years of healthy life they produce per $1,000 invested.

This data showed that the most cost-effective interventions were around 50 times as cost effective as the median and 23 times as cost effective as the mean, and that the distribution almost exactly obeyed the ‘80/20 rule’.

Intervention cost effectiveness in global health in order of disability-adjusted life years (DALYs) per $1,000 on the y-axis, from the DCP2.

This is an incredible finding, because it suggests that one person working on the most effective interventions within global health could achieve as much as 50 people working on a typical intervention.

We’ve since seen similar patterns among:

In fact, it seems to show up wherever we have data.

There are some reasons to think that this data overstates the true differences in effectiveness between different solutions — especially those that you can actually work on going forward.

One reason is that often the very top solutions in an area are already being done by someone else.

A more subtle reason is regression to the mean. All the estimates involve a lot of uncertainty and error. Some interventions will ‘get lucky’ and end up with errors that make them look better than they are, and others will be unlucky and look worse.

In fact, even if all the solutions were equally effective, random errors would make some look above average, and others below average.

If we pick out the interventions that appear best relative to the average, it’s more likely that they benefited from positive errors. This means that if we investigate them more, they’ll probably turn out worse than they seem.
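A minimal simulation sketch can make this concrete. It assumes an extreme, hypothetical case: every intervention has exactly the same true effect, and each estimate adds independent Gaussian noise. The function name, the noise level, and the sample sizes are illustrative choices, not figures from any of the studies discussed here.

```python
import random
import statistics


def simulate_regression_to_mean(n_interventions=1000, noise_sd=0.5, top_k=20, seed=0):
    """Return (mean first estimate, mean second estimate) for the apparent top performers.

    Assumption: all interventions share the same true effect (1.0), so any
    spread in the estimates comes purely from measurement error.
    """
    rng = random.Random(seed)
    true_effect = 1.0

    # Two independent noisy measurements of every intervention.
    first = [true_effect + rng.gauss(0, noise_sd) for _ in range(n_interventions)]
    second = [true_effect + rng.gauss(0, noise_sd) for _ in range(n_interventions)]

    # Select the interventions that looked best on the first measurement.
    top = sorted(range(n_interventions), key=lambda i: first[i], reverse=True)[:top_k]

    first_mean = statistics.mean(first[i] for i in top)
    second_mean = statistics.mean(second[i] for i in top)
    return first_mean, second_mean


first_mean, second_mean = simulate_regression_to_mean()
print(f"Apparent effect of the top picks: {first_mean:.2f}")
print(f"Effect of the same picks when re-measured: {second_mean:.2f}")
```

In this setup, the top picks look far better than average on the first measurement only because they were selected for lucky errors. When re-measured, they fall back towards the shared true effect of 1.0.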

This is what seems to have happened in practice. For instance, the data that Toby used found that deworming children was among the most cost-effective solutions in the dataset. However, the charity evaluator GiveWell discovered errors in the estimates in the study, and the studies behind them have been called into question in a debate that became known as the ‘Worm Wars.’

But here’s the final twist: in 2021, after 10 years of further research and scrutiny aiming to correct for these effects, GiveWell still recommends deworming charities as among the most cost effective in global health. This is true even though their best guess is that deworming is only about 10% as cost effective as the original estimates.

So, the original estimates were too optimistic, and overstated the spread from best to typical, but deworming still appears to be much more cost effective than average — likely still over 10 times better.

In fact, global health experts still believe that the best ways of saving lives in poor countries are around 100 times cheaper than the average.4

On the other hand, the data in these studies could also understate the true differences in effectiveness between solutions. One reason is that the data only covers solutions with easy-to-measure results that can be studied in trials, but the highest-impact ways of doing good in the past most often involved research or advocacy, rather than measurable interventions. This would mean the very best solutions are missing from the datasets.

For instance, a comparatively small number of people worked on the development of oral rehydration therapy, which now saves around one million lives per year. This research was likely extremely cost effective, but we couldn’t directly measure its effectiveness before it was done.

Looking forward, we think there’s a good case that medical research aimed at helping the global poor will ultimately be more cost effective than spending on direct treatments, increasing the overall degree of spread.

There’s a lot more to say about how much solutions differ in effectiveness, and we’d like to see more research on it. However, our overall judgement is that it’s often possible to find solutions that bring about 10 times as much progress per year of effort as other commonly supported solutions in an area, and it’s sometimes possible to find solutions that achieve 100 times as much.

Technical aside: theoretical arguments about how much solutions differ

You might think that it’s surprising that such large differences exist. But there are some theoretical arguments that it’s what we should expect:

  • There isn’t much reason to expect the world of doing good to be ‘efficient’ in the same way that financial markets are, because there are only very weak feedback loops between having an impact and gaining more resources. The main reward people get from doing good is often praise and a sense of satisfaction, but these don’t track the actions that are most effective. We don’t expect it to be entirely inefficient either — even a small number of effectiveness-minded actors can take the best opportunities — but we shouldn’t be surprised to find large differences.

  • A relatively simple model can give a large spread. For instance, if cost effectiveness is the product of several independent factors, we’d expect it to be approximately log-normally distributed, and the log-normal distribution is heavy-tailed.

  • Heavy-tailed distributions seem the norm in many similar cases, for instance how much different experts produce in a field, which means we shouldn’t be surprised if they come up in the world of doing good.

  • If we think there’s some chance the distribution is heavy-tailed and some chance it’s normally distributed, then in expectation it will be heavy-tailed, and we should act as if it is.
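The multiplicative model in the second point above can be sketched with a small simulation. Everything here is a hypothetical assumption for illustration: six independent factors per solution, each drawn uniformly between 0.1 and 1.0, combined once by multiplying and once by adding.

```python
import math
import random
import statistics


def best_to_median_ratios(n_solutions=10_000, n_factors=6, seed=0):
    """Compare the best-to-median ratio under multiplicative and additive models.

    Assumption: each solution's cost effectiveness depends on n_factors
    independent factors, each uniform on [0.1, 1.0] (an arbitrary choice).
    """
    rng = random.Random(seed)
    products, sums = [], []
    for _ in range(n_solutions):
        factors = [rng.uniform(0.1, 1.0) for _ in range(n_factors)]
        products.append(math.prod(factors))  # multiplicative model
        sums.append(sum(factors))            # additive model, for contrast

    product_ratio = max(products) / statistics.median(products)
    sum_ratio = max(sums) / statistics.median(sums)
    return product_ratio, sum_ratio


product_ratio, sum_ratio = best_to_median_ratios()
print(f"Best vs. median, factors multiplied: {product_ratio:.1f}x")
print(f"Best vs. median, factors added: {sum_ratio:.1f}x")
```

Under these assumptions, multiplying the factors puts the best solution many times above the median, while adding the same factors keeps the best within a factor of two of the median. The exact numbers depend on the arbitrary choices above.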

What do these findings imply?

If you’re trying to tackle a problem, it’s vital to look for the very best solutions in an area, rather than those that are just ‘good.’

This contrasts with the common attitude that what matters is trying to ‘make a difference,’ or that ‘every little bit helps.’ If some solutions achieve 100 times more per year of effort, then it really matters that we try to find those that make the most difference to the problem.

This is why we highlight finding an effective solution as one of the four key drivers of your long-term impact, along with how pressing the problem is, how much leverage you have, and your degree of personal fit. It can be worth working on a less effective solution if that path does well on the other three factors, but it’s one key thing to consider, especially once you reach the stage of your career where you’re trying to directly tackle problems rather than build career capital.

So, how can we find the most effective solutions in an area?

Hits-based vs. evidence-based approaches

There are two broad approaches among our readers:

  1. The evidence-based approach: look for data about how much progress per dollar different solutions achieve, ideally randomised trials, and focus on the best ones.
  2. The hits-based approach: look for rules of thumb that make a solution more likely to be among the very best in an area, while accepting that most of the solutions will be duds.

We generally favour the hits-based approach, especially for individuals rather than large institutions, and for people who are able to stay motivated despite a high chance of failure.

Why? As noted, the best solutions typically can’t be measured with trials, and so will be automatically excluded if you take the evidence-based approach. This is a serious problem because if the best solutions are far more effective than typical, it could be better to pick randomly among solutions that might be the very best, rather than to pick something that’s very likely to be better than average but definitely not among the very best.
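The expected-value reasoning behind this argument can be shown with a toy calculation. The numbers are purely hypothetical: a reliable option that always achieves 2 units of progress, and a long shot with a 10% chance of achieving 100 units and a 90% chance of achieving nothing.

```python
def expected_value(lottery):
    """Expected outcome of a list of (probability, outcome) pairs."""
    total_probability = sum(p for p, _ in lottery)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * outcome for p, outcome in lottery)


# Hypothetical numbers, chosen only to illustrate the argument.
reliable_option = [(1.0, 2.0)]              # always twice as good as average
long_shot = [(0.9, 0.0), (0.1, 100.0)]      # usually a dud, occasionally a hit

print("Reliable option:", expected_value(reliable_option))
print("Long shot:", expected_value(long_shot))
```

With these made-up figures, the long shot achieves about five times as much in expectation, even though it fails nine times out of ten. The comparison reverses if the chance of a hit, or its size, is much smaller than assumed.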

Another argument is that many institutions with social missions seem overly risk-averse. For instance, government grant agencies get criticised heavily for funding failures, but the employees at such agencies don’t get much reward when they back winners. This suggests that individuals who are willing to take risks can get an edge by supporting solutions that have a high chance of not working. You can read more about the arguments for a hits-based approach.

One response to the hits-based approach is that it relies on deeply uncertain judgement calls, instead of objective evidence. That’s true, but we contend that you can’t escape relying on judgement calls. All our actions lead to ripple effects lasting long into the future. By taking an evidence-based approach, even if we suppose the evidence is fully reliable, at best you can measure some of the short-term effects, but you’ll need to rely on judgement calls about the longer-term effects, which comprise the majority of all the effects. Learn more about cluelessness.

Given that judgement calls are unavoidable, the best we can do is to try to make the best judgement calls possible, using the best available techniques.

What does taking a hits-based approach involve in practice?

In short, we need to seek rules of thumb that make a solution more likely to be among the very best in an area (while unlikely to be negative). This could involve methods like the following:

In practice, this often ends up with a focus on research, movement building, policy change, or social advocacy, and on solutions that are unfairly neglected or seem unconventional.

In applying these frameworks, you can either try to do this analysis yourself, or find experts in the area who understand the need to prioritise and can do the analysis on your behalf (or a mix of both).

We’re generalists rather than experts in the areas we recommend, so we mainly try to identify good experts — such as those on our podcast — and synthesise their views about how to tackle the problems we write about in our problem profiles.

If you aim your career at tackling a specific issue, however, then you’ll probably end up knowing more about it than us, and so should put more weight on your own analysis.

Many of the areas we recommend are also small, so not much is known about how best to tackle them. This makes it easier than it seems to become an expert. It also means that your input on which solutions are best is especially valuable.

Following expert views also doesn’t necessarily mean choosing ‘consensus’ picks, because those might fall into the trap of being pretty good but not best. Rather, if a minority of experts strongly supports an intervention (and the others don’t think it’s harmful), that might be enough to make it worth betting on. In brief, we aim to consider both the strength of views and how good the interventions seem, and are willing to bet on something contrarian if the upsides might be high enough.

For example, Sophie Rose switched to studying pandemic prevention due to our advice. When COVID-19 broke out, she realised human challenge trials could speed up vaccine development by many months, saving millions of lives. They also had a lot of public support, but weren’t being permitted by regulators. So she co-founded 1DaySooner, which signed up volunteers for such trials, and one was eventually started in London in early 2021.

This didn’t turn out to be fast enough to make a noticeable difference to the COVID-19 outbreak, but we think it was a bet worth taking.

More importantly, if there’s another pandemic, 1DaySooner’s work means human challenge trials could be ready to go right away, enabling us to develop vaccines far faster.

Notes and references

  1. van der Put, Claudia E., et al. “Effects of awareness programs on juvenile delinquency: A three-level meta-analysis.” International Journal of Offender Therapy and Comparative Criminology, vol. 65, no. 1, 2021, pp. 68-91. Archived link

    Petrosino, Anthony, et al. “Scared Straight and other juvenile awareness programs for preventing juvenile delinquency: A systematic review.” Campbell Systematic Reviews, vol. 9, no. 1, 2013, pp. 1-55. Archived link

  2. The percentage that work or don’t work depends a lot on how you define it, but it’s likely that a majority don’t have statistically significant effects.

  3. According to the 2020 Founders Pledge Climate & Lifestyle Report, just one round trip transatlantic flight contributes 1.6 tonnes of CO2. Figure 2 of the same report shows the comparatively negligible effect of reusing plastic bags.

  4. Caviola, Lucius, et al. “Donors vastly underestimate differences in charities’ effectiveness.” Judgment and Decision Making, vol. 15, no. 4, 2020, pp. 509-516. Link

    We selected experts in areas relevant to the estimation of global poverty charity effectiveness, in areas such as health economics, international development and charity measurement and evaluation. The experts were identified through searches in published academic literature on global poverty intervention effectiveness and among professional organizations working in charity evaluation.

    We found that their median response was a cost-effectiveness ratio of 100 (see Table 1).

  5. Some common types of bottleneck:

    – Funding – additional financial resources from donations or fundraising.
    – Insights – new ideas about how to solve the problem.
    – Awareness and support – how many people know and care about the issue, and how influential they are.
    – Political capital – the amount of political power that’s available for the issue.
    – Coordination – the extent to which existing resources effectively work together.
    – Community building – finding other people who want to work on the issue.
    – Logistics and operations – the extent to which programmes can be delivered at scale.
    – Leadership and management – the extent to which concrete plans can be formed and executed using the resources already available.