#39 – How much should you change your beliefs based on new evidence? Spencer Greenberg on the scientific approach to solving difficult everyday questions
Will Trump be re-elected? Will North Korea give up their nuclear weapons? Will your friend turn up to dinner?
Spencer Greenberg, founder of ClearerThinking.org, has a process for working out such real life problems.
Let’s work through one here: how likely is it that you’ll enjoy listening to this episode?
The first step is to figure out your ‘prior probability’: your estimate of how likely you are to enjoy the interview before getting any further evidence.
Other than applying common sense, one way to figure this out is ‘reference class forecasting’. That is, looking at similar cases and seeing how often something is true, on average.
Spencer is our first ever return guest (Dr Anders Sandberg appeared on episodes 29 and 33 – but only because his one interview was so fascinating that we split it into two).
So one reference class might be: how many Spencer Greenberg episodes of the 80,000 Hours Podcast have you enjoyed so far? Being this specific limits bias in your answer, but with a sample size of just one, you’ll want to add more data points to reduce the variance of your estimate (100% and 0% are both too extreme).
Zooming out, how many episodes of the 80,000 Hours Podcast have you enjoyed? Let’s say you’ve listened to 10, and enjoyed 8 of them. If so, 8 out of 10 might be a reasonable prior.
If we want a bigger sample we can zoom out further: what fraction of long-form interview podcasts have you ever enjoyed?
Having done that, you’d need to update whenever new information became available. Do the topics seem more interesting than average? Did Spencer make a great point in the first 5 minutes? Was this description unbearably self-referential?
In the episode we’ll explain the mathematically correct way to update your beliefs over time as new information comes in: Bayes Rule. You take your initial odds, multiply them by a ‘Bayes Factor’, and boom – updated probabilities. Once you know the trick, it’s easy to do in your head. We’ll run through several diverse case studies of updating on evidence.
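To make the trick concrete before you listen, here’s a minimal sketch of the odds-multiplication step in Python. The 8-in-10 prior comes from the hypothetical above; the 3-to-1 Bayes factor is an invented number purely for illustration.

```python
def update_probability(prior_prob, bayes_factor):
    """Bayes Rule in odds form: posterior odds = prior odds x Bayes factor."""
    prior_odds = prior_prob / (1 - prior_prob)    # 0.8 becomes 4-to-1 in favour
    posterior_odds = prior_odds * bayes_factor    # multiply by the Bayes factor
    return posterior_odds / (1 + posterior_odds)  # convert odds back to a probability

# Hypothetical: an 8-in-10 prior, then evidence you'd be 3x more likely to see
# if you'll enjoy the episode than if you won't.
print(update_probability(0.8, 3))    # ~0.92
print(update_probability(0.8, 1/3))  # ~0.57 (equally strong evidence pointing the other way)
```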
Speaking of the Question of Evidence: in a world where Spencer was not worth listening to, how likely is it that we’d invite him back for a second episode?
Also in this episode:
- How could we generate 20-30 new happy thoughts a day? What would that do to our welfare?
- What do people actually value? How do EAs differ from non-EAs?
- Why should we care about the distinction between intrinsic and instrumental values?
- Should hedonistic utilitarians really want to hook themselves up to happiness machines?
- What types of activities are people generally under-confident about? Why?
- When should you give a lot of weight to your existing beliefs?
- When should we trust common sense?
- Does power posing have any effect?
- Are resumes worthless?
- Did Trump explicitly collude with Russia? What are the odds of him getting re-elected?
- What’s the probability that China and the US go to war in the 21st century?
- How should we treat claims of expertise on nutrition?
- Why were Spencer’s friends suspicious of Theranos for years?
- How should we think about the placebo effect?
- Does a shift towards rationality typically cause alienation from family and friends? How do you deal with that?
Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type 80,000 Hours into your podcasting app. Or read the transcript below.
The 80,000 Hours podcast is produced by Keiran Harris.
Highlights
The other way I see it happen is the people around you in your social circle only accept, as a community, certain intrinsic values, right? So, you think “I’m only supposed to have certain intrinsic values”, but you actually have other ones. So what you try to do is you try to recast your other intrinsic values in terms of these ones that are acceptable or valid. An example of this would be when someone says something like, “Well, the reason I cultivate friendship is because it makes me more effective.” I think what’s going on there very often is kind of self-deception where people feel like they’re not supposed to have these intrinsic values, so they kind of trick themselves into thinking that they don’t have other intrinsic values.
This leads to what I think is a really important and subtle point: whatever you believe is objectively true about value in the universe, and whatever you believe are the right values you’re supposed to have according to your social group, those things are independent from what your current intrinsic values are. Your current intrinsic values are like a psychological fact. Like, a scientist could study your intrinsic values and answer the questions. It’s a fact about yourself. It’s not a fact about the universe. I think it’s very important to draw that distinction and say you might believe in objective moral truth and you might believe you figured out what it is, but it doesn’t mean that’s what your intrinsic values are right now. Maybe you aspire to make your intrinsic values match them more closely, but they’re probably not there yet, and if you don’t draw that distinction, you might end up having this very bizarre doublethink where you basically deceive yourself and create these weird psychological effects, potentially harmful.
So if you think about the marathon [example], people might think it’s difficult, which could cause them to be underconfident, they may not view it as part of their personality or character potentially, so that could explain why they might be underconfident. People are not experienced so that could explain why they’re underconfident. They might say it’s not a matter of personal opinion so that could explain why they’re underconfident and so on. So you see, the marathon one lines up pretty well with a bunch of these traits, actually, to explain why people might be underconfident.
Bayesianism is [a] probabilistic, mathematical theory of how much to change your beliefs based on evidence. The way I like to think about this is actually using an English language phrase that we like to call the Question of Evidence: how likely would I be to see this evidence if my hypothesis is true, compared to if it’s false?
So let’s say if you got a three to one ratio, like you’re three times more likely to see this evidence if my hypothesis is true than if it’s false, that gives you a moderate amount of evidence. If it’s 30 to one, you’re 30 times more likely to see this evidence if your hypothesis is true than if it’s false, that’s really strong evidence. If it’s just one, you’re as likely to see this evidence if your hypothesis is true as if it’s false, that’s no evidence, it actually doesn’t push you in any way, and then if it’s one in three, one third, then that pushes you in the opposite direction, it’s moderate evidence in the opposite direction. One in thirty would be strong evidence in the opposite direction.
So, I think what a lot of people don’t realize is all these equations and so on can be very confusing but there’s the English language sentence which is the only way to say how strong evidence is. That is the right sentence. Other sentences that sound similar, actually are not the right way to quantify evidence.
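For readers who want to see the arithmetic behind that sentence, here’s a minimal sketch; the probabilities are invented purely for illustration, borrowing the “would we invite him back?” example from the episode description above.

```python
def bayes_factor(p_evidence_if_true, p_evidence_if_false):
    """The Question of Evidence as a number: how many times more likely is this
    evidence if the hypothesis is true than if it's false?"""
    return p_evidence_if_true / p_evidence_if_false

# Hypothetical numbers: a second invitation feels 60% likely if Spencer is worth
# listening to, and only 2% likely if he isn't.
print(bayes_factor(0.60, 0.02))  # 30.0 -- strong evidence on Spencer's rough scale
# Roughly: ~3 is moderate evidence, ~30 is strong, 1 is no evidence at all,
# and 1/3 or 1/30 push you equally hard in the opposite direction.
```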
Very often when people are doing long, complex projects, they underestimate how long they’ll take. They’ll also often underestimate how much it will cost, how many resources they’ll use. There’s a bunch of theories around why this is. It’s commonly called the Planning Fallacy. One theory is that, when you’re thinking about a long, complex project, you know on some level that some things will go wrong, but it’s very hard to know what will go wrong. It’s gonna be sort of idiosyncratic. So your brain kind of smooths over and says, “Well this thing is probably not gonna go wrong and that thing’s probably not gonna go wrong,” and so each individual thing, you kind of assume it’s gonna go right. But of course, there’s a good chance something will go wrong that you never even thought of.
Articles, books, and other media discussed in the show
- Clearer Thinking interactive tools and mini-courses
- Mind Ease app for anxiety
- Positly for recruiting research participants
- Spark Wave
- Placebo effects are weak: regression to the mean is the main reason ineffective treatments appear to work on DC’s Improbable Science blog, 2015.
- Does spinach have lots of iron? Is that a myth? Or is it a myth that it’s a myth? Who Will Debunk The Debunkers? by Daniel Engber on FiveThirtyEight.
- TEDxBlackRockCity – Spencer Greenberg – Improve Your Life With Probability
- EA Entrepreneurship – Spencer Greenberg – EA Global 2015
- Thinking, Fast and Slow by Daniel Kahneman, 2011.
- Destined for War: Can America and China Escape Thucydides’s Trap? by Graham Allison, 2017.
Transcript
Robert Wiblin: Hi listeners, this is the 80,000 Hours Podcast, the show about the world’s most pressing problems and how you can use your career to solve them. I’m Rob Wiblin, Director of Research at 80,000 Hours.
I listen to about 20 hours of audio on my phone each week, so I care a lot about making sure I’m absorbing what I’m hearing as efficiently as possible. If you subscribe to podcasts on your phone app, you’ll be able to customise the speed at which it plays each show, so it’s not so slow you get bored, but not so fast you can’t keep up. Many apps also let you chop out any silences, though rather than do that I prefer to just play it a bit faster.
The densest podcasts I listen to at 1.2x and the fluffiest at 2.3x or so.
Personally I use BeyondPod Podcast Manager on Android and I hear the best podcast app for iPhones is called Overcast.
I also use Pocket, an app which grabs articles from the internet for you to read on your phone later. Fortunately it’s able to read it to you from your phone, because my hands get sore if I have to scroll down my phone too much. It takes a while to get used to its style, as its pronunciation isn’t perfect, but after a few hours I didn’t have trouble following any more.
For audiobooks I use Audible, but you probably already know about that one.
Regardless of what I’m listening to I pay close attention to the speed so I don’t waste time listening to something less quickly than I could handle. You might like to do the same!
I should quickly apologise about the audio quality at the start of this episode, which is a bit worse than usual. Fortunately I noticed I was recording on the wrong microphone a quarter of the way through, so it’s back to normal quality from 36 minutes in.
Alright, here’s Spencer.
Robert Wiblin: Today I’m speaking with Spencer Greenberg, whose reviews the first time around were so positive that he has the privilege of being the first guest on the show to be interviewed twice. Spencer is still an entrepreneur, and he founded Spark Wave, a startup foundry which creates novel software products designed to solve problems in the world such as scalable care for depression and tools for improving social science.
He also founded ClearerThinking.org, which offers free tools and training programs, used by over 150,000 people, designed to improve decision making and reduce biases in people’s thinking, which is going to be a big topic of our conversation today.
His background is in mathematics and he has a PhD in Applied Math from NYU with a specialty in machine learning. Spencer’s work has been featured in major media outlets such as the Wall Street Journal, the Independent, Lifehacker, Gizmodo, Fast Company, and the Financial Times, and I could have said more, but I won’t. So, thanks for coming back on the podcast, Spencer.
Spencer Greenberg: It’s great to be here. Thanks so much for having me.
Latest projects
Robert Wiblin: So, today we hope to talk about how to reason through difficult questions more accurately and assess evidence and also when we should expect to be overconfident and when to be underconfident. But first, what have you been focused on in the ten months since we last spoke?
Spencer Greenberg: Yeah, so things have been moving along with Spark Wave. As you mentioned, we build new software products to try to solve problems in the world. If a product is sufficiently promising after we build the first version, we’ll recruit a CEO, with the ultimate goal of spinning out a new company. And so, we actually have four portfolio companies right now where we’ve built the first version of the product and recruited a CEO. So, one of them is UpLift, which is our automated program for helping people with depression. We’re trying to reduce depression symptoms. And that’s run by Eddie Liu. And then we have Mind Ease, which is trying to be the best software product in the world for calming you down when you’re having significant anxiety. It’s run by Peter Brietbart. We have Clearer Thinking run by Aurora Quinn-Elmore. And Positly, which is our recruitment platform for recruiting people for studies. It’s run by Luke Freeman.
Robert Wiblin: Cool. So, I think last time we talked about UpLift. How’s that going? And when might people expect to be able to use it?
Spencer Greenberg: Things are going really well. You know, we have people using UpLift every week, but we’re still in a closed beta, because we have some things we know we need to improve in that system. Almost done with that. We’re going to release it really soon.
Robert Wiblin: Okay. Nice. And how are the other projects coming along? Will we be able to see the results any time in the next year or two?
Spencer Greenberg: Yeah, definitely. I mean, I’m really excited about all four of these products. So, Positly, we actually just got Positly out on positly.com. Basically the idea is trying to accelerate social science research by helping you recruit the right participants at the right time and get them to do the right thing. So, let’s say you’re running a study, a scientific study, and you need 100 women age 20 to 24, and maybe you want to divide them into groups and have them do different things, and track them over time, have them fill out surveys at different points. That kind of research can be very complicated, expensive, time consuming and difficult to get right, and we want to make things like that much, much easier to help accelerate people’s work. And then also on the side of developing products, you know, if you look at why companies fail, very commonly, probably the most common reason is they build a product that people really don’t need or want. So, we think there’s huge room for helping companies be better at releasing new, better products by getting the data back from potential customers much earlier and more robustly. And so we want to help with that as well.
Robert Wiblin: How are you recruiting these people? Is this all Mechanical Turk?
Spencer Greenberg: So, the idea of the system is essentially to have multiple back ends to help you get just the right participants you need. Whether you’re weighing “I want the cheapest sample possible really fast”, or “I want a representative national sample”, or “I want a sample from a specific place”, we’re going to be building out these different back ends. So the first back end we have built out is Mechanical Turk, where we can layer a lot of really nice features on top of it to make it really nice for researchers. But then we’ll have these different back ends coming for just getting the perfect sample for you.
Robert Wiblin: Yeah, and what’s new with like Clearer Thinking?
Spencer Greenberg: So, Clearer Thinking, we’ve got a lot of really exciting stuff happening. We had some long term studies that we’ve been working on. One is actually on creating happiness habits, and this one I’m really excited about ’cause I literally think it increased my happiness five percent. I know that’s a bold claim, and it’s stuck now for three or four months, you know? Fingers crossed. I don’t know whether that will last forever. But the idea of that project is: imagine that you had, let’s say, 30 happy thoughts a day that you wouldn’t normally have, like what would that do to you? I think it’s a reasonable hypothesis that if people actually had 30 extra happy thoughts throughout the day, it would make you a happier person. So, how do we make that happen? And so, our idea was to try to make that happen in the simplest way possible. By installing a trigger in your life that already happens, let’s say, 20 or 30 times a day and making that trigger so strongly associated with the happy thought that it actually triggers the happy thought.
So for me, when we were developing the study and I was testing it, the trigger I set for myself was before I just check social media, when I have the thought of checking social media, I would think of something I’m grateful for, and it just worked to a crazy degree. I’m just way more grateful than I’ve ever been in my life. It’s just really been a wonderful experience.
Robert Wiblin: Is this a bit like temptation bundling? So, maybe checking social media is a bit of a bad habit, but then you’re bundling it with this thing that’s helpful to you?
Spencer Greenberg: Well, I think of temptation bundling a little differently. I do use temptation bundling.
Robert Wiblin: Yeah?
Spencer Greenberg: It’s also one of my favorite techniques. I use it to work out, where basically there are TV shows that I only watch if I’m working out while I’m doing it. You choose a motivator that makes the working out feel more pleasant. But this is a little bit different. This is really about creating a trigger in your environment. The trigger’s a certain type of thought. And really the whole thing is that at first it would only happen when I would go check social media. Now it happens whenever I think about social media and I often don’t even go check it. I just have a thought, oh, wait, then I feel grateful, and then, you know, I don’t even need to check the social media. It’s really cool.
Robert Wiblin: Okay, so the goal is to think of something you’re grateful for. Now do you ever have difficulty doing that?
Spencer Greenberg: So, this is another funny thing that happens psychologically. So, at first I would every time think of something I’m grateful for, but then it started getting almost shorter and shorter until it was almost just like a happy mental maneuver. It’s a little hard to describe. But it’s just sort of like this happy feeling that I produce at that moment, after having done hundreds of repetitions of it. We’re going to be running a study on this technique. It’s coming up soon, and then if we have good results, the goal would be to turn it into a tool so anyone could use the tool to try to create these happy associations to, you know, generate 20 or 30 more happy thoughts every day.
Robert Wiblin: Yeah. What kind of happy thoughts do you have? Are you able to share them?
Spencer Greenberg: For me?
Robert Wiblin: Yeah.
Spencer Greenberg: Really for me I used gratefulness, so trying to think about all the things I have in my life. I have a funny way of doing gratitude that is especially effective for a certain type of person. I imagine myself not having the thing, and then I remember that I have it, and I feel really great about that. But that technique doesn’t necessarily generalize, ’cause I’ve shown that technique to some other people, and they say when I imagine not having it, it’s so crushing that it doesn’t work. But that’s how I do it. So, I’ll be like, “Oh wow. Tea exists in the world. It’s such a wonderful thing.” And then I’ll contrast that. Imagine tea didn’t exist and it was impossible to get tea, and then I’m like, “Oh, but there is tea.” And that makes me feel really good.
Robert Wiblin: Yeah, after we spoke last time I tried out the UpLift app. I managed to get a link to it. It was closed beta. It was really good. Unfortunately, I became happier after a couple of weeks of using it, so I didn’t manage to complete the course.
Spencer Greenberg: I’m really glad to hear that. Not necessarily causal, but regardless whether it’s causal, I’m really glad to hear that you’re feeling happy.
Robert Wiblin: Yes. I mean, I thought it was very good material and I found it easy to motivate myself to do it ’cause you had all of these entertaining explanations and examples through it. So, as soon as that goes public, I think I’ll announce it on the show and encourage people to give it a crack.
Spencer Greenberg: That’s great.
Intrinsic and instrumental values
Robert Wiblin: All right. Let’s move on to a talk you gave at EA Global last week on intrinsic values and instrumental values, where you did a big survey trying to figure out what people actually value for its own sake. What did you find out?
Spencer Greenberg: This is a topic I’ve been thinking about a lot lately. So, first let me try to define an intrinsic value. So, there are many things we all value like money, food, et cetera. But most of the things we value we only value for their effects. So, imagine something and say, “Do I value it?” If the answer’s yes, you can say to yourself, “Well, would I continue to value it even if it got me nothing else? Even if there were no effects?” If so, it’s an intrinsic value. Or another way of putting that is an intrinsic value is something you value for its own sake not [only] for the effects that it has. So for many people like their own pleasure is an intrinsic value, but money is not an intrinsic value, because money if it couldn’t get us any effects, like say, there was hyperinflation, it would be worthless, like we wouldn’t care about money at all.
Robert Wiblin: Okay, so who did you survey?
Spencer Greenberg: We ran a survey where we got a bunch of different participants. Some of them fully identify as effective altruists, some of them partially identify as effective altruists, and some of them are not effective altruists. We looked at four different variables: age, how liberal or conservative people are, gender, and whether they identify as an effective altruist, and we looked at what intrinsic values people report having. And this was a very challenging study to run, because it’s very hard first of all to get people to understand exactly what an intrinsic value is, but it’s even harder to get them to kind of do the proper mental maneuver of assessing whether they value a thing even if it doesn’t have any effects.
So, all those results, you definitely have to take them with a grain of salt keeping in mind that it’s tough to really get people to do this properly.
Robert Wiblin: Yeah, so you had some checks at the beginning of the survey, right? To see whether people had understood, then you tried to exclude people who didn’t get what the questions actually meant.
Spencer Greenberg: Exactly. We excluded a lot of participants because basically we taught them about the idea, then we gave them a quiz to make sure they understood it, and if they got more than one question wrong on the quiz, we excluded them, and then we actually had them state in their own words what an intrinsic value is and we read all of those, and if they seemed incompatible with our definition, we also excluded them. So, we excluded a lot of participants before analyzing the data, but even still, you have to take it with a grain of salt. But yes, we tried hard to make the data as valid as we could.
Robert Wiblin: So, what were the main findings from the study?
Spencer Greenberg: Yeah, so we looked at which intrinsic values were associated with different groups of people. To give you some examples: for conservatives, we found they tended to report an intrinsic value around religion and things like retribution, where those who’ve done bad things get punished for it. And also the preservation of existing values, which makes a lot of sense, ’cause to some people that’s really what conservative means. Kind of preservation values. On the other hand, people who reported being liberal tended to value things like animal well-being, nature, and happiness of strangers.
Now, I should point out what we’re looking at here are the things that differentiated each group. In other words, not the things that necessarily conservatives valued highest and liberals valued the highest, but the things that differentiated them from all other groups.
Robert Wiblin: What was distinctive.
Spencer Greenberg: Exactly. Exactly. So, we also looked at females versus males. Females reported intrinsic values around kindness or caring, diversity and human freedom more often. Males reported ones around their own selfish interests, the interests of people they know personally, and also around pleasure of strangers. When going to older versus younger, we found that older people reported intrinsic values around being cared about or trusted also about society’s morality in general. Younger people around things like animal life-spans and that they themselves are admired and also that the people they know have pleasure. And finally, looking at effective altruists versus non-effective altruists. There really we found a very clear pattern, effective altruists tended to value the happiness and suffering of all conscious beings, and that was really the big differentiator from other groups.
Robert Wiblin: Yeah, I think you broke them into universal and non-universal values, is that right? Things that are not specifically about your life or about people you know, versus ones that were specifically about you, people you know, or things about your own life.
Spencer Greenberg: Exactly. I like to categorize intrinsic values in three groups. “Self” ones that are like, “my own pleasure”, things like that. Then there’s community ones which are about people that are special to you. So either people from your in-group or people who are your friends or whatever. And then universal ones are all the others. They’re not necessarily about people, but they could be about people. They could say I care about the suffering of all conscious beings or I could be I care about there being beautiful things in the world. Right?
Robert Wiblin: Yeah. And what did you find there?
Spencer Greenberg: Well, one of the interesting things is that most people actually did report having at least one universal intrinsic value. I think that’s kind of interesting, because universal intrinsic values are reasons for people who don’t know each other, essentially people even on different sides of the world from each other, to cooperate and work together, because there’s something they care about that’s not about themselves, not about the people they know personally, and they can kind of collaborate. I think one way to think about effective altruism is it’s people that have a universal intrinsic value around reducing suffering around the world and increasing well-being, getting together to figure out how do we do that. How do we all work together to increase that.
Robert Wiblin: Yeah, so I guess we’ve talked about some of the differences, but just like what are the just boring findings? What are the [things] most people care about?
Spencer Greenberg: So, if we look at people who don’t identify as effective altruists we found that 82% of them reported that “I love other people” as an intrinsic value. Kind of interesting. “That I myself feel happy” was 74%, not too surprising. “That I continue to care about other people” 74%. “That beautiful things continue to exist” 71%. “That people I know personally feel happy” 69%. So, nothing like super surprising, but there’s a long list of these that a lot of people report as intrinsic values. Maybe they’re making a mistake. Maybe they’re misunderstanding their intrinsic values, but I think it’s at least worth keeping an open mind that maybe people actually do kind of intrinsically value these things.
And one way to think about that, like what is really going on here, is that if we think of our brain as a machine, one of the operations that our machine brains have is this “I value this thing” operation. Right? And so, it isn’t necessarily that strange or surprising to say there might be quite a lot of different ideas or concepts or things on which our brain does this “I value this” operation, even when you remove all the effects of that thing.
Robert Wiblin: Yeah, so for example, we’ve got 71% of people who didn’t identify as effective altruists valuing that beautiful things continue to exist. Do you think that you’d get that high a number if you said that beautiful things continue to exist, but nobody sees them? Which kinda seems like it should be a requirement if you’re ruling out any impacts of that.
Spencer Greenberg: That’s a really good question. I imagine that probably would reduce the number, ’cause it would cause people to second guess and think twice about that as an intrinsic value, but interestingly enough I know two people who are very savvy philosophically and introspective, and they both claim that beautiful things existing, if nobody sees them, is an actual intrinsic value. I had this talk with them, they’re like, “No, no, no.” I get it. I actually value these things even if it’s on an alien world that no-
Robert Wiblin: That no one ever sees.
Spencer Greenberg: … conscious beings will ever see. So, you know, I think what we’re really getting at here is a psychological thing, right? When I’m talking about intrinsic values, I’m talking about a psychological phenomenon. I’m not talking about some universal truth, and so there isn’t necessarily a wrong answer to what someone’s intrinsic values could be.
Robert Wiblin: Yeah, so what did effective altruists tend to value?
Spencer Greenberg: Yeah, so as you could very much imagine, effective altruists tend to value things around suffering and happiness of conscious beings, but also, about 84% of them said that people I know personally feeling happy is an intrinsic value. It makes a lot of sense. 81% that animals feel happy. 80% that I suffer less than I do normally. So, I think this is really interesting, because if you look at what is unique about the effective altruism community with regard to intrinsic values, like there’s lots of unique things about it, but with regard to intrinsic values I think it really is about how much the effective altruism community values the suffering and well-being of all conscious beings. But if you look at effective altruists individually, many of them at least report having other intrinsic values. In fact, many intrinsic values. So, you get a much broader view that maybe you lose when you’re kinda staring at the community as a whole.
Robert Wiblin: Yeah, were there any non-welfare things that effective altruists were particularly interested in?
Spencer Greenberg: I think that, for effective altruists, quite a few of them said that continuing to learn is an intrinsic value. 40% said that “people I know are able to get the things they want”. That’s a little bit of a preference satisfaction thing. 33% said “that humanity does not engage in immoral acts”. So, that’s kind of interesting.
Robert Wiblin: Interesting. So there they might have kind of a justice conception, so even if injustice is being perpetrated with no effect on welfare, they might be against it. Yeah, so were there any surprising results, maybe ones that made you question whether the methodology worked?
Spencer Greenberg: Well, you know, like I said, definitely take this with a grain of salt. Like, I was pretty surprised by the number of people reporting beauty being a value. And you know, it does make me wonder whether people are fully interpreting it properly. But I think this exercise of trying to figure out your own intrinsic values is a really useful thing to do. I actually see five reasons to try to figure your own intrinsic values out.
Robert Wiblin: Yeah, go for it.
Spencer Greenberg: If you wanna jump into that. Yeah, so the first reason I think it can be useful to figure out your own intrinsic values has to do with what I call “value traps” and trying to avoid them. A value trap is when you associate something with an intrinsic value, because it used to be associated or maybe you just had the false belief that it was and then you pursue the thing without actually getting the intrinsic value out of it. So, an example of this might be maybe when you were young like not having that much money reduced your autonomy. So, you associate money with autonomy. So, you end up getting a career where you make tons of money, but you work so many hours you actually have very little autonomy. But you continue doing it, because you have inertia and you don’t really pay attention to the fact that it’s not getting you the intrinsic value you were seeking. And I think this is actually shockingly common how often we kind of do these things that were like vaguely associated with intrinsic value, but don’t get the intrinsic value out of it.
Robert Wiblin: Yeah, do you think actually people by association end up valuing the thing terminally? So, you wanted to get this job because you thought it would have pretty good consequences, but then it just becomes so hooked in with your mind as something that you desired and worked towards that in fact you do just end up valuing it for its own sake?
Spencer Greenberg: Because it’s a psychological phenomenon, I can’t say for sure that people couldn’t come to value it intrinsically, but I think usually what happens is they just aren’t making a careful distinction between intrinsic and-
Robert Wiblin: Instrumental values.
Spencer Greenberg: Yeah, instrumental values, and so what happens is because they’re not making that distinction, they just keep pursuing the thing ’cause they think it’s valuable, it’s valuable, it’s valuable, but if you actually force them into facts mode, they’re like, “Really? Do you think that money is inherently valuable even if it’s useless?” They’d be like, “No, probably not.” And then they’d start to separate out that value and realize it’s not intrinsic value.
Robert Wiblin: Do you have any examples from your own life where you confused terminal and instrumental value?
Spencer Greenberg: Well, you know, I think that as someone who loves learning, I think it can be easy, and I think I’ve made this mistake, of viewing learning as good in and of itself. And I think learning can actually be kind of a dangerous trap in that way, because there’s something about the wiring of many people where when you’re learning, you feel like you’re making progress, right? But maybe you’re learning something that’s useless. Maybe you’re learning something that’s even wrong. But your brain doesn’t care, right? You’re learning and you’re feeling good. You feel like you’re improving.
Robert Wiblin: It’s stimulation.
Spencer Greenberg: It’s stimulation, but it’s also like, if you’re playing a video game, at least you know you’re not doing something productive. When you’re learning, you kind of tell yourself, “I’m being productive”, because it feels productive. That’s actually the worst kind of non-productive activity, right?
Robert Wiblin: Yeah, that’s interesting. I do know some people who said they value learning for its own sake even if it has no positive consequences.
Spencer Greenberg: Absolutely. And quite a few people in the survey said that. I don’t think that I value it for its own sake. I think I value it because of the effects it has.
Robert Wiblin: Yeah, I’m the same, and I put to them, what if you just had a bunch of true facts that were stored on a hard disk somewhere that no one ever actually plugged in? Like in a sense there’s more like knowledge in the universe encoded somewhere … and I guess it’s very exquisite that it has no effects on any [crosstalk 00:19:19].
Spencer Greenberg: Yeah, I imagine that’s a good intuition pump. And this is the thing about intrinsic values, we have to use these intuition pumps to do these thought experiments carefully, and our intrinsic values will kind of shift as we do thought experiments. And so for example, when you start considering scope, right? You imagine one person suffering, you’re like, “that’s bad”, and you imagine ten people suffering, “that’s worse”, a thousand people suffering, a million people suffering. At first I think people will be like, “Yeah, that’s worse and worse.” But their intuitive feeling will almost be as though a million people suffering is only slightly worse than one person suffering, which is clearly, you know. When you think about it, you’re like, “No, no, no, it’s way, way worse.”
And that is an intuition pump, and I think it leads to a lot of people caring more about suffering and realizing that the value of suffering is almost sort of an unbounded value, in the sense that there can be so much potential for suffering that it can be a value that becomes extremely important. Whereas other ones, like maybe about personal suffering, are more bounded in a way.
Robert Wiblin: Yeah, okay so that’s the first reason to have to worry about this distinction. What’s number two?
Spencer Greenberg: So, number two is helping us plan. So, sometimes we think about something as being the good thing to do, like something we want, but we don’t carefully think about what is the intrinsic value we’re getting out of it, and we make a really kind of inefficient plan to get the intrinsic value. So, for example, imagine that you really want to understand the universe. That’s a kind of intrinsic value of yours and you associate that with being a tenured professor ’cause professors can sit around and think a lot about the way things work, and so you decide to become a tenured professor and you have this really crazy, like, 15 year plan of how you’re gonna get there and you start pursuing it. But if you’d noticed that the main reason you were drawn to that was for the intrinsic value, you may have been able to make a much more efficient plan. Like just spend your free time studying the way things work, and maybe that would’ve achieved most of the intrinsic value without this tenure plan. And this goes to the idea of goal factoring, which is sometimes taught in applied rationality type workshops.
Robert Wiblin: Yeah, do you just wanna quickly describe goal factoring?
Spencer Greenberg: So, the idea of goal factoring is that if you think about how you’re going to achieve a goal and then you think about why you want to achieve the goal, once you’ve broken that down, you can start considering, “Okay, I’m trying to achieve this goal for reasons A, B, and C.” Can I come up with another plan that might get me A, B, and C also? Maybe there’s a better way to do it than my original plan of how to get there.
Robert Wiblin: Yeah, I guess shockingly often you find that you weren’t accomplishing the goal in the most direct way.
Spencer Greenberg: Exactly. Exactly. So, I think that’s one of the benefits of understanding intrinsic values, ’cause the intrinsic values are sort of like the end nodes of your value system. So, they’re like you know, those are the things you’re in some sense trying to get according to your own value system, so if you don’t know what those are, it can be hard to make efficient plans.
Robert Wiblin: Yeah, why do you think it is that humans kind of by design seem to get fixated on intermediate nodes rather than thinking through what they’re trying to accomplish ultimately and then going directly there?
Spencer Greenberg: It’s an interesting question. I think that the human brain just doesn’t clearly differentiate between value and intrinsic value. So, it blurs those two things together, and it kind of makes sense, because … okay, getting food … very, very useful for survival, right? If the fact that it’s not an intrinsic value was demotivating to humans, that might not be great for survival, right? It’s like very important we try to get food. Food is not the end goal, but like, you know, the fact that the brain treats it as an end goal a lot of the time would kinda make sense, right?
Robert Wiblin: Yeah, so a lot of the time it doesn’t matter too much, but then sometimes it causes you to go really astray when you have-
Spencer Greenberg: Yeah, every once in a while it causes you to spend your entire life doing something pointless, ’cause you never clearly separated your intrinsic values from your other values, right?
Robert Wiblin: Yeah, I wonder if … well, one thing would just be that our brains are just not that great and they make kind of random errors, or they just don’t reflect sufficiently. Another one might be that the environment’s changed: our ancestors’ environment was more simple and they didn’t have to spend a lot of time reflecting on terminal or instrumental and intrinsic values, whereas these days our plans tend to be more complicated, have more steps, more intermediate outputs, and so now we need to reflect more at this kind of systems level about whether we’re actually accomplishing the thing that we ultimately want.
Spencer Greenberg: That’s a good point, and I would also say there are a lot of things that cause human behavior that are not intrinsic values, right? Habits, for example. Or reward and punishment through the mechanism of operant conditioning. Automatic biological responses. So, there are many things that drive our behavior. I think of intrinsic values as one of these drivers of behavior. And the metaphor I like to use is intrinsic values are kind of like a beacon shining off in the distance. Like most of the time you’re in your boat and you’re just focused on rowing and you’re trying to dodge the waves that are hitting the boat, but every once in a while you kinda look out and you’re like, “Where am I trying to get to again? Why am I rowing in the first place? Am I headed in the right direction?” That’s the kind of role intrinsic values play.
Robert Wiblin: All right. That’s number two, what’s number three?
Spencer Greenberg: So, number three is that I think understanding your intrinsic values can help you better understand and handle a kind of social guilt that’s pretty common where you know, we all are exposed to the values of other people. You know, our parents growing up, our community, our current friends, and so on, and when their intrinsic values are different than our own, it can create these really weird feelings. It’s like everyone else around you really values this thing, and you don’t really value it that much. Maybe you really value this other thing that they don’t value. And so you start feeling like oh, there’s something wrong with me. I’m an imposter. Maybe I’m a bad person. Maybe you feel guilt.
An example of this might be like you know, your parents expect you to have such and such career, but that career doesn’t get you the things that you intrinsically value, and so, like what’s wrong with you? You know? Why are you a bad child or whatever. And so, I think once you reframe this in the view of intrinsic values, and you’re like, “Wait a minute. So, my parents have these intrinsic values. My friends have these and I have these.” It just helps you understand it and potentially feel much less guilty and just kind of be like, “Okay, this is what’s happening.” I think it can help you relate better to those communities as well.
Robert Wiblin: Okay, yeah. I completely agree. What’s number four?
Spencer Greenberg: So, number four, and this one’s subtle, is I think understanding your intrinsic values can help you avoid a kind of strange doublethink that I think sometimes occurs. The doublethink, which is really related to the last one actually, is when you think that you’re only supposed to have certain intrinsic values, but you actually have others. So, I’ve seen this happen in two ways. The first way is that you believe there’s an objective truth about what’s valuable. Like maybe you’re convinced of some philosophical theory of morality and you think the only thing that is valuable is what that says, like utilitarianism, for example.
The other way I see it happen is the people around you in your social circle only accept, as a community, certain intrinsic values, right? So, you think I’m only supposed to have certain intrinsic values, but you actually have other ones. So what you try to do is you try to recast your other intrinsic values in terms of these ones that are acceptable or valid. An example of this would be when someone says something like, “Well, the reason I cultivate friendship is because it makes me more effective.” I think what’s going on there very often is kind of self-deception where people feel like they’re not supposed to have these intrinsic values, so they kind of trick themselves into thinking that they don’t have other intrinsic values.
This leads to what I think is a really important and subtle point: whatever you believe is objectively true about value in the universe, and whatever you believe are the right values you’re supposed to have according to your social group, those things are independent from what your current intrinsic values are. Your current intrinsic values are like a psychological fact. A scientist could study your intrinsic values and answer the questions. It’s a fact about yourself. It’s not a fact about the universe. I think it’s very important to draw that distinction and say you might believe in objective moral truth and you might believe you figured out what it is, but it doesn’t mean that’s what your intrinsic values are right now. Maybe you aspire to make your intrinsic values match them more closely, but they’re probably not there yet, and if you don’t draw that distinction, you might end up having this very bizarre doublethink where you basically deceive yourself and create these weird psychological effects, potentially harmful.
Robert Wiblin: Yeah, so I think I often hear people say that the reason I’m just having fun or I’m going on holiday, I’m not actually working to improve the world right now is just-
Spencer Greenberg: So I don’t burn out.
Robert Wiblin: Yeah. So, one thing is yes, so you get these explanations like ’cause I don’t want to burn out, so it’ll be like I’ll be able to work even harder later on. Sometimes that’s true, sometimes it’s not. But every so often I hear people say I know, it’s because of weakness of will or it’s like, on some level I wish that I wasn’t doing this, but in practice I can’t actually make myself work all the time or I just actually just don’t care enough about the world to work that hard. But it sounds like you’ve seen a lot of people in your social group just try to trick themselves every time that it’s always for some greater good.
Spencer Greenberg: I definitely have seen this, and I think something some of the EA community is especially susceptible to is this: not drawing clear distinctions between what my intrinsic values are, what I think my intrinsic values should be, and what I think the universal truth about intrinsic values is. Those are different things and you should understand that. And one way to recast that, like, oh, I’m going to go on vacation, is like, oh, I have an intrinsic value of my own happiness and that’s okay. There’s nothing wrong with that. Being in denial of that fact doesn’t make it go away. If you actually want to change your intrinsic values, you still need to know where you’re at and then think about what you want them to be, rather than pretend that they’re already there.
Robert Wiblin: Yeah, I mean, I would say I do, as a brain, value my own welfare more highly than other people’s, and I don’t think that that’s good; nonetheless, I make peace with that and pursue my own interests to a reasonable degree and I don’t feel bad about it ’cause I don’t think that’s gonna help.
Spencer Greenberg: Yeah, and personally, I think it’s healthy to say, “I have a bunch of intrinsic values. Some of them are about helping the world. Let me support those values by devoting a certain amount of my efforts to really doing the best I can at promoting that universal intrinsic value of reducing suffering or increasing well-being.” Or whatever it is. Then I also have these other intrinsic values which I balance against that, you know? So, I’m not willing to like necessarily, utterly destroy myself for my universal intrinsic value, but I’m willing to work really hard for a long time and make it a big part of my life and that kind of a thing. I think that’s the healthy way to balance it.
Robert Wiblin: Yeah, I agree. All right. What’s number five?
Spencer Greenberg: So, the fifth and final reason why I think it can be useful to understand your intrinsic values is I think it can help us when we’re thinking about the vision of the world we want to create, because it’s pretty easy to say like on the margin ways we can make the world better, like if there was less disease, less poverty, less suffering, I think most people would agree, that’s good. But when we actually start thinking about what world do we want to make. Like, in hundreds of years if humanity makes a new world, what do we want that world to look like? As soon as you start trying to describe that world, weird things happen. First of all, if that world is built on just like one or two intrinsic values, a lot of times, that world will sound unappealing, even to yourself. Even if you think that those are your only or prominent intrinsic values. Second of all, it will probably sound even less appealing to other people, and so-
Robert Wiblin: If you don’t share those values or weight them as highly.
Spencer Greenberg: Exactly, and so I think if we’re trying to make a world that generally is broadly appealing, that lots of people will want to be in, and that we ourselves will really want to be in and think of as optimal, we have to really consider multiple intrinsic values and try to build a world that has this complex set of intrinsic values that it supports. Otherwise, you’re building a world for just a small subset of people. And a classic example of this is that if you’re a hedonic utilitarian, you know, by some forms of logic people come to the conclusion that the best world is just hook everyone up to a happiness machine, right? I think a lot of people, even people who think of themselves as hedonic utilitarians, actually don’t think that that’s the best world. Right, like even according to what they think of as their value system. Right?
Robert Wiblin: Yeah, I guess I do or I would put myself on a happiness machine if there wasn’t like anything useful that I could do and I would recommend that to other people.
Spencer Greenberg: What if some people didn’t want to be in it though?
Robert Wiblin: Yeah, so then I think I wouldn’t force them for like pragmatic reasons potentially.
Spencer Greenberg: But not because you value their preferences.
Robert Wiblin: No, that’s right. Yeah. I suppose moral uncertainty is another thing. I think the second thing that I care the most about after welfare is autonomy.
Spencer Greenberg: Well, then so autonomy is a reason not to push people to be in some situation they don’t want.
Robert Wiblin: Absolutely. I’d recommend it to them, but I wouldn’t ever require anyone to, but it’s interesting that if you do have this view that there are objectively valuable things, and that people could be mistaken about what’s valuable, then you might want to go for quite an extreme future or a future that’s not appealing to everyone. Then I guess in reality you’d want to compromise with everyone, because you don’t have like total power over the future and everyone gets more if you compromise rather than fight.
Spencer Greenberg: Absolutely. So there’s the pragmatic reasons to compromise so we can all work together. There’s also moral uncertainty reasons. Okay, so you think this is the only good thing, but are you sure?
Robert Wiblin: Yeah, right.
Spencer Greenberg: Philosophers don’t think so. I think, what, like a quarter of them are consequentialists or something like that? I don’t remember the exact number.
Robert Wiblin: Yeah a quarter or a third or something like that.
Spencer Greenberg: Yeah, so there’s a lot of disagreement about these things. So, I think that’s a really good reason. So, compromise is a good reason to include multiple intrinsic values, moral uncertainty is a good reason to include multiple intrinsic values, but also, if you don’t believe in objective moral truth, yet you still claim that the only thing that matters is, let’s say utility, well, I’m a little confused. Because if you don’t think there’s objective moral truth, in what sense is that the only thing that matters? Well, you’re definitely not describing your brain, because that’s not the way our brains work. Human brains have multiple intrinsic values.
Robert Wiblin: So, it seems like there are some people who don’t believe there are objective answers to what’s intrinsically valuable. And it’s true that in their own life they will weight their own welfare more highly and things like that, just because of, you know, how humans are as a species. But then when it comes to much more abstract cases, or cases where resources aren’t limiting, or once they’ve made their own life and their friends’ lives kind of go in the normal way of humans, which isn’t only considering welfare, then in a much broader picture they’re willing to, I guess, work towards a much stranger world where you optimize only for particular things that they think are valuable in the abstract. I guess for those people it’s not clear that they’re making a mistake. It’s true, yeah, if they think about their own future, what future would they create for themselves personally? It would probably be a lot more human than a happiness machine. But then when they’re thinking about, you know, what should I turn rocks into, it might look more like a happiness machine.
Spencer Greenberg: Yeah, well, you know, the way I explain this is: well, first of all, people can be convinced that certain things are objectively valuable. And if they’re convinced of that, they might try to maximize that thing. I worry about that, because I think, for the reasons mentioned, that it’s potentially dangerous to just focus on one intrinsic value. But if they don’t believe in objective moral truth, then really, what are they doing? Like, in some sense what is there to do other than reflect on your own intrinsic values [in that case], because you don’t believe there’s anything out there, so all there is is to examine what’s here in your mind. When you do that, I think, and I think my data also hints at this, you get really a complex system of intrinsic values. There’s a lot of different things that people seem to value and they include things like people getting what they want, people not suffering, justice and all kinds of other things.
And I think you know so, not all of those intrinsic values will drive people to the same degree, right? One person may be more driven by their own happiness. Another person may be more driven by the happiness of their community and the third person might be more driven by the happiness of all beings, right? So, even if all three of those people actually had the same intrinsic values they might have different strengths of like how driven they are by those values.
Overconfidence
Robert Wiblin: Okay, let’s move on to another study that you’ve done about overconfidence this time. What was that about?
Spencer Greenberg: Yeah, so this is a really new result. We’ve been working on it for a while, but it’s kind of just coming together into a final result. We’re looking at whether we can predict which sorts of skills people tend to be overconfident about when rating themselves relative to others. And we’ve actually done a series of three studies to try to get this result. It’s pretty complicated to try to produce. In one of the studies, we had each person evaluate, out of a hundred people of their own age and gender who live in their area, how many of those hundred people do they think they’d be better than at some particular skill?
So we had them rate a hundred such skills and then we can look, for each skill, at the average rating people give. So if on average they think they’re better than 30 out of 100, that suggests that maybe people will be underconfident at that skill, whereas if on average they rate themselves better than 70 out of 100, maybe people are overconfident at that skill.
Robert Wiblin: Okay, so you were looking at in which cases they were overconfident and which cases they were underconfident?
Spencer Greenberg: Exactly, we want to be able to predict. And so I want to give you guys a little quiz now. I’m going to read you a few skills, and these are some where we found pretty strong findings of either overconfidence or underconfidence. See if you can guess which ones are overconfident versus underconfident.
Robert Wiblin: I actually don’t know the result so I can actually guess blind, yeah.
Spencer Greenberg: Great, okay, you ready?
Robert Wiblin: Yeah, go for it.
Spencer Greenberg: All right. So “knitting a sweater”? People are overconfident or underconfident?
Robert Wiblin: Yeah, I’m going to guess that few people have done that and they’re going to think, “Well I don’t know anything about knitting,” and they’re probably going to be underconfident actually.
Spencer Greenberg: That’s right, we found that they were underconfident. How about “thinking critically”?
Robert Wiblin: I think that too many … almost everyone thinks that they’re more rational or more intelligent than other people so they’re going to be overconfident.
Spencer Greenberg: Yeah, we’re way overconfident so people rating themselves better than about 71 out of 100 people about that on average. Okay, how about “lifting 10 pounds”? So this was a bit vague, we didn’t specify what it means to be lifting 10 pounds, but “lifting 10 pounds”.
Robert Wiblin: I think people will be appropriately confident because they’re just going to be baffled by what this is.
Spencer Greenberg: People rated themselves as better than 71 out of 100 people at lifting 10 pounds.
Robert Wiblin: What does that even mean?
Spencer Greenberg: There is one theory that when there’s ambiguity, people tend to use the ambiguity to their advantage to interpret it in a way that makes them look good. We didn’t necessarily find that as a result, but that is one theory that’s out there in the academic world about how people deal with ambiguity. Here’s another one, “running a marathon”.
Robert Wiblin: I think most people are going to imagine that running a marathon is very unpleasant and that they haven’t done it. On the other hand, they might think that everyone else is terrible at running marathons too. All right, appropriately confident?
Spencer Greenberg: We found people were underconfident.
Robert Wiblin: Under?
Spencer Greenberg: This is, I think, an interesting topic, because a lot of people have heard that people are overconfident, people are overconfident, but actually it’s not universally true. There are things people seem to be underconfident in. We did find that, across these 100 skills, people tended to be overconfident more often than underconfident, but there were still quite a few where people were underconfident, and we thought that was pretty interesting.
Robert Wiblin: Yeah was there any unifying theme for what things people were overconfident and underconfident about?
Spencer Greenberg: Yeah, so we actually looked at what traits of a skill are predictive of people rating themselves as being overconfident or underconfident in that skill and we did this by having people rate the specific … how many people out of 100 they’d be better than at the skill, but we also actually ran another study where we had people do this skill and so we could actually see whether they truly were overconfident or underconfident compared to their prediction.
So we analyzed all that data and we actually looked for variables about a skill that were predictive in both cases. In other words, when people were rating themselves in the abstract relative to other people, but also when they were actually then going to then do this skill and we could actually measure it. We actually were able to find five different traits about a skill that seemed predictive in both cases of whether someone’s going to be over or under confident.
Robert Wiblin: What were they?
Spencer Greenberg: So the first one is how good people think they are at the skill on average. If it’s a skill that a lot of people say they’re good at, then people tend to be overconfident in our data. The second one is whether people feel that how good you are at the thing is a matter of personal opinion. So maybe with writing a novel, how good you are might be a matter of personal opinion, whereas with throwing a dart at a dartboard, maybe it’s a more objective matter of whether you’re hitting the dartboard.
Robert Wiblin: In that case, actually everyone could say that they’re better than average and be right by their own lights, because they have a different sense of what it is to be good at that particular skill.
Spencer Greenberg: Yeah and I think that’s right and that goes to that ambiguity point. When it’s ambiguous, maybe people are using that to say that they’re good at the thing. Third trait we found is how experienced people say they are on average doing the thing. Strangely, potentially, you might think it’s strange, as people reported being more experienced we found that they can be more overconfident in the thing.
Fourth, how much people say that the activity reflects someone’s personality or character. So maybe the skill is making friends and people might think that really has to do with your personality and character so they may be more overconfident. Whereas maybe rolling a bowling ball, maybe people think that’s less to do with your personality and character, maybe they’re not so overconfident.
Robert Wiblin: So if it’s about their personality, then they tend to be more overconfident. If they view it as more a core aspect of who they are.
Spencer Greenberg: Exactly. And finally, how difficult people think the thing is. This is the one that goes in the reverse direction: if they rate the thing as being very difficult, they tend to be underconfident.
Robert Wiblin: Which kind of explains the marathon.
Spencer Greenberg: Yeah exactly. So if you think about the marathon one, people might think it’s difficult, which could cause them to be underconfident, they may not view it as part of their personality or character potentially, so that could explain why they might be underconfident. People are not experienced so that could explain why they’re underconfident. They might say it’s not a matter of personal opinion so that could explain why they’re underconfident and so on. So you see, the marathon one lines up pretty well with a bunch of these traits actually to explain why people might be underconfident.
Robert Wiblin: What about the knitting one?
Spencer Greenberg: Yeah it’s an interesting question. I don’t know. I’ve gone through and for a bunch of the ones that had the most extreme results, scored them on these five traits, and just from that very simple way of doing it, the five traits do a pretty good job at the extremes. I’m not going to say that if you know these five traits of the scale you’re going to be super accurate but they do seem to be significantly correlated with whether people are overconfident or underconfident from our data at least.
Robert Wiblin: I guess, the knitting one people don’t have much experience with it, maybe they regard it as kind of hard because they wouldn’t know how to begin and they also … presumably most people would not be knitting as a core part of their personality. So those things push in the direction of being underconfident.
Spencer Greenberg: Right.
Robert Wiblin: So there’s a pretty big literature on overconfidence and overplacement right?
Spencer Greenberg: There is.
Robert Wiblin: So how does this gel with the broader literature?
Spencer Greenberg: Well, we started out looking at the literature and trying to take traits that it had found, but we weren’t getting the R squared values we wanted. In other words, we took a few traits from the literature that people had said were correlated with whether someone’s over or underconfident, we ran our first study, and we just didn’t get the strong correlations we were hoping for. So then we went back and cast a really wide net. We tried to come up with all the traits we could through brainstorming, and we actually crowdsourced other people’s ideas about what traits might predict over or underconfidence. We came up with a list of, I think, 21 different traits, and then we just ran them all, using large enough data sets that we could really afford to check all of them, and that’s where we got this list of five. So that’s where it comes from.
Robert Wiblin: Must have had a huge sample to have enough … and also a lot of different tasks that … to get enough variety across these five characteristics.
Spencer Greenberg: Well, yeah. In one of our studies, we had 10 different tasks that people were doing, and we actually had them do the task so we could measure how well they performed, and we had them predict their performance relative to others before they did it. Yeah, we used large samples, because otherwise it’s really hard to answer these kinds of questions.
Robert Wiblin: Yeah, what were some of the things you expected to have an effect but didn’t?
Spencer Greenberg: Well, it’s tough because a lot of the things are somewhat ambiguous. For example one of the things we expected to have an effect is how much it relates to your ego. We thought that if you would feel bad about not being good at it, maybe that would affect it and we just didn’t find a very strong effect there but then again, we did find that if it relates to your personality or character … so that’s kind of like being relative to your ego but not quite the same thing.
Robert Wiblin: Yeah, were you running a regression where you did all of these things at once and so maybe one of them was cannibalizing the effect of the other?
Spencer Greenberg: Exactly, maybe once you took into account personality and character, it kind of sapped the power out of ego, or something like that. But these were the five … what we were looking for here is stability. We wanted to find the ones where, through the different studies we ran, we got a consistent effect in the same direction, so it didn’t hinge on the details of the way we did the study, and that’s how we came to these five. So we definitely wouldn’t claim these were the only five; you could also carve up the space of traits differently, right? For any given trait, you might well find other traits correlated with it that you could have used instead, but these five were the robust ones that we were able to uncover.
Robert Wiblin: Yeah, so what are the broader lessons we could take away from this? I suppose one is people are always told that everyone’s overconfident all the time or that’s a simple story, but it’s much more complex than that.
Spencer Greenberg: Yeah, although I would say overconfidence is the general rule and is often true, there are domains that people tend to be underconfident in, and these five traits can actually help you pick out what those domains might be. We’re now exploring building a little tool where you can put in a skill and we’ll help you make a prediction about whether people will be under- or overconfident at it. So hopefully that will come out on ClearerThinking.org in a few months.
Robert Wiblin: I guess there’s this other concept of overconfidence, which I think is more technically called overprecision, which is, if you ask someone, “What’s the population of China? Give me a range that it’s 90% likely to fall into,” they tend to give too narrow a range. Have you studied that at all? Do you know what the results are there?
Spencer Greenberg: Yeah, absolutely, people tend to give narrow ranges on these things, and so that’s another form of overconfidence. It’s actually very confusing in the literature, because people use these words interchangeably but they’re actually different. There’s a third type of overconfidence, which is about your absolute performance, not your performance relative to other people. When it comes to making a prediction interval for something like the population of China being between one number and another, that gets into the idea of calibration training, which is the idea that you can actually learn to make those interval estimates more accurate.
We’ve actually done a bunch of work on this. We’re working on a project where we’ll help train people on calibration and actually we made a tool before which you actually can find on ClearerThinking.org where we give you 30 things where half of them are common misconceptions and half of them are things that sound like common misconceptions but they’re actually true. For each of them you have to say whether it’s true or false and then you have to make a prediction how confident you are. At the end, we analyze it and give you an indication of whether you tend to be over or underconfident and teach you about your predictive capabilities.
Robert Wiblin: Yeah I did that one last year, it’s really fun to pick out which are the misconceptions and which are not and which are the fake ones that you guys have planted in there. Not to brag, I was pretty well calibrated.
Spencer Greenberg: Nicely done. Well you know, a funny thing about that, it was surprisingly hard to figure out what was true with regard to these common misconceptions and we actually made a few mistakes which we only learned over more than a year of people using the tool, we’d occasionally have someone who’s an expert in some really tiny domain be like, “Actually, I’m an Egyptologist and I know that this is not true.” So it took us probably a year to get all the bugs out of there. I think, fingers crossed, they’re all correct now but to me, that was really a lesson in how difficult it can be to figure out the truth about things and we would cite things from sources we thought were super reputable and it turned out they were citing someone, who was citing someone who was citing someone. You just go through the whole chain of citations and it bottoms out nowhere. You know?
Robert Wiblin: Yeah, I guess maybe if you’d had the right answers at the time, my calibration would have been much worse. I guess there is this phenomenon where you get myths and then you get myths about the myths so you get like a mythical refutation of it.
Spencer Greenberg: Oh, interesting.
Robert Wiblin: There’s a thing about spinach I think, like the amount of iron in spinach. So the original myth, I think is that there’s lots of iron in spinach and then there was this mythical refutation at how it was a decimal point error. So it’s like, oh no it’s a myth that spinach has lots of iron and the original one is false for a totally different reason, the decimal point thing is just also itself like an urban legend.
Spencer Greenberg: Oh my gosh, it’s terrible. Truth is really complicated to figure out and these are around things that people don’t generally have really strong burning opinion about, right?
Robert Wiblin: Yeah.
Spencer Greenberg: Think about how bad it is when people have a bias where they really want a certain answer to be right. It’s hard to figure out the truth even when people can be pretty dispassionate.
Robert Wiblin: Yeah I’ll see if I can find out the true story about iron in spinach. It does make you despair if we can’t even figure that one out, then what hope is there for anything else?
Spencer Greenberg: But on the plus side, putting it out there in the world and having people comment on it, I think it eventually debunks things and if you’re willing to listen to people’s opinions, you can make it incrementally more accurate.
Robert Wiblin: Yeah it’s interesting, I guess the internet has made it a lot easier to spread myths but also made it a lot easier, probably to correct them. Or you can find the person in the world who knows the most about this issue in Egypt and he can set you straight, whereas otherwise you just never have a hope of connecting with them.
Spencer Greenberg: Exactly.
Bayesian updating
Robert Wiblin: All right, so I guess this raises the general issue of how you can decide what’s evidence for a claim and what’s not, and how much you should change your belief based on what you observe. You’ve spent a lot of time, I guess, thinking about this question of how to update accurately. Do you want to give an intro to this topic?
Spencer Greenberg: Absolutely, yeah. So this ties into the idea of Bayesianism, which is a probabilistic, mathematical theory of how much to change your beliefs based on evidence. The way I like to think about this is actually using an English language phrase that we like to call the Question of Evidence, and we actually have a module on ClearerThinking.org that teaches you how to use the Question of Evidence to evaluate the strength of evidence. So if you’re interested, go check that out. The way the Question of Evidence works is it asks the English language question, “How likely would I be to see this evidence if my hypothesis is true, compared to if it’s false?”
So let’s say if you got a three to one ratio, like you’re three times more likely to see this evidence if my hypothesis is true than if it’s false, that gives you moderate amount of evidence. If it’s 30 to one, you’re 30 times more likely to see this evidence if your hypothesis is true than if it’s false, that’s really strong evidence. If it’s just one, you’re as likely to see this evidence if your hypothesis is true than if it’s false, that’s no evidence, it actually doesn’t push you in anyway and then if it’s one in three, one third, then that pushes you in the opposite direction, it’s moderate evidence in the opposite direction. One in thirty would be strong evidence in the opposite direction.
So, I think what a lot of people don’t realize is all these equations and so on can be very confusing but there’s the English language sentence which is the only way to say how strong evidence is. That is the right sentence. Other sentences that sound similar, actually are not the right way to quantify evidence.
Robert Wiblin: Yeah, what other expressions do we use that you don’t like?
Spencer Greenberg: Well informally, people often think like, “Oh if this thing seems likely to occur if my hypothesis is true, then that’s strong evidence.” Well not necessarily, because again, you have to say-
Robert Wiblin: It’s the ratio.
Spencer Greenberg: It’s the ratio. How likely is this evidence to occur if my hypothesis is true compared to if it’s not true? So they might leave out the, ” … compared to if it’s not true.” So there’s a lot of ways that our brains can not quite use the right formulation of this and you have to kind of go back to that sentence and say, “Huh, okay but let me go back to the sentence and estimate that.” You’re generally not going to get a hard number. It’s not like you’re going to say, “The number’s actually 3.2,” but you often will have a gut feeling that, “Oh yeah, this evidence is actually quite a lot more likely if my hypothesis is true than if it’s not.”
Robert Wiblin: What’s that called?
Spencer Greenberg: So that’s called the Bayes factor. So the Bayes factor is the likelihood of seeing this evidence if the hypothesis is true divided by the likelihood of seeing this evidence if the hypothesis is not true. That quantifies the amount of evidence but then the question is what do you with the Bayes factor?
Well, it tells you how much evidence you have for or against something, for or against a hypothesis, then you have to think about, well what did I believe before that? That’s called your prior. So you kind of have this prior belief about how much more likely your hypothesis is than not your hypothesis and then you use the Bayes factor to get your new probability of how likely your hypothesis is relative to not your hypothesis.
Robert Wiblin: Okay, so you’ve got your original probability of the claim being true and then you multiply it by the probability of seeing the data if it is true over the probability of seeing that data if it was false?
Spencer Greenberg: Close. Okay, so it seems a little funny to work in terms of odds, but it turns out that the math works much more nicely in terms of odds. That’s why we always do it relative: the probability of seeing this evidence if the hypothesis is true, relative to the probability of seeing it if the hypothesis is not true. That’s kind of an odds ratio of how much more likely something is than something else. Your prior is how much more likely your hypothesis is to be true versus not true before you looked at the evidence. So if you thought it was three times more likely that your hypothesis was true than that it wasn’t true before you saw the evidence, that was three to one odds. Now you get some evidence, and you think it’s 10 times more likely that you’d see this evidence if your hypothesis is true than not true, then you’re going to take 10 times 3, now-
Robert Wiblin: It’s 30 to one.
Spencer Greenberg: 30 to one odds, right. So you start with certain odds and you’re adjusting your odds as you get evidence.
Robert Wiblin: Then you can convert that back to a probability if you like.
Spencer Greenberg: You can convert it back to a probability if you like, or you can just work with odds if you get comfortable doing that.
Robert Wiblin: Yeah, do you prefer using odds or percentages?
Spencer Greenberg: It’s much nicer using odds when you’re doing Bayesian updating, when you’re trying to figure out the strength of the evidence, because the formula works out really nicely. It’s simple multiplication.
Robert Wiblin: Yeah, okay. Do you want to go through a simple example? I guess other than the diagnostic test for breast cancer or something like that?
Spencer Greenberg: Yeah. The classic one. Is there a particular topic you want to do it around?
Robert Wiblin: Trying to think. Elections or soccer matches? I suppose the World Cup’s on right now, I’m tracking that a lot.
Spencer Greenberg: Oh no.
Robert Wiblin: Maybe I can walk you through it. You start out with two teams, maybe that are similarly matched and perhaps you originally think there’s a one in three chance that your favorite team will win, say, I guess … I don’t know, backing Spain or something like that. Okay, so the odds of winning there would then be one to two. And then say in the first minute, for simplicity, they score a goal. How might we update there? I guess then we need to look at … I guess it gets complicated pretty fast, right?
Spencer Greenberg: Right, so then the question you want to ask yourself is how much more likely would I be to see this evidence that they scored a goal if they are going to win compared to if they aren’t going to win and there it’s going to be a subjective assessment based on your intuition about soccer. Someone scoring a goal early in the game, is that much, much more likely to occur if the team ends up winning relative to if they don’t or only a little bit more likely to occur? Whatever, you probably have more soccer intuition than I do and this is where your experience is going to come in.
Robert Wiblin: Okay, interesting. Is it just subjective judgment at that point? I suppose you could have a more explicit model of a lot of the games and-
Spencer Greenberg: You absolutely could, yeah. If you had a data set of all the games played, and let’s say they got a goal in the first 10 minutes, you could look at all the times one team scored in the first 10 minutes before the other team did, and see what percentage of the time they went on to beat that team. That could actually give you this number; you could literally do a calculation around it. But if you’re just hanging out with your friends and trying to see what’s the chance the team you prefer is going to win, you’re going to make a more subjective judgment of how strong that evidence is.
Robert Wiblin: Okay, so let’s say team scores in the first minutes, I’m going to say that I’d be three times as likely to see that if they are going to win as if they’re not, if that’s the right phrasing?
Spencer Greenberg: Yeah.
Robert Wiblin: So it’d be a likelihood ratio then of three to one times by one to two-
Spencer Greenberg: Right because the prior odds were one to two that they would win because out of the three possibilities, only one of them involves them winning, right?
Robert Wiblin: Mm-hmm (affirmative).
Spencer Greenberg: There’s one to two, exactly and then you multiply that by the Bayes factor, which is the new evidence.
Robert Wiblin: So then I’ve got three over two, so we’ve gone from a 30% chance of winning, or 33% chance of winning to a 60% chance of winning based on that goal?
Spencer Greenberg: Yeah.
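For readers who want to check the arithmetic, here are the same soccer numbers worked through in a few self-contained lines of Python; the figures are just the ones assumed in the conversation, not real match statistics.

```python
prior_odds = 1 / 2   # odds of 1:2 that the team wins, i.e. a 1 in 3 chance
bf = 3               # the early goal is judged 3x more likely if they go on to win

posterior_odds = prior_odds * bf                      # 1.5, i.e. odds of 3:2
probability = posterior_odds / (1 + posterior_odds)
print(probability)                                    # 0.6 -> a 60% chance of winning
```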
Robert Wiblin: Yeah, okay. I’ve been aware of Bayes rule for ages, but because I’ve always been thinking in terms of probabilities, I’ve found it very hard to apply it, or I’ve been able to do it reasonably when the odds are close to 50/50 because then you don’t run up against this upper bound or lower bound of zero or 100%. But yeah, with the odds I can do it, when it’s at the extremes as well.
Spencer Greenberg: Yeah, the odds way of thinking about it, although it’s a little annoying to convert between odds and probabilities and you have to wrap your mind around that, it turns out to just be much simpler to think about. So I always like to think of it in odds but I’ll tell you the way that I use the most in life is I have some belief about a thing, I get some evidence and I want to know, “Wait a minute, is this a really … ” Let’s say it’s evidence against my belief and I want to say to myself, “Is this weak evidence or is this strong evidence, or is this kind of moderate?” I like to do that double check of, “Well how likely would I be to get this evidence if my hypothesis was true relative to if it’s not, get a kind of intuitive feeling, is it 30 to 1, is it 3 to 1, is it more like 1 to 1?” And then that sort of tells me intuitively should I downgrade my belief in the theory or not and how much I should downgrade it.
So it’s not so much I’m doing an explicit calculation but it’s really interesting to see that surprisingly often, your brain won’t necessarily automatically do the right thing. Your brain won’t necessarily automatically realize that, “Oh this is strong evidence,” or, “This is only very weak evidence.”
Robert Wiblin: Okay so what are some takeaways from this?
Spencer Greenberg: So this way of thinking, this Bayesian way of thinking, is the formal, mathematically correct way of thinking about how to adjust probabilities when you get evidence. So it’s interesting to ask, “Well, what does it tell us about how evidence works?” One thing it tells us is that evidence is always probabilistic: we start with some probabilistic belief about the world, like a hypothesis with some amount of probability we assign to it, and then we have to adjust that probability. And I think this lesson, while very basic, is one of the most important lessons: that every one of your beliefs could turn out to be false.
Now maybe some of them, you’re 99.99% confident, but that’s not the same as being 100% confident. In fact, one of the things that Bayesianism tells us, this idea of multiplying by the Bayes factor, is that you can never get to 100%. So if you start with like a three to one odds and you start accumulating evidence, you’re going to keep multiplying by different numbers, but it’s never going to go to infinity. You’re never going to get an odd of infinity to one. In other words … and that’s equivalent to saying, “You’re never going to get 100% belief.” So unless you started with 100% belief, somehow when you were born, if you’re a Bayesian updater, you’re never going to end up with 100% belief. So I think this is also an important lesson.
Another lesson I think we can take away from this that’s very important is that if you ignore small amounts of evidence, it can really lead you in the wrong direction. So say you’re very confident, you’re 95% confident something’s true and then you get a trickle of evidence that’s slightly against it, so it says it’s not true, and then you get another trickle of evidence, another trickle of evidence, all saying it’s not true, but none of them are that strong. Well, if you don’t update each time, then you could just say, “Oh well I’m 95% confident, so I’m almost sure this thing’s true and this evidence is very weak so I’m just going to throw it away.” Then the next evidence comes in you’re like, “Oh this is very weak,” and you throw it away. But if you throw it away enough times, actually it could turn out you should now believe the opposite, but because you kept dismissing the evidence because no single packet of evidence was so strong to change your mind, well now you end up with the wrong belief.
So one thing we learn from this is that evidence should really work smoothly, gradually adjusting our beliefs all the time. We get a little more confident when we get evidence in favor, and a little less confident when we get evidence against.
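A quick sketch of why ignoring small updates goes wrong: start at 95% confidence (odds of 19:1) and apply a run of weak contrary evidence, each piece with a Bayes factor of 1/2. The numbers are invented for illustration, but the arithmetic is just repeated multiplication of the odds.

```python
def odds_to_probability(odds):
    return odds / (1 + odds)

odds = 19                # 95% confident, i.e. odds of 19:1
weak_contrary_bf = 0.5   # each piece of evidence is 2x more likely if the hypothesis is false

for i in range(1, 8):
    odds *= weak_contrary_bf
    print(f"after update {i}: P = {odds_to_probability(odds):.2f}")

# By the fifth update the odds have dropped below 1 (19/32, about 0.59),
# so the probability is under 50%: the belief has flipped, even though
# no single piece of evidence was strong on its own.
```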
Robert Wiblin: Interesting. So you mentioned earlier that often when people present evidence against your view, you don’t automatically figure out what the right Bayes factor is. Are there any systematic errors you think people make there, or is it just that humans don’t instinctively think in terms of what’s the likelihood of seeing this evidence if the hypothesis is true versus if it’s false?
Spencer Greenberg: Yeah, one error we talked about is that people might say, “Oh, this evidence is likely if my hypothesis is true,” but not realize it also might be likely if the hypothesis is not true. Another is that sometimes people will get fairly weak evidence and interpret it as much stronger evidence than it really is. They’re looking at how compatible the evidence is with their hypothesis or something, but they’re not actually evaluating the formal strength of the evidence, which is what the Bayes factor tells you. So it can be very confusing.
Another error people make is they don’t take into account the prior probability. So they’re just evaluating the evidence as though they didn’t know anything previously. Example of this that I see my own brain do a lot is let’s say I’m traveling and I’m walking in a foreign city and I see someone that vaguely looks like a friend of mine. My brain will immediately be like, “Oh that’s so and so.” But then I do this mental correction of being like, “Okay, but the prior probability of so and so being in this random city that I’m in right now is so low, even if that looks quite a bit like the person, it’s probably not them,” unless it really looks exactly like them then okay, it probably actually is them.
Robert Wiblin: Whereas if you saw the same thing in the city that you both live, then you’d be much more likely to think it was them.
Spencer Greenberg: Exactly, much more likely.
Robert Wiblin: Because it’s much more likely to begin with.
Okay so this raises a whole bunch of issues that hopefully we’ll be able to get to over the next 40 or 50 minutes. So we’ve discussed here priors, so what’s your pre-existing belief, and then how to update. But we can discuss how often should you stick to common sense, so when should you really believe your prior, or put a lot of weight on it, versus updating based on the things that you see? There’s where does this prior come from? So what class of things should you be considering and including when you’re trying to assess, I guess the base rate of the thing that you’re evaluating actually occurring?
Then there’s how much weight to give to explicit models and theories that people present to you? So if they have some kind of micro-economic model of how something works, under what circumstances should you update a lot versus a little? Then there’s direct empirical evidence, so if you get a new study on nutrition that says that some food is especially healthy or unhealthy, how much do you update on that? Then there’s other kinds of evidence like heuristics that we use, which is a lot of rules of thumb that seem to guide people well in general, even if they’re not explicitly quantitative in form.
Then often at the end, I think, when we’re evaluating a lot of evidence together, or there’s been a particularly strong argument that’s tried to move our belief a lot in one direction, I imagine both of us do error checks, so we think, “Well, if this was true, what would that imply, and does it imply something else that we don’t in fact observe, which would suggest that we’ve made a mistake?” Maybe we can take some of these things in turn.
Spencer Greenberg: Sounds great.
Robert Wiblin: All right, so something I have a big interest in is when should we trust common sense and give a lot of weight to our priors and when should we not? Do you have much of a view on this?
Spencer Greenberg: I think it’s important to note that we all intuitively form beliefs all the time, based on just the things we see around us, and that makes a lot of sense. When we have a direct perception of the world and that gives us a ton of evidence. Where it gets a little sketchier is when there’s things that we’re not directly witnessing over and over and over again. So it’s either something that we witness rarely and we try to make some inference from it, or when it’s something we don’t witness at all and we just hear it through someone else. It’s filtered through other people. Those are the times when our intuitive beliefs start getting a little hairy and we might begin to doubt them.
When it comes to common sense, there’s certain kinds of things you would expect common sense to be effective at. Maybe people’s common sense about how to stay safe, you might expect that humans are pretty good at that sort of thing. But people’s common sense about really difficult philosophical problems, or difficult problems in computer science or something like that, if they’re not someone who’s trained specifically in that thing, I don’t know why you would expect common sense to be particularly good at those kinds of problems.
Robert Wiblin: Yeah, I think that that’s how I think about it, is looking at different fields and thinking, “Do humans get feedback on whether they’re right about these questions?” Like feedback personally and also feedback, potentially through evolution that humans who got these things wrong in the past tended to die more frequently, or at least find out about it and correct it culturally.
Spencer Greenberg: That’s a really excellent point, the feedback issue, and it’s actually really interesting to think about, when can a human learn to do something? Another way to rephrase this is when can we trust our intuition? If we’re in a situation where we do something over and over again, there’s some variability in the domain, but not too much, we get feedback on how we did, but the feedback’s not too noisy and we get a lot of repetition, we can learn to predict all sorts of things accurately.
For example, imagine you’re a psychologist, you see patients every day, one thing you get feedback on is whether the patient gets upset in your office. So you could imagine a therapist actually getting really good at preempting, “Oh this patient seems like they’re about to get really upset.” Maybe getting really good at predicting what they can do to help a patient calm down, because they have very rapid feedback on that. But they might have much less rapid feedback on whether the person’s doing well a year later, or something like that. So that’s a much less tight feedback loop, that might be harder to build their intuition and there they might have to rely more on studies or theories rather than just what their gut tells them about what’s going to make this patient better in a year.
Robert Wiblin: Yeah, okay so how about I rattle off different kinds of questions and you can tell me whether you think you should give a lot of weight to your prior or not? Okay so you’re talking to someone and you’re trying to evaluate how they feel about you, or whether they’re happy or not.
Spencer Greenberg: Yeah, I think here people actually differ a lot in their ability to make this kind of prediction. So really you have to have some self-awareness and say, “Am I the sort of person that’s good at reading social information, or maybe I’m not so good at it?” There are some people that are incredible at this, where they can really read things and people’s faces and people are like, “What? How did you know that I’m feeling that way?” Subtle emotions. Other people are just really not good at it and they can’t read even emotions that would often be obvious to others. So there, a lot of variability individually.
Robert Wiblin: Yeah, I guess it’s true that there’s a huge amount of variability here which is kind of interesting because you’d think evolution would push us pretty strongly to being good at this because people who made big social mistakes, or couldn’t read other people would have been at a big disadvantage in social situations in the ancestral environment.
Spencer Greenberg: You’d think so. It could be that some ways of being low social skill could be associated with other positive benefits, it’s a possibility, but it’s tough to say for sure.
Robert Wiblin: It’s also just a super difficult problem. Which could explain, yeah, why people make mistakes. I guess the people who are very good at this are reading tiny cues, computers can’t do this at this point even though they’re able to see things and replay them and analyze them in great detail.
Spencer Greenberg: Although increasingly, machine learning algorithms are beginning to be able to read facial emotions. So for example, I saw this machine learning system where it had people watch advertisements and then machine learning would try to actually measure their emotion throughout the ad and then see, “Oh, people are getting excited and now they’re feeling surprised,” or whatever, that kind of thing.
Robert Wiblin: Each I expect they’re going to be beating us pretty soon. All right, philosophy.
Spencer Greenberg: So I think philosophy is some of the hardest stuff out there and our intuitions are just not well honed. Often philosophers use intuition in their arguments, this is sort of an interesting debate. A lot of philosophers will acknowledge this and say, “Yes, intuition is part of our argumentation technique.” Other philosophers, I think a minority of them, deny this and say actually they don’t use intuition. Then there’s an interesting debate of is a philosopher’s intuition about philosophical problems actually well-honed?
I think you can go back and forth on this, and one reason to potentially doubt whether you can have a well-honed intuition on these kinds of things is that it’s not clear we ever find out the right answer. Sometimes we find out this thing’s the wrong answer, because we find it’s inconsistent, or there’s a problem, but rarely do we get to say, “Oh yeah, we solved that part of philosophy, now we know that we were right on that.”
That being said, some of philosophy is really about trying to figure out things like your own values or what things mean to you and there, you really are reflecting on your own internal systems and if you’re doing that, if you’re trying to … like our discussion of intrinsic values, if you consider that part of philosophy as figuring out your intrinsic values, well you have no choice but to use your intuition because your intuition is the system that tells you what you value. So you’re out of luck figuring it out another way.
Robert Wiblin: Yeah, okay, cooking?
Spencer Greenberg: As someone who’s a terrible cook, I don’t trust my intuition in cooking at all, but when you cook a lot, you certainly develop intuition, no question. You watch people who are really good chefs, they’re not reading the recipe, they’re just like, “Oh I think it needs a little more of this, a little more of that,” and they’re probably right because they’ve cooked so many recipes and tested them throughout the process of cooking them, that their gut system is very, very good.
Robert Wiblin: Okay, macroeconomics.
Spencer Greenberg: Oh, so macroeconomics is really tricky because I think a lot of it is actually very counterintuitive. Where people will kind of expect a certain thing to happen and because of weird second order effects or because of the way incentives work or just subtle things about supply and demand, it won’t work the way you expect. So I think actually a lot of times, our intuitions are just not useful in that domain.
Robert Wiblin: Yeah, I think they’re actually worse than random in that case, which is interesting.
Spencer Greenberg: They might be. There seems to be all these surprising ways where the first order effect is a certain way and so we just have a really hard time believing that that’s not the way that the final effect goes. You know what I mean?
Robert Wiblin: Yeah, oh you mean like on a small scale it’s one way and on a big scale, it’s actually almost the reverse?
Spencer Greenberg: Exactly, it’s like, “Well if you pay one person more money, clearly that’s good for them, so why don’t we force all companies to pay people more money?” Then you’re like, “Wait, but if every company was forced to pay everyone more money-”
Robert Wiblin: Wouldn’t that just raise the prices of everything and then it cancels out?
Spencer Greenberg: Yeah, so you get these really counterintuitive effects when you start trying to take your local intuition about everyday life and globalize it to a whole economy.
Robert Wiblin: Yeah, I think it’s often called the fallacy of aggregation. All right, what’s a different domain? Human social psychology, the kind of thing that you’re studying.
Spencer Greenberg: So human social psychology, that’s a really interesting one because I think for a lot of us, we actually have pretty good intuition about a bunch of things about psychology. Certainly not all things, but a bunch of things. Most people could tell you, if you described a story, they could tell you will someone be sad if that happened to them, or angry and people will be pretty good at that. There’s many social things about the way people relate to each other where we’re pretty good predictors.
So that actually raises the bar pretty high, because if you’re a psychologist trying to discover some new thing about psychology, you’re competing against people’s pretty well-honed intuitive psychology detectors. Not only are they pretty well-honed, but people are getting feedback all the time, like, “Oh, I mispredicted my friend, and now my friend’s angry at me,” that kind of thing. That being said, there certainly are some findings in psychology that people would not have predicted, that you wouldn’t just expect to be true automatically.
Robert Wiblin: Yeah. I guess it seems like pretty often those things aren’t replicating super well, so I wonder whether in fact our intuition was better all along. A lot of these psychology results got a lot of attention because they were surprising, because they went against our intuition, like the idea that subtle things about the environment can change our behavior a lot. Inasmuch as they’re not being replicated, maybe actually we just had a good sense that no, those things don’t matter so much to begin with.
Spencer Greenberg: Yeah. It’s certainly true that a bunch of findings haven’t replicated and a lot of people were really surprised by. There’s one that I want to talk to you about in particular which is that of power posing.
Robert Wiblin: Okay. Yeah, yeah.
Spencer Greenberg: So as many of you may know, there was a really famous TED Talk, I think one of the most viewed TED Talks of all time, about this idea of power posing: that adopting certain postures can make you feel more powerful. So imagine the posture that Superman might adopt, that kind of thing. And what happened is there were a bunch of critiques of that research that came out, and then people tried to actually replicate the study.
As far as I know, there were six preregistered replications, where people trying to replicate it said in advance, “Here’s the method we’re going to use, here’s the process,” then they went and did it and tried to replicate it. And I believe, if I recall correctly, that by the standard criterion for statistical significance, p less than 0.05, four of them did not replicate even the effect of people feeling more powerful, and two of them did replicate the effect of people feeling more powerful.
But in the original study, they claimed not just that it makes you feel more powerful. They also claimed that it changes your cortisol levels, it changes your risk-taking behavior, and so on. And those effects, as far as I know, really did not replicate. So there’s this really interesting thing that’s happened, where all these people have now come and attacked the original research, saying there were flaws in it: “This stuff doesn’t replicate. Power posing is fake, stop doing it before you go on stage or before a meeting.” But the irony, to me, is that I actually think power posing works.
Robert Wiblin: Oh wow, okay.
Spencer Greenberg: And I’ll tell you why I think that.
Robert Wiblin: Yeah, hit me.
Spencer Greenberg: First of all, I happen to be a person that I think is very affected by my body posture. And so when I change postures, I actually can notice, like a fairly palpable effect. If there’s a large change in posture. So, it was very strange to me when the power posing stuff didn’t replicate that I’m like, “Wait, but I can just literally do an experiment on myself where I change my posture, I change to another, I feel an effect. I change to another.” So as someone who directly perceives that effect, I find it very strange.
That being said, attempting to be a good skeptic: maybe I’m deluding myself, maybe I’m confused. So I went and ran a study that is the size of all the preregistered trials put together, n=1000, and I preregistered it as well. I tried to see whether there’s a mood effect: does power posing increase your mood? We’re still working on analyzing all the results, but the top line result is that, yes, we found a mood effect. Doing power poses seems to increase people’s mood, and it seems to also increase their feelings of power.
There’s a data scientist who I’m friends with, who actually said he would go re-analyze the results, see if he agrees with us. So he’s checking them, see if he thinks we did the analysis properly. But the combination of this data, plus my first hand experience just alternating between different poses, really suggests to me that actually power posing might work. And maybe the critiques were accurate in the sense that they were accurately finding flaws in the original research, but maybe they actually misled people into thinking that this method doesn’t actually work.
Robert Wiblin: So why do you think those replications mostly didn’t find these effects that you’re finding? Are you measuring something different?
Spencer Greenberg: Here’s a really interesting thing. So, as far as I know, there were six preregistered replications. Two of them found an effect at P less than 0.05, four of them didn’t. Now, is that the pattern you’d expect if power posing didn’t work?
Robert Wiblin: Maybe it has like small effects? Is that the answer?
Spencer Greenberg: That’s what I think is going on. I think that what’s happening is power posing has small effects. It’s subtle. It’s not like profound change your life. It’s a subtle effect. And so I think what’s happening is that this pattern of replicate, didn’t replicate, replicate, didn’t replicate. To me that suggests a sort of relatively small effect. Those studies weren’t big enough to reliably detect it.
And actually, someone went out and did a Bayesian meta analysis, trying to combine all of the evidence from the six studies, and they concluded that it does actually have the effect. So I’m not the only one that thinks that. Now, here’s another thing about this. I suspect that people vary a lot on this dimension of how much body posture affects their mood. And so basically, what I suspect is that for some people, it actually has no effect on them, other people it’s kind of like a really small effect, and then some people it’s actually quite a large effect. And I think I’m in the kind of large, tend to be a larger effect group.
And so maybe that’s probably also why this is confusing because some people are utterly convinced by their own experience that this is totally useless, and other people are like, “What are you talking about? I can do this, and I actually feel a mood boost that seems significant to me.”
Robert Wiblin: Yeah, I was going to say my prior on this being true was pretty low. And then I think even when that study came out, because of all of the publication bias, it wouldn’t have updated me very much in favor of it. Maybe I think there’s a 5% chance that this is true. No, that’s a bit unfair. Start with a 10% chance, then the study comes out, then I inch up to 15% or something like that.
Spencer Greenberg: So what’s your update on the six pre-registrations, two of which seemed to find a statistically significant fact, and then my N equals 1000 study that found an effect? Apply Bayesian updating.
Robert Wiblin: Yeah. I think I would’ve moved upwards on it having some effect, like a non-zero effect. But maybe it’s shrunk on it having a very large effect. We’re kind of narrowing it down to something that’s a bit above zero but not a lot.
Spencer Greenberg: Yeah. And I think that’s a very reasonable way to look at it. Now, if I’m taking the devil’s advocate perspective you could say, “But maybe it’s a placebo effect.” And there it’s actually really interesting. It’s almost philosophical. What exactly do we mean by the placebo effect?
Robert Wiblin: The whole thing we were going for kind of is a placebo effect, right?
Spencer Greenberg: Yeah, in a certain sense. If we care about making people feel more powerful, the placebo is actually one mechanism by which it could do that. It wouldn’t imply that it’s not useful, it would just imply that the mechanism is believing that you’re gonna feel more powerful makes you feel more powerful. What would be really bad is if it was actually a reporting bias, in fact. In other words, people actually don’t feel more powerful but for some reason they report feeling more powerful when they’re in that posture. Because then you wouldn’t actually be producing the effect at all.
And, if I’m playing devil’s advocate against our own research, when we looked at people … At the end of the study we asked people, “Do you believe that body posture can affect mood?” And for people that said No, they got a much weaker effect than the people who said that Yes. But-
Robert Wiblin: But maybe they just know themselves.
Spencer Greenberg: Exactly. Maybe this really is a trait that there’s a high degree of variability and people whose body posture does affect mood have at some point in their life realized that. So, they’re like, “Yeah, it affects mood,” and then it also does actually affect their mood. It’s interesting to get in the weeds of this for continued analysis. I’m looking forward to getting this research out there.
Reference class forecasting
Robert Wiblin: Okay. A way you can try to be more robust in coming up with your prior, or I guess in generating a prior, other than just applying common sense, is reference class forecasting: looking at similar cases and seeing how often something is true, on average. Do you want to describe an instance of that?
Spencer Greenberg: Yeah, absolutely. Suppose you’re trying to decide: Is my friend who I invited to dinner gonna flake out on me? You could imagine actually if you were really bored writing down the last 20 times you saw your friend and which ones of those times they flaked. And then you could use that and say, “Okay. Well, three out of the 20 times, they flaked,” then you can calculate a probability of flaking from that. The thing that’s cool about that, despite it being kind of laborious and boring, is that there’s some evidence it produces better estimates than our intuitive judgment of, “Would this person flake?”
Part of the problem with our intuitive judgment is it can be affected by a lot of things that are not that relevant, like recency. It is true that the most recent time you saw your friend is probably the most relevant, but it’s not necessarily way more relevant than three times ago; yet the most recent time might stick in memory a lot more, and you might overweight it intuitively, relative to three times ago, in terms of how likely this person is to flake.
Also, just reference class forecasting in general, it’s a nice way to get that initial prior. So, maybe you say, “Okay, three out of 20. That’s my prior probability. But I have reason to think that my friend is less likely to flake this time for such-and-such reasons, so now I’m gonna adjust that. I’m gonna reduce down the chances of them flaking.” So you can start with this prior and you update on the other things you know. These other pieces of evidence you haven’t yet used in producing your prior.
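A tiny sketch of that two-step process, with invented numbers: a reference-class prior of 3 flakes out of 20 dinners, then a Bayesian adjustment for evidence specific to this occasion. The Bayes factor of 1/4 is made up purely for illustration.

```python
# Step 1: reference-class prior from past dinners.
flakes, dinners = 3, 20
prior_p = flakes / dinners                 # 0.15
prior_odds = prior_p / (1 - prior_p)       # ~0.18, i.e. about 3:17 in favour of flaking

# Step 2: update on evidence specific to tonight, e.g. they confirmed this morning.
# Suppose that confirmation is 4x more likely if they are NOT going to flake.
bf_for_flaking = 1 / 4
posterior_odds = prior_odds * bf_for_flaking
posterior_p = posterior_odds / (1 + posterior_odds)
print(round(posterior_p, 3))               # ~0.042 -> roughly a 4% chance of flaking
```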
Robert Wiblin: Yeah. So the standard story (I guess I’ve been reading Kahneman’s ‘Thinking, Fast and Slow’) is that the reference class is an outside view: you’re taking a broader, slightly less personal picture of the situation. Then you’ve got the inside view, which is your personal perspective on it, or your view on this specific instance. So here, the most outside view would just be how often people show up to dinner in general, across everyone in the population. Then you can narrow it down to people you know: how often do all of your friends in general show up? Then your specific friend, which is narrower than the most outside view. Then you could say, “What about my friend in the last year?”, which is a reference class that’s even more constrained.
Spencer Greenberg: We think of this in mathematics as a bias versus variance trade-off. As you zoom in on a more and more relevant subset of the data, like “this particular friend, just in the last year”, what happens is it gets less biased, because it’s really about the thing you care about, but there’s higher variance because you have less data. Maybe you only saw the friend six times in the last year, so you only have six data points; that’s not very many. You could broaden that to the last two years, and now maybe you have 14 data points, but maybe your friend has changed their personal habits. So the further you go back, you get more data, so you have less variance, but you have more bias in the data set. Ideally, if you were a perfectly rational agent, you’d be trying to find the point that optimizes the trade-off between bias and variance.
Robert Wiblin: I’ve heard this called the Reference Class problem of like, “Well what reference classes should you look at and how should you weight them?” Do you have any advice on how you should weight them?
Spencer Greenberg: Well, I think that is the way to think about it: how much bias am I adding versus how much variance am I removing? If you have a very small data set, you’re probably willing to add some bias to get more data. If you only have five data points, maybe you wanna add bias if that’s gonna get you up to 30 data points; that’s gonna be a lot better. If you already have 300 data points, you probably don’t wanna add more bias, because you probably already have plenty of data and adding more bias is just gonna make your estimate worse. That’s if you really only wanna pick one reference class.
Another approach is that you could say, “Let’s produce multiple reference classes and then we’re gonna weight them by how relevant they are.” An example of this is: let’s say you’re running two advertisements for a product and you wanna know, “Did advertisement A work better than advertisement B?” Well one way to look at that is how many clicks on each ad occurred. That is good because you have a lot of data but it’s bad because someone clicking doesn’t mean they’re gonna go buy your product. You could also look at how many people actually bought the product. Well that’s good because it’s what you care about but it’s bad because you only have a tiny fraction of the amount of data.
So one way to think about this in that case would be … let’s say you were able to calculate that, on average, every 50 clicks leads to one purchase. One way to think about that, heuristically or intuitively, is you could say, “If an ad got a purchase, we’d count that as being worth 50 clicks.” So we could take the number of clicks, plus the number of pseudo-clicks, so 50 times the number of purchases, and that’s the score for ad one, and then we could do the same for ad two. So that’s a way of combining different reference classes, essentially, and weighing their different strengths and weaknesses.
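Here is that scoring rule as a few lines of Python, with made-up click and purchase counts; the 50-clicks-per-purchase conversion is the one assumed in Spencer’s example.

```python
CLICKS_PER_PURCHASE = 50   # estimated: on average, 50 clicks lead to one purchase

def ad_score(clicks, purchases):
    """Combine two reference classes: raw clicks plus purchases counted
    as 'pseudo-clicks' worth 50 clicks each."""
    return clicks + CLICKS_PER_PURCHASE * purchases

print(ad_score(clicks=400, purchases=3))   # ad A: 550
print(ad_score(clicks=450, purchases=1))   # ad B: 500
```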
Robert Wiblin: Yeah. Just coming back to the flaking example. Let’s say that you’d met your friend for dinner 20 times ever and they’d shown up every single time, so zero percent flake rate-
Spencer Greenberg: Clearly then it’s impossible for them to ever flake in the future.
Robert Wiblin: Right. Exactly. So that’s obviously not right and I think that the reason is that there are other reference classes like how often do people flake in general; that should get some weight in there as well. So maybe you say one reference class has 100% chance they’re gonna show up but then, out of people as a whole, 20% of the time they flake and maybe you wanna take something that’s in between those two.
Spencer Greenberg: Yup. That makes a lot of sense.
Robert Wiblin: Perhaps the most famous example of reference class forecasting is with people predicting how long it’s gonna take them to finish things. If you can describe that?
Spencer Greenberg: Yeah. So very often when people are doing long, complex projects, they underestimate how long they’ll take. They’ll also often underestimate how much it will cost, how many resources they’ll use. There’s a bunch of theories around why this is. It’s commonly called the Planning Fallacy. One theory is that, when you’re thinking about a long, complex project, you know on some level that some things will go wrong, but it’s very hard to know what will go wrong. It’s gonna be sort of idiosyncratic. So your brain kind of smooths over and says, “Well, this thing is probably not gonna go wrong and that thing’s probably not gonna go wrong,” and so each individual thing, you kind of assume it’s gonna go right. But of course, there’s a good chance something will go wrong that you never even thought of.
Robert Wiblin: Yeah. I think trying to estimate how long things will take is a case where I basically only use the outside view, and try to just completely ignore the inside view, because it seems so unreliable and biased.
Spencer Greenberg: I think one way to use the inside view, if you’re thinking about planning a big project, is to use it to try to break the project down into as many pieces as possible. You say, “Given what I know about this project, I think it’s gonna involve this and this and this and this,” and once you’ve kind of decomposed it into these small pieces, which are easier to estimate, you could use, for example, reference class forecasting on the individual pieces, which I think potentially might be more reliable, unless you have a really good reference class for the whole thing.
If you’ve done projects very much like this one many, many times, great. But very often that’s not the case. If you’re writing your second book, you may only have written one book. Sure, that’s better than zero books, but it’s not necessarily a wonderful reference class. Maybe this book will take three times longer because of something about this book.
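One way to picture that decomposition, as a rough sketch with invented numbers: forecast each piece from whatever small reference class you have for that piece, then add them up.

```python
# An illustrative sketch (all durations invented) of decomposing a project and
# reference-class forecasting each piece from similar past pieces.
pieces = {
    "outline":     [2, 3, 2],        # weeks that comparable past pieces took
    "first draft": [10, 14, 12],
    "revisions":   [6, 9],
    "production":  [4, 8, 5],
}

def median(xs):
    xs = sorted(xs)
    n = len(xs)
    return xs[n // 2] if n % 2 else (xs[n // 2 - 1] + xs[n // 2]) / 2

total = sum(median(durations) for durations in pieces.values())
print(f"Estimated total: {total} weeks")   # likely still optimistic overall
```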
Robert Wiblin: Yeah. In that case, you could probably look at how often people finish books in general. Yeah, you’d give a bit of weight to all these things, so I guess you-
Spencer Greenberg: Exactly.
Robert Wiblin: … probably you should have originally started that with how long does it take people in general and how often do they finish, and then you update a bit that you’re maybe better than average if you did manage to finish your first book and do it on time.
Spencer Greenberg: Kind of updating that prior. We actually have a module on clearerthinking.org about how to help fight the Planning Fallacy and we actually teach you reference class forecasting. So if you’re interested, then check that out.
Robert Wiblin: Yeah. This kind of outside view stuff can make you a little bit cynical … Maybe cynical is not quite the right word. Pessimistic? Because people often have so many delusions about how their own life is gonna go that it’s just extremely atypical. They tend to think that their relationships are gonna go on forever, they think they’re probably gonna get the jobs that they’re applying for, at least some of the time. They think they’re gonna finish their tasks. I do often hear people say, “Oh yeah. I’m planning to do this thing during my summer,” and I’m like, “No you’re not.” I don’t always say that. But I’m thinking in my head … Most people who attempt something that ambitious over summer don’t finish, so typically you’re gonna do less than, or the same as, they do.
Spencer Greenberg: Yeah. I think it can be demoralizing sometimes if people take the outside view. Let’s say you’re running a startup and you say, “Well, if nine out of 10 startups fail, why do I think mine could possibly succeed?” And I think one way I like to think about that is reframing it from a single, discrete thing to a process you’re running. You’re like, “Okay. Yes. This exact incarnation of this thing I’m doing has a high probability of not working. But if I’m willing to stick with this for years and iterate and learn and get better each time and pivot as needed,” that whole process has a much higher likelihood of success. Taking this kind of bigger view that it’s not like the world ends if you don’t do this exact thing that you’re trying to do.
Specific kinds of evidence
Robert Wiblin: Yeah. Okay, so let’s move on from reference class forecasting to thinking about more specific kinds of evidence. One kind of evidence that people are often confronted with is explicit models or theories that people have about how the world works. For example, I studied economics and you’d be presented with a specific model of say Supply & Demand or asymmetric information that people are claiming applies to a specific case. In what situations do you think people should update a lot based on models and theories, and in what cases should they be more skeptical?
Spencer Greenberg: Well I think it’s really important to understand, as the famous quote says, “All models are wrong. Some models are useful.” And every model is limited in the sense that it will sometimes make missed predictions. So it’s very useful to go out and absorb a bunch of different models about the way the world works, but you also have to absorb to some degree what domains the thing applies in. If you think about, for example, Newton’s equations of physics, those are really accurate in a lot of cases. But there are some cases where they are actually gonna really mis-predict what is happening. So it’s really useful to learn about those, but also just know that, if you’re in a certain situation, it’s not gonna work as well.
Another thing I think about models, that can be really valuable to think about, that people underestimate, is that trying to form your own models is really valuable, even if those models will sometimes, maybe often, be wrong. Because what happens is … So, you have some experiences. If you don’t try to form a model, then you may have trouble ever generalizing to a lot of different cases. But if you force yourself to try to think of a model like, “Okay. What’s my understanding of why this thing keeps happening? Or what’s my understanding of this situation and how do I predict these situations in the future?” Then, suddenly, you have something that can be refuted. Now, first of all, when you go into the situation again, you have a prediction because you have a model that you can draw on, a kind of explicit theory.
You can also then say, “Oh. I mispredicted that.” If it was a really bad mis-prediction, maybe I should start thinking about whether there’s a flaw in the model. Maybe I can make it better. Once you actually force yourself to have a model, you can start proving yourself wrong and you can start improving that model and making it better. What’s really cool about that is then you can actually teach it to others. If you only have an intuitive model, it’s very, very hard to give it to anyone else. But once you have it more explicit, you can share it with the world and help other people have more accurate predictions.
Robert Wiblin: Yeah. So a common issue that people raise with models and theories and calculations is that they can often lead to quite extreme claims that then rely on the model being true. And I know a lot of people in effective altruism who tend to be very skeptical of any claim that relies specifically or especially on one model or theory of the world.
On the other hand, there is the risk that, if that theory is correct and it makes a very strong claim or says that one thing is especially important, and you’re just generically skeptical of all of them and you don’t like updating based on a single piece of evidence, you could miss out on some really big opportunities. What do you think about that whole, general debate?
Spencer Greenberg: Yeah. I generally think it’s a good idea to try to rely on multiple models when possible. For example, imagine it’s a situation where you have a bunch of people independently making a forecast for a thing. It’s generally found that averaging their predictions together tends to outperform most of the individuals in the group. This makes perfect sense because not knowing who’s better, if we kind of a priori assume each of them is equally likely to be as good as all the others, why would we pick one person’s over the others? But if they’re very independent or somewhat independent predictions, they might have some biases that go in opposite directions. So by averaging them together you kind of maybe cancel out some of that bias and you also reduce noise potentially, by averaging them together.
I think the same thing goes for models. If you have multiple models that make a prediction about a thing, why not look at what each of them predicts? Then you can start thinking about averaging their predictions, or at least asking, “Well, in this situation do I have some reason to think that one model would be more accurate than another?” Rather than just relying on kind of one model, which I think tends to produce very bad predictions.
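A toy sketch of why averaging can help (the forecasters, their biases, and the noise levels are all invented for illustration): independent errors partially cancel, so the average typically beats a single forecaster.

```python
import random

random.seed(1)
truth = 0.30          # the quantity everyone is trying to forecast
trials = 10000
single_err = average_err = 0.0

for _ in range(trials):
    # five forecasters, each noisy and each with a small idiosyncratic bias
    forecasts = [truth + random.uniform(-0.05, 0.05) + random.gauss(0, 0.10)
                 for _ in range(5)]
    single_err += abs(forecasts[0] - truth)                      # one forecaster
    average_err += abs(sum(forecasts) / len(forecasts) - truth)  # the average

print(f"typical individual error: {single_err / trials:.3f}")
print(f"error of the average:     {average_err / trials:.3f}")
```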
Robert Wiblin: Okay. So then probably the other key piece of evidence that people get is empirical measurements. So perhaps you might see a study that claims that chocolate is particularly good for your cardiovascular health or something like that. Do you have any comments on that kind of evidence and when you should update?
Spencer Greenberg: Yeah. So I think first of all, in my view, it’s very useful to always kind of try to have a prior saying, “Well, okay. I saw this study on chocolate for cardiovascular health. Did I have an opinion about that before? How likely would I think that would be before I saw the study?” I think that can be clarifying. And then you think, “Okay. Now that I saw the study, should I be much more confident?” Again, that goes back to the Question of Evidence. How likely am I to see this evidence of this study result given the hypothesis that chocolate actually improves cardiovascular health versus if it doesn’t?
Then, that’s gonna bring in your view on, “How reliable are studies? How reliable are studies in this particular domain?” And so on. So studies are just one form of evidence, and I think that’s easy to miss. I think that some people just don’t care about science. They reject it, which I think is a big mistake. But other people think that science is the only way to answer questions, which is also, I think, a mistake, because there are other forms of evidence.
For example, if you walk into your apartment and there’s a stranger standing there, that’s very strong evidence that there’s a person in front of you. Sure, you might be hallucinating, but it’s very strong evidence. It’s not scientific evidence, it’s not a repeatable trial, it’s not like you did a study. But it is worth not discounting these other forms of evidence, and trying to integrate them. So you’ve got these study results. You have to integrate that with the other evidence you have. Maybe your intuition … For example, let’s say you’re evaluating a psych study that had some really counterintuitive psychology result. You have to combine that evidence from the study with your intuitive psychological model of the way humans work. Really, those are two models of the situation. You wanna synthesize them.
Robert Wiblin: Yeah. So I guess if you’re reading a paper, you kind of want to estimate the Bayes factor that this paper presents to you, which I suppose would go up if the sample was bigger or if the methodology was stronger, and it would go down if you thought that the authors were untrustworthy. Or perhaps if there was a bias in what things got published and what things didn’t. Or if only the most exciting results got published and negative results didn’t.
Spencer Greenberg: Exactly. Yeah, all those factors. You can kind of make a list of a bunch of these factors. If the p-values are really small, that should tilt you a little bit more towards believing it, compared to if the p-value is p=0.04, where actually that’s a little bit less evidence potentially. Or maybe that actually could be evidence that they did some funny tricks to get it just below p=0.05.
Robert Wiblin: Yeah. I guess most listeners are gonna be familiar with those points, but maybe it’s just clarifying to think that you’re trying to estimate this Bayes factor, which is the probability of seeing this paper if the theory or claim were true, relative to the probability of seeing it if it was false. And all of these other things that we look at, the reason that they matter is that they change that Bayes ratio.
Spencer Greenberg: Exactly. So suppose you do this thought experiment. How likely would I be to believe the psych result before I saw this paper? Let’s say you think it’s wildly unlikely; you’re like, “One in 100.” Then you see this paper that seems fairly convincing. Well, is the Bayes factor … It’s fairly convincing. Is the Bayes factor 10? Because that would go 10 times one in 100 would only get you to 10 over 100, so the odds would only be 10 to 100. So it’d still be very, very likely to be false. Not as likely as before, but still very likely. Or is the Bayes factor more like 100? If the Bayes factor was 100, then you’d go from a one in a 100 odds before. Multiply that by 100, you get 100 to 100. So now it’s 50-50. It’s like sort of, “What did I think before?” And then trying to adjust for the evidence.
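Spencer’s mental arithmetic here is the odds form of Bayes’ rule: posterior odds equal prior odds times the Bayes factor. A minimal sketch in Python (it treats “one in 100” as a probability of 0.01, which at small values is essentially the same as 1-to-100 odds):

```python
def update(prior_prob, bayes_factor):
    """Posterior probability from a prior probability and a Bayes factor,
    via odds: posterior odds = prior odds * Bayes factor."""
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1 + posterior_odds)

# The "one in 100" prior on the surprising psych result:
print(update(0.01, 10))    # ~0.09 -- still probably false
print(update(0.01, 100))   # ~0.50 -- now roughly a coin flip
```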
And also, I think that there is a debate about the extent to which papers really provide strong evidence. But, even if you take the view that a lot of studies are flawed and this kind of thing, one thing that’s cool about studies is they raise hypotheses to attention. The number of hypotheses possible is so insanely large, there’s no way in your lifetime you’ll ever consider even a tiny fraction of all the hypotheses. So a paper can raise a hypothesis to attention that you would never have thought about before, and then you can start to consider, “Do I think this is true?” Maybe you can go gather other evidence about whether it’s true or not.
Robert Wiblin: Another kind of evidence that’s worth mentioning is heuristics, or kind of qualitative frameworks that you can use to estimate things. These days when we’re trying to prioritize the world’s problems we use a quantitative framework where we stick numbers on things and try to estimate specific things like, “How many people are affected and how much?” But before we did that, we had a qualitative framework where we were just kind of scoring different problems on things like, “How big were they in scale? How hard were they to solve? How many people are working on them?” And it was more just like High, Medium, Low. Sometimes you just have to do that, because you can’t really attach a numerical measurement to something, or it’s just too difficult. But nonetheless there can be a lot of informational content there.
I guess another case where I think qualitative frameworks can work quite well is with hiring, for example. If you’re considering hiring someone, a lot of people advocate this kind of heuristic or qualitative rule. It’s like, “Are you excited about them?” If you’re not excited about hiring the person then basically you shouldn’t. I think that’s capturing something where maybe you can’t put this on a spreadsheet, because it’s capturing some gestalt judgment or some overall judgment that otherwise you might miss out on, and you need to update based on that as well. Do you have any comments on that?
Spencer Greenberg: This goes back to the point about … The human brain is a kind of predictive machine, and so all the data we gather is fed into this predictive machine. So one way to reframe all this is not, “Is this heuristic accurate?” Or, “Is this model accurate?” Or, “Is the study accurate?” But: if I think of myself as a predictive machine, am I a better predictive machine when I have these tools? I think that’s where heuristics can be really useful: they can be a quick way of making a judgment that kind of adds on to our predictive machine in a particular domain.
Going into hiring, I think that’s a really interesting example because I think that people are often very biased in hiring; they often misjudge the quality of evidence. For example, I think that people often think that resumes are much more predictive than they are. I have a personal experience around this where one of my employees and I were thinking about hiring someone else to do the same role that she was doing and basically we both judged a bunch of resumes by rating them by how good we thought they were for the job. Can you guess what the correlation was between our ratings of the resumes?
Robert Wiblin: Zero?
Spencer Greenberg: It was basically zero.
Robert Wiblin: Yeah.
Spencer Greenberg: Yeah. Which was shocking at the time, but I think I’ve really come around in thinking that … It’s not that resumes contain no information. Certainly they contain some information but I think that a lot of people overestimate the amount of information they contain. And another thing about that kind of qualitative judgment people make when they’re hiring is I think it can also be overly influenced by things like how good you feel interacting. So that’s heuristic of like, “Do I feel excited?” Maybe some of that excitement is just like you like the person. Not that that’s completely irrelevant; it is nice to like the people you work with, but maybe it’s not as important as you feel it is intuitively. That being said, I think there is a role for qualitative heuristics in some cases.
Robert Wiblin: Yeah. I think people often advocate for them as kind of an error protection. Very often, by having a whole bunch of different rules of thumb, or just general descriptors that you put on something, it avoids you giving too much weight to any one piece of evidence that you can define very clearly and put a number on.
Spencer Greenberg: Yeah, because sometimes when we put a number on a thing, we tend to then try to maximize that number, and often that leads to weird, bad results where you’re losing the bigger picture of what you actually care about.
Identifying mistakes
Robert Wiblin: Okay. So let’s push on. After you’ve kind of produced an estimate or a probability judgment, it’s worth trying to compare it to something else, or to get a sense of how it fits into the broader picture of other beliefs that we have, to see whether we’ve made a mistake somewhere. And I think one of my favorites is I once posted something on my Facebook about the value of the Dutch East India Company or something in the 16th or 17th century, which said that it was an extremely large number. I didn’t do any checks to see whether this was accurate or not because it seemed like it came from a credible source. But someone pointed out that this would mean that the Dutch East India Company was basically as valuable as all wealth in the world at the time.
So they found something that you could compare it to, which was the total amount of wealth that everyone had, and so they were like, “Is it plausible that the Dutch East India Company represents 100% of that? Not really.” So that gave us strong evidence that it was false. And then when we chased it up it just seemed to be one of these cases where there’s this citation wormhole where no one knows where the original claim actually comes from and there’s no evidence for it. Do you have any examples of interesting cases where you’ve managed to catch an error by checking the answer against something else?
Spencer Greenberg: Well, I think one really useful procedure can be basically having a rundown of, “What are the sorts of mistakes that I might be making in this particular case?” And actually we built this tool called the Decision Advisor on clearerthinking.org that walks you through making a big life decision. And our idea was: bring the bias training to the moment of the decision, so that as you’re making the decision we’re saying, “Oh, is this an emotionally difficult decision? Is this a decision where you might be suffering the Sunk-Cost Fallacy?” So we’re kind of bringing it to that moment.
And I think the same goes for, if you’re making a prediction about the world, you could ask yourself questions like, “Okay. Do I actually have a sufficiently large sample size to be making this kind of estimate? What is my prior here? Am I ignoring my prior? Is this an unreasonably large estimate, like in your example?” So you can kind of run down the common set of mistakes that people make, and of course you may not catch them all, but I think this could potentially increase your predictive abilities.
Specific examples
Robert Wiblin: I kind of wanna move onto going through some specific examples of things where we can perhaps start with our priors and then update on a basic look of things-
Spencer Greenberg: [crosstalk 01:39:06] Let’s try it. Let’s see if we can do it.
Robert Wiblin: Yeah. I had a few examples.
Spencer Greenberg: Alright.
Robert Wiblin: At the risk of being political, which I guess could make us irrational or could get people very excited: what do you think are the odds that Trump personally colluded with Russia, in the sense that he gave instructions to someone to pass on to Russia to tell them what to do during the election? Have you thought about this at all?
Spencer Greenberg: You know, I have not tried to estimate the odds of this-
Robert Wiblin: Okay, yeah.
Spencer Greenberg: … but we could walk through how one might try to think about it.
Robert Wiblin: Right. It’s almost better. So I guess … What’s the prior here? So …
Spencer Greenberg: Yeah, so let’s talk about different reference class forecasts you could make. One reference class would be: how many presidents do we think have colluded with foreign powers?
Robert Wiblin: I think it was probably zero percent or really low.
Spencer Greenberg: [crosstalk 01:39:50] Probably really … I would imagine really low. That could be a starting point, but another thing that we could think about is: are there examples of Trump specifically? We could try to make a reference class from him. Are there examples of him, specifically, making, let’s say, self-serving backroom deals? I don’t actually know the data on that; whether it’s known whether he’s done that before in other domains. But if you did know that, then maybe that would give you a higher starting prior on whether he would have done it with Russia.
Robert Wiblin: I guess you could also just look at how often people do schemes like that in general, not just presidents specifically. Or how often have presidents committed crimes or done things like that … So you’ve got all of these other somewhat similar cases that go into your overall judgment. To be honest, I think that caused me to think that it was pretty unlikely, actually, because my gut feeling, at least when people first started talking about this, was that although Trump is pretty impulsive, would he be this silly? It’s quite a remarkable claim, I suppose, that someone running for president colluded with a hostile foreign power. So my prior starting out was low, 10% maybe.
Spencer Greenberg: I think one thing that happens here is that for someone who really, really doesn’t like Trump, I think there’s a temptation to agree with anything bad about him. So you have to know and be introspective about: What mode are you in? Are you in the mode of bashing him with your friends? Or are you in the mode of actually trying to figure out the truth about the world? And if you’re in the mode of actually trying to figure out the truth of the world, you have to separate out what you want to be true-
Robert Wiblin: From what actually is true. [crosstalk 01:41:24] What can you support?
Spencer Greenberg: … from what actually is true. And it’s quite difficult to do that when you’re very impassioned about a subject.
Robert Wiblin: Yeah, okay. And then, I guess the main thing that I have updated on since then is this case of his son meeting with a Kremlin contact at Trump Tower. That was the first time that I started taking this potentially seriously, because I would have said that that was very unlikely because it seems fairly outrageous.
Spencer Greenberg: Right, so if we applied the Question of Evidence to that: How likely do you think it is that that would happen if the hypothesis that he colluded with Russia is true compared to it’s not true?
Robert Wiblin: [crosstalk 01:42:00] 80%? Seems, yeah …
Spencer Greenberg: It does seem like the Bayes factor is reasonably large. It seems quite a lot more likely to happen if he did collude with Russia than if he didn’t. Would you agree with that?
Robert Wiblin: Yeah. I think yeah. If I knew that they had colluded, what would be the odds something like this would have happened? Or I guess that I would see that had happened. Maybe 80%? The seeing bit-
Spencer Greenberg: Then you have to compare that to the odds of it happening if he didn’t collude with Russia.
Robert Wiblin: I guess, yeah. To be honest, I haven’t put numbers on this, so maybe this is helpful. But 20 or 30%? So maybe it’s giving me a Bayes factor of three or four?
Spencer Greenberg: Yeah. I might even put it a little bit higher than that, but yeah.
Robert Wiblin: Interesting. Okay. Then if it was originally 10%, then what would it be? That’s a one to nine ratio; now I’m at four to nine or three to nine. So it’s gone from 10% to roughly 25 or 30%. Interesting, okay.
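Running those numbers through the same odds arithmetic (a sketch; the 10% prior and the Bayes factor of three or four are just the guesses made in the conversation):

```python
def update(prior_prob, bayes_factor):
    prior_odds = prior_prob / (1 - prior_prob)       # 10% -> 1-to-9 odds
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1 + posterior_odds)

for bf in (3, 4):
    print(f"Bayes factor {bf}: {update(0.10, bf):.0%}")
# Bayes factor 3 -> 25%, Bayes factor 4 -> ~31%
```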
What about the probability that the US and China go to war in the 21st century? There’s kind of a famous reference class forecast for this, which is, from memory, I think that nine out of 13 times that there’s been a big transition in which country is the most militarily powerful in the world, they have gone to war.
Spencer Greenberg: That’s a terrifying reference forecast.
Robert Wiblin: [crosstalk 01:43:12] It is a terrifying … Yeah, a lot of people want to talk about this. I’ll stick up a link to a book that discusses this; it has a name that escapes me right now.
So I guess if you were just gonna take that reference class, you’d say there was a 70% chance or something like that that they would go to war. But I think there’s good reasons not to only take that reference class.
Spencer Greenberg: Yeah, do we have any other reference class that we can use for the forecast?
Robert Wiblin: Yeah. I suppose the broader reference class would just be: How often do two random countries go to war in a century? Much lower than that. What would be a more narrow one?
Spencer Greenberg: Well there’s-
Robert Wiblin: What’s the chance that two great powers go to war in a given century? It’s probably somewhere in between those two.
Spencer Greenberg: And as we talked about before, as you narrow in, you’re gonna get less and less data but it’s gonna be more and more relevant to the case, so you’re gonna try to balance more data but less relevant against less data but more relevant.
Robert Wiblin: I think the other reason that you definitely wouldn’t wanna take just this probability is that there’s been kind of a regime change since this last happened, which is that nuclear weapons have been invented which seems to have reduced the probability of war in general. So you can’t just look at all these historical case studies, many of which are centuries ago when the world was very different.
Spencer Greenberg: Because with nuclear weapons, it becomes much worse for both parties to go to war.
Robert Wiblin: Then I guess day-to-day, I guess you see these diplomatic spats between countries. There’s current issues around trade between the US and China. And I guess we can always just ask this question: What’s the Bayes factor? So what would be the likelihood of seeing the US and China arguing about tariffs if they are gonna go to war during this century versus if they are not?
Spencer Greenberg: Right.
Robert Wiblin: Seems like often it’s just gonna be pretty close to one; it’s not moving a lot, is it?
Spencer Greenberg: [crosstalk 01:44:41] Yeah. It’s not gonna move a lot, I think, in that case. This actually raises an important point which we haven’t touched on yet which is that making sure that you don’t double evaluate the same evidence. You don’t double count. For example, let’s say you get one piece of evidence about tariffs; China and the US squabbling over tariffs. And then you get another piece of evidence about them squabbling over some other issue. Maybe those two issues are actually very related, and the fact that one happened actually makes the other very likely to happen. Even if one of them had a pretty large Bayes factor, then it actually may mean you shouldn’t update on the other one because it’s almost the same piece of evidence.
Robert Wiblin: Yeah, that makes sense. Okay, yeah, what things might update us a lot? I guess if you actually saw them engaged in a proxy war, that seems to, in the past, be very predictive of countries going to war more generally.
Spencer Greenberg: Threats of war, certainly.
Robert Wiblin: Right, yeah. Then, you’ve got a serious [crosstalk 01:45:30].
Spencer Greenberg: We’re seeing with North Korea, right?
Robert Wiblin: Yeah, yeah. Yeah, let’s talk about North Korea. What do you think are the odds that North Korea will give up its nuclear weapons? I guess we’ll start with the prior as always. Yeah, what kind of reference class are we looking at?
Spencer Greenberg: Yeah. Are there any historical examples of countries giving up nuclear weapons?
Robert Wiblin: I think South Africa gave up its nuclear weapons. Libya, I think, gave up its program to develop them.
Spencer Greenberg: All right. There are a couple.
Robert Wiblin: There’s a couple, yeah.
Spencer Greenberg: There are quite a few nations that have nuclear weapons.
Robert Wiblin: And didn’t give them up, yeah.
Spencer Greenberg: And didn’t give them up.
Robert Wiblin: It’s like maybe one or two out of 10, something like that. Okay, so it seems unlikely. Then, I guess, should we move onto specific things about this case?
Spencer Greenberg: Yeah. I think one thing about the North Korea case is: does having nuclear weapons serve a very important strategic purpose for them? If it doesn’t, if there’s not really any reason not to give them up, we’d think they would be much more likely to give them up. Whereas if there’s a really strong reason to not give them up, then maybe they won’t.
Robert Wiblin: Yeah. I guess we’re slightly rushing through this, but my guess is that they’re not going to give them up. That’s partly because most countries that have gotten nuclear weapons haven’t given them up. Also, they seem to have very strong incentives to keep them. As long as they have nuclear weapons, or as long as we think that they probably have nuclear weapons, then we’re much less likely to threaten them. They can even save money on other military things.
Spencer Greenberg: Yeah. You’d imagine it gives them great negotiating power, and also may greatly reduce the chance of invasion. Who wants to invade someone with nuclear weapons, right?
Robert Wiblin: Yeah. I mean are there any arguments in favor of them giving it up that we should consider?
Spencer Greenberg: Has there been any news stories that we should update on?
Robert Wiblin: Okay, yeah. They’ve said that they are going to give them up, or they keep talking about this. Yet, I place almost no weight on this.
Spencer Greenberg: Yeah. We can ask the Question of Evidence, how much more likely would they be to say that they were giving them up if they actually are going to give them up than if they’re not? I would say a bit more likely at least.
Robert Wiblin: Yeah, interesting. Do you want to put any numbers on it? Try to give a factor.
Spencer Greenberg: I would say probably three to one.
Robert Wiblin: Three to one, interesting. Okay.
Spencer Greenberg: Do you think it’s less than that?
Robert Wiblin: Yeah, I think so.
Spencer Greenberg: Keep in mind, if they weren’t going to give them up, why would they say this particular random thing? Do you see what I’m saying? The fact that they chose to say this particular thing, it does seem, in a world where they’re not going to give them up, it would be a weird thing to go out and say.
Robert Wiblin: That’s the thing. I think that if they weren’t going to give them up, then they would have reason to say that they’re going to do it anyway. Such that, even if they were going to give them up, it’s almost impossible for them to communicate it. This would be my model: they’re trying to suck us in to try to get concessions from us, get us to do things that they want, try to make us less worried so we stop pressuring them so much. My model is that they’re going to string us along and claim to do it. They may not even have decided whether they’re going to, but it serves their interest to say this basically no matter what.
Spencer Greenberg: That makes sense. Although that seems to be just one strategy among multiple they could have run.
Robert Wiblin: That’s true, yeah.
Spencer Greenberg: So how confident are you that they would have run that strategy? Assuming they’re not going to give them up, how likely would they have chosen that strategy instead of another, right?
Robert Wiblin: Yeah, okay. Yeah. Maybe I’m being … Yeah, it’s true. The view from inside my head is like, “Just don’t listen to a word that they say.” Maybe I’m wrong about that model of how they’re operating.
Spencer Greenberg: Yeah. I would argue that’s not quite the right model. The right model, to update on evidence, is the Question of Evidence.
Robert Wiblin: Yeah, okay.
Spencer Greenberg: In a world where they’re not going to give them up, you have to say, “Well how likely would they be to use this particular strategy?”
Robert Wiblin: I see.
Spencer Greenberg: I mean it doesn’t seem shockingly unlikely, but maybe there are at least two or three different strategies they could have used, maybe only one of them involves pretending they’re going to give them up or something like that.
Robert Wiblin: Yeah, okay. Obviously we could spend a lot more time talking about this. Maybe we’re starting with 10%, then we’re updating downwards because we think it doesn’t make sense for them to give them up. Then, we’re going to update back upwards because we think, well, at least they said they would. That’s some information.
Spencer Greenberg: Some information.
Robert Wiblin: Go from 10 to five to 15 or something like that. Yeah. Are there any other cases that you have interest in talking about? I’ve got some others here.
Spencer Greenberg: Go for it.
Robert Wiblin: Okay, yeah. Yeah. Someone wrote in and was very interested to hear your view on dietary advice. They’re saying, “I have a degree in physiology, but am still baffled about what meta-analyses or experts I should trust whenever it comes to dietary advice.” Typically, they say, they just stick to: vegetables are generally good, lots of sugar isn’t that good. They don’t think that there’s a whole lot else that we know. What’s your view?
Spencer Greenberg: Yeah. I think dietary advice is a really interesting question. I think part of the problem here is that I’ve come to believe that humans are much more different from each other than is maybe generally acknowledged.
Robert Wiblin: Yeah.
Spencer Greenberg: In other words, the fundamental question of should you eat X or should you eat Y is maybe a malformed question. Maybe some people should eat X and some people should eat Y. Maybe it depends on their biology, maybe it depends on their behavior, how much they exercise or what environment they live in. Maybe, in some sense, we’re asking questions that are too simple. The problem is it’s very hard even to ask the more complex questions and answer them.
In order to actually really study this stuff in a really rigorous way, we’d often need to randomize people to get different diets for long periods of time, and make sure they actually eat those diets. That’s very expensive, it’s very difficult. How do you even ensure that someone’s eating the right diet that they claim? It’s possible to do this, but it’s rare.
Furthermore, if you don’t randomize people, let’s say, you just observe, “Oh these people eat spinach and these people eat meat. Who is healthier?” The problem is if you’re not randomizing who eats spinach, well eating spinach is probably also correlated with eating kale and eating carrots and so on. How do you know it’s the spinach, not the carrots or kale, that’s having an effect on their health? I think that this field in particular, it’s very hard to really update that much on the evidence.
Robert Wiblin: Yeah, yeah. That’s kind of my view as well, with a handful of exceptions. All of the evidence people seem to present for particular foods being especially healthy or unhealthy just doesn’t seem terribly convincing. It doesn’t have a big Bayes factor. So I’m mostly just sticking to my prior. I guess what’s my prior? I guess my prior is that most of these specific claims just aren’t true. Most things just don’t have that much effect on people’s health. Even after I see all this evidence, I still don’t think that we know all that much. We don’t know that many specific claims.
Spencer Greenberg: Yeah. With sugar, one case against it is that some studies indicate that it causes all kinds of bad health effects and so on. Maybe you don’t believe those studies that much, maybe you do. At best, it’s just empty calories. At best, it has no nutritional value.
Robert Wiblin: Yeah.
Spencer Greenberg: For sugar, the range is from “no nutritional value, empty calories” to “it’s actually, very specifically, bad for you.” There’s not really much upside in sugar except taste.
Robert Wiblin: Yeah. There’s not many people arguing in its favor.
Spencer Greenberg: Yeah. It tastes good.
Robert Wiblin: I suppose, yeah, that matters. I guess the case that I mentioned, where we both agree that we do have pretty strong evidence, is specific nutritional deficiencies, where you can observe someone having very bad ill health in a specific way that is consistent across lots of people. Scurvy is the classic case. What’s the Bayes factor when, every time someone doesn’t consume vitamin C, they have all of these same symptoms?
Then, as soon as they take vitamin C, the symptoms start to go away. We can just replicate it again and again. The Bayes factor is enormous: 100, 1,000. That’s the kind of case where it’s like, “Yes, I would have had a very low prior on this specific claim that vitamin C does all these things.” I might have started with 1% or 0.1%, but now I’ve had a 100- or 1,000-fold update in favor of that being true.
Spencer Greenberg: Personally, I’ve had a very demoralizing experience of looking into different supplements, where at first it looks like the evidence is quite strong and maybe the supplement is really helpful. But the more you investigate and the more you read the studies, the stories don’t really hold together that well, to the point where it’s actually pretty hard to find supplements that a healthy person who doesn’t have any specific issues should just be taking every day.
One possible exception is that there does seem to be some evidence that older women especially should potentially take vitamin D. I’m not going to say I definitely, absolutely believe that, but there are meta-analyses of randomized controlled trials that indicate this is helpful for older women. There aren’t a lot of results like that where, when you really dig in, you’re like, “Oh yeah. This is definitely the case.” Yeah.
Robert Wiblin: Yeah. My view on almost all of this stuff is it might be true, and it’s not that expensive. There doesn’t seem to be a downside that’s so large.
Spencer Greenberg: Yeah.
Robert Wiblin: So you end up giving it a go. In this specific case, probably not going to help that much.
Spencer Greenberg: Yeah. One of the things I find really interesting is when someone switches diets and has a seemingly huge effect. Now, they used to eat whatever. Now, they became vegan and, suddenly, their arthritis that they had for 10 years seems to go away. Now, of course, in those cases, we can be skeptical. We can say, “Well are they sure that they’re remembering properly? Did they really have arthritis for 10 years? Maybe they had it only for two months.”
Robert Wiblin: Yeah. Maybe these things just go into remission anyway at times. Yeah.
Spencer Greenberg: Right. We can actually use a Bayesian way of thinking about this.
Robert Wiblin: Yeah.
Spencer Greenberg: We can say, “Well how likely … ” Let’s suppose that we knew the details of the case were correct. They had bad arthritis for 10 years. They switched diets and then, within two weeks, it went away completely. Let’s say it’s never gone away before in the 10 years. We can begin to think about it in a Bayesian way.
If we know those were the effects, we can say, “Well, how likely would we be to see this evidence of it going away in two weeks if our hypothesis that the diet change was the cause of it going away was true, compared to if it’s not true?” If you think about two weeks divided by 10 years, if it had never gone away before, that’s actually a pretty strong Bayes factor. It’s a really strong Bayes factor.
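A rough sketch of that timing argument in Python (the “diet effects show up within two weeks” likelihood is an assumption made purely for illustration):

```python
WEEKS_OBSERVED = 10 * 52                 # ten years of arthritis

# If the diet did nothing, a remission was (generously) about equally likely
# to start in any two-week window of those ten years.
p_given_no_effect = 2 / WEEKS_OBSERVED   # ~0.004

# If the diet really was the cause, assume the improvement would very likely
# show up right after the change.
p_given_diet_worked = 0.5                # assumed

bayes_factor = p_given_diet_worked / p_given_no_effect
print(f"Approximate Bayes factor: {bayes_factor:.0f}")   # on the order of 100+
```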
Robert Wiblin: Yeah. If it happens that quickly, although your prior on the diet getting rid of the arthritis just might be very low.
Spencer Greenberg: Might be very low.
Robert Wiblin: Maybe it’s a 1% or a one in 1,000 chance you had before. Then, you get this update that’s like, what?
Spencer Greenberg: Potentially very strong update.
Robert Wiblin: Potentially very large, yeah. Okay.
Spencer Greenberg: It’s interesting to think about: if you had something that’s been very consistent for a very long time, and then you do something purposely with the hypothesis that it’s going to change that thing, and then that thing very rapidly changes, you can have huge Bayes factors that could be convincing at n=1, without a randomized controlled trial, that it actually worked, at least for that person. It doesn’t mean it will generalize to other people.
Robert Wiblin: Yeah. It’s interesting that the rapidity matters so much, because it reduces the probability of a false positive quite a lot potentially.
Spencer Greenberg: Exactly.
Robert Wiblin: Yeah, okay, so just trying to estimate another thing, the odds of Trump being re-elected say. That’s one where we have pretty explicit models and lots of data, such that our estimates can be fairly robust, at least compared to some of these other things that we’re trying to estimate which are more like individual unique cases. I think it seems like the things that we have the best modeling for, and the best probability judgments for, are sports results and the weather. Then, elections are not quite as good as that because the data set is not quite as large.
Spencer Greenberg: We do have prediction markets, which are interesting potential priors, say, “Well if we actually have betting markets where people are putting their money on this, maybe we start with the odds of the betting market as our prior.”
Robert Wiblin: Yeah, okay. Let’s say that we didn’t have the betting market, because that’s almost cheating. They’ve done all the work for us, although listeners at home probably should just use it. That’s what I usually do myself. I guess what we can do is look at the polling data and then ask, “When the polling has looked the way it does now, what fraction of the time has one side won versus the other?”
Spencer Greenberg: Mm-hmm (affirmative).
Robert Wiblin: Of course, then, if you only look at presidential elections, you’ve got a fairly small sample. They only happen every four years. Also, many of them will be quite a long time ago, when the situation might have been different. Nonetheless, you’ve got a dozen that seem pretty relevant. You could also try to be more fine-grained and look at the time series of the change in the vote difference between the two parties. You get a sense of the variance over a year, of how rapidly these things evolve.
Spencer Greenberg: Mm-hmm (affirmative).
Robert Wiblin: So you’ve got explicit time series models. I guess that’s the more explicit quantitative side. Then, people also will apply their own judgment, I suppose, to questions like: do we think that this person is likely to do a good job? Do we think they’re a good campaigner? It seems like most experts, or most people who I trust on this, think that people used to place way too much weight on that kind of personal qualitative judgment about the situation.
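As a very rough sketch of that time-series style of reasoning (every parameter below is invented; this is not a real election model): start from today’s poll margin, let it drift with historically typical volatility, and count how often the candidate ends up ahead on election day.

```python
import random

random.seed(2)
current_margin = -2.0       # candidate trailing by 2 points in today's polls
weeks_to_election = 80
weekly_volatility = 0.7     # assumed typical week-to-week movement, in points
polling_error = 3.0         # assumed gap between final polls and the result

trials, wins = 20000, 0
for _ in range(trials):
    margin = current_margin
    for _ in range(weeks_to_election):
        margin += random.gauss(0, weekly_volatility)   # random drift each week
    margin += random.gauss(0, polling_error)           # election-day polling miss
    wins += margin > 0

print(f"Estimated win probability: {wins / trials:.0%}")
```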
Spencer Greenberg: Right. I think sometimes something breaks, some scandal, something like that, and people’s probability judgments change dramatically. Then, next week, everyone’s forgotten about it.
Robert Wiblin: Yeah. They’re not considering the outside view of, yeah, how often does a scandal like this actually change the results in the long-term?
Spencer Greenberg: Right, right. Obviously a sufficiently bad scandal might really wreck someone’s career. If it’s a minor thing, well a lot of times, people just eventually forget. They get over it, whatever.
Robert Wiblin: Yeah. Do you have any comments on this kind of case, of how people ought to treat it? I suppose you think it’ll start with the prediction market and then adjust from there.
Spencer Greenberg: Yeah. It’s interesting, can you beat the prediction market? It might be pretty tricky. If it’s a really liquid market where you have professionals who are spending all day long trying to beat the market to make money, it might be really hard to beat it. As an amateur, you probably should just go with the probability they gave.
Robert Wiblin: Yeah.
Spencer Greenberg: Let’s say it’s a market that’s not that liquid. They’re not having many people trading it. They’re not professionals, they’re just people with gut judgment. Then, you can say, “Okay. Here’s what the prediction market says. This is the odds it gives. Do I have reason to think that the prediction market might over or underestimate in this particular case?” Like, “Do I think maybe … ” There are a lot of anti-Trump people who are playing this prediction market who might be letting their personal biases, what they want to be true change their probabilities.
Robert Wiblin: Yeah. I guess let’s apply a Bayes factor to a specific poll that comes out.
Spencer Greenberg: Sure.
Robert Wiblin: If we get a poll of a generic Democrat versus Trump for the presidential election right now, what would be our Bayes factor? One, basically. Nothing more or less. I suppose we should have already absorbed all of the other polling information that’s come out. And we’re so far away from the election that, if it was a surprising result, we probably wouldn’t trust it anyway.
Spencer Greenberg: I think if it was sufficiently strong. If Trump just seemed to be getting creamed, even by people who are traditional conservatives, that would be pretty interesting. Then, maybe you’d start to think, well maybe this … But it probably wouldn’t be huge. Probably wouldn’t be a Bayes factor of 40, because there’s still so much uncertainty about how things will pan out.
Robert Wiblin: Yeah. If you got a really surprising result that was inconsistent with all of the other polls that were done in the last week or the last month, you might just think that they made a mathematical error. If it was surprising enough to move it, given that one poll is so small in the scheme of things, then …
Spencer Greenberg: That’s a great point. I think about this with studies in psychology. As the effect size of the study, the strength of the effect goes up, my Bayes factor goes up. My probability that this is a real result goes up. If it’s a really small effect, maybe it’s just noise or is also not that useful, but as the effect size goes up, I tend to believe it more.
Then, after some point, as the effect size gets too large, I actually start believing it less again. Then, it seems unlikely that you’d actually have such a crazy strong effect. You get this thing where your probability goes up and then it goes down again. Then, at some point, you’re like, “Yeah. Maybe that’s a fraud. Or they really messed something up somewhere.”
Robert Wiblin: Yeah. Maybe an easier example is, let’s say, you have a sense of how warm the room is. Then, you go and look at the thermometer. Let’s say that I thought that was 20 degrees. Then, I look at the thermometer and it says it’s 25. It would be like, “It was warmer than I thought.” If it says it’s 30 degrees, it’s like, “Wow. That’s really quite surprising. Maybe I was wrong about the temperature.” If it says it’s 100 degrees, you’re just like, “Well I’m not updating on this at all. Clearly that thermometer is totally broken.”
Spencer Greenberg: Right. One way to think about that is that you’re saying, “Well what’s the probability I get this evidence of the thermometer being like that, given that the room is actually this temperature and that I assessed it being this temperature?” You can have this additional thing that you’re conditioning on, which is your own assessment of the temperature. Once you’ve taken that into account, now the thermometer seems more likely to be broken. If you didn’t have your own way of assessing, then all you would have to update on is the thermometer.
Robert Wiblin: Yeah. Do you just want to break that down? Yeah, how does it look differently when you have your own judgment as well?
Spencer Greenberg: You’re saying, “How likely am I to see that the thermometer says it’s 3,000 degrees in my house?”
Robert Wiblin: Yeah. Given my sensation and the idea of the room being that temperature, yeah.
Spencer Greenberg: Now, you’re saying, “Well it actually seems like … The most likely hypothesis is actually that it’s broken. I’ve conditioned on my own assessment, which is not going to be that confused, right?”
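A toy version of the thermometer case (all of the numbers are invented): weigh “the thermometer works and the room really is that hot” against “the thermometer is broken”, conditioning on your own sense of the temperature.

```python
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

my_sense = 20.0         # what the room feels like, in degrees
sense_error = 3.0       # assumed accuracy of my own temperature sense
p_broken_prior = 0.01   # assumed prior that the thermometer is broken
broken_density = 1 / 200.0   # if broken, the reading could be almost anything

def p_broken(reading):
    # Likelihood of this reading if the thermometer works and my sense is
    # roughly right, versus if the thermometer is just broken.
    p_works = normal_pdf(reading, my_sense, sense_error) * (1 - p_broken_prior)
    p_broke = broken_density * p_broken_prior
    return p_broke / (p_broke + p_works)

for reading in (25, 30, 100):
    print(f"reading {reading:>3} degrees -> P(broken) ~ {p_broken(reading):.2f}")
# 25 degrees: almost certainly fine; 30: starting to doubt; 100: almost certainly broken.
```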
Robert Wiblin: Interesting. Yeah, I guess, so you see with these prediction markets that they tend to move very gradually. Yeah. Individuals seem to want to jump their probabilities around a lot. They tend to not move and then suddenly move a lot when they see something that somehow crashes through into their beliefs. Yeah, at least liquid prediction markets, they tend to just creep up 1%, down 1% each day as little bits of new information come in and people slightly change their minds, which is probably how we ought to operate if we were more reasonable.
Spencer Greenberg: Yeah, absolutely. If you don’t do those little adjustments, there will be certain things you’ll never change your mind about, the sorts of things where you never get overwhelming evidence all at once. A perfectly rational agent doesn’t care whether it gets three packets of evidence altogether or them spread across time, right?
Robert Wiblin: Yeah, okay. Here’s another case. I’ve been reading this book, Bad Blood, about Theranos, this company that was going to revolutionize blood tests.
Spencer Greenberg: Interestingly enough, I knew people that thought it was a fraud years ago.
Robert Wiblin: Amazing, okay. Yeah. I was thinking, I suppose the question is, yeah, someone comes to you in the Bay Area and they say, “I’ve got this new amazing technology that’s going to, yeah, change medicine. I can, yeah, do this incredible thing.” How should you evaluate how likely it is to be a fraud or not?
Spencer Greenberg: I think usually, the vast majority of cases are not frauds. They’re people who are overly optimistic or over confident.
Robert Wiblin: That seems to be actually how this one started; it was not an outright fraud from the beginning.
Spencer Greenberg: It wouldn’t really make sense to start out on a plan of, “I’m going to go convince people that I’m going to create this technology. I’m not actually going to try to build it. I’m just going to get a bunch of money and squander it.” It doesn’t make that much sense. It seems much more plausible that someone who thought maybe they could do the thing, but then as it becomes increasingly clear they can’t do their thing, there’s kind of such a momentum and inertia, and you’ve already told everyone you can do it. You’ve already got all their money.
Robert Wiblin: You keep stringing it along, yeah, to delay what’s, I guess, inevitable at that point. Okay. Again, let’s start with the prior, start with the reference class. I guess what fraction of cases are over optimistic to the point that it’s borderline fraudulent?
Spencer Greenberg: Yeah.
Robert Wiblin: Yeah.
Spencer Greenberg: If we’re really talking about frauds, I think there aren’t that many really at the end of the day.
Robert Wiblin: A few percent, yeah.
Spencer Greenberg: A few percent, yeah. I think it’s pretty small. Actually, I’ll give you a piece of evidence. Years ago, when someone I know claimed that Theranos was a fraud, I was really interested in this. I went and looked at one of the papers that they published. The paper was really bad. You’re reading this paper and you’re like, “What? This is really not a convincing paper about their technology.” That was a very confusing piece of evidence to me. If you were doing a fraud, you might think, well, you’re just going to make up all the data. You’ll have this beautiful result. It was actually …
Robert Wiblin: The paper was so bad, it almost had to be a sincere bad attempt.
Spencer Greenberg: You kind of thought, yeah. It made me think that their technology is less likely to be good, but actually I’m not sure if it actually would make me think they were a fraud. It might make me think they were honest, but not very competent. Yeah.
Robert Wiblin: Interesting, okay. I think another phenomenon that I have here is whenever people … They’ll often say, “I’ve got this new and somewhat surprising method that people report makes them feel a lot better.”
Spencer Greenberg: Yeah.
Robert Wiblin: I guess Theranos isn’t a great example of this. I am extremely skeptical of these cases, because we just have so many instances through history of people feeling like something has helped them. And in a sense, it has; all it’s doing is giving them a placebo effect. You’ve got faith healers or religious stuff where I don’t believe that there’s … Well, there is a biological effect, but it’s only psychosomatic. Yeah.
Spencer Greenberg: This goes back to the idea we were talking about before, about when can your intuition learn something? We have a very tight data feedback loop on how we feel moment to moment and whether it’s changed. If a faith healer touches you, you can tell that suddenly you feel something different. Now, maybe that’s psychosomatic, but you know that you feel that. What’s much harder to tell is is your arthritis slightly better now over the next few months? Or is that just random variation, right?
Robert Wiblin: Yeah.
Spencer Greenberg: I think what happens is some of these things don’t work, but they can make you feel better momentarily. Then, after that, maybe you happen to get better by chance or you’re not really sure, and you might read into it positively.
Robert Wiblin: Yeah. It’s interesting. It seems like the intuitive plausibility of the mechanism matters so much here. If someone says, “Oh I took this drug and it made me feel better,” my prior on that being true is low for a specific chemical, but not that low. I know that drugs can change people’s brains.
Spencer Greenberg: It’s probably low if they picked the chemical completely randomly.
Robert Wiblin: That’s true, yeah.
Spencer Greenberg: If there was, let’s say, you knew that their doctor prescribed it, then maybe it would be a lot higher than if they just picked it out of a bowlful of random drugs, right?
Robert Wiblin: Yeah. An interesting example that came up recently was someone talking about transcranial magnetic stimulation. They put these magnets, very strong magnets, next to your skull.
Spencer Greenberg: Yeah.
Robert Wiblin: Some people report that this helps with depression or various other mental health issues.
Spencer Greenberg: Yeah.
Robert Wiblin: I think many people look at this and think that it’s kind of ridiculous. They have a very low prior that it’s true. I’m just saying, “Well I mean we know that it affects the brain. We can see that. We can see that it’s causing particular neurons to fire. Is it really that unlikely that this could have a positive effect somehow on people’s brains?” I guess it’s surprising, but it’s not crazy surprising.
Spencer Greenberg: Right. I wouldn’t put it at one in 10,000.
Robert Wiblin: Exactly.
Spencer Greenberg: I wouldn’t put it at 50%.
Robert Wiblin: Right. It’s maybe more like one in 100, one in 1,000.
Spencer Greenberg: Yeah, it’s something like that, I would imagine. Then there are studies that come out with n=20, and they report a 30% improvement in depression. It’s like, “Well, what do you make of that?” Basically, it comes down to how much you trust the scientific process they’re carrying out. Yeah.
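To make the arithmetic concrete, here is a minimal sketch of that kind of update in Python, using the odds form of Bayes’ Rule discussed in the episode. The Bayes factor assigned to a small positive study is purely illustrative, not a figure from the conversation; the point is just that a very low prior stays low unless the evidence is strong.

```python
# Minimal sketch (illustrative numbers only): updating a low prior that a
# treatment genuinely works, after a small positive study, using the odds
# form of Bayes' Rule.

def update(prior_prob: float, bayes_factor: float) -> float:
    """Convert a probability to odds, multiply by the Bayes factor, convert back."""
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1 + posterior_odds)

prior = 1 / 100       # roughly the "one in 100" prior mentioned above
bayes_factor = 3      # assumption: a small positive study is ~3x more likely
                      # if the treatment works than if it doesn't

print(round(update(prior, bayes_factor), 3))  # ~0.029: still under 3%
```

Even evidence you weight fairly heavily only moves a one-in-100 prior to around 3%, which is the sense in which the prior dominates a single small study.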
Robert Wiblin: A case where I think it has to be fraudulent, or people deceiving themselves, is the faith healer, someone laying hands on people. One thing is that the reference class this falls into is very bad: historically, this has so often been gurus or people doing dodgy stuff.
Spencer Greenberg: So you might start with a low prior in that case.
Robert Wiblin: The prior is low because I don’t think touching people is going to cure many of the conditions people might claim. Then it looks even worse, because I look around and see a long history of people deceiving themselves about this kind of thing, about excessive medical claims that are implausible on their face.
Then, typically, the evidence in its favor also seems really weak, because there are no control groups and it’s not a properly designed study. The prior was low, it gets even lower, and then I don’t really update on people saying it works for them. Or, well, I believe that it kind of does work psychologically. I don’t necessarily believe it’s going to cure cancer or … Yeah.
Spencer Greenberg: There was a meta-analysis that was done looking at the power of placebos. A lot of people don’t realize this, but when people talk about the effects of placebos, they actually often aren’t really talking about just placebos.
Robert Wiblin: Yeah.
Spencer Greenberg: Let’s say they do an intervention, give someone a drug, and they have a placebo group. If the placebo group gets quite a bit better, people will say, “Oh, the placebo effect.” But a lot of that could be things that are not the placebo effect. For example, it might be that the disease tends to improve on its own, on average. Or it might be regression to the mean: you go to the doctor and enroll in the study when you’re particularly sick, which is worse than your average, so on average you’re going to revert towards the mean.
Or there could be a reporting bias effect: you feel like you’re supposed to tell the doctor you’re feeling better, or you notice better feelings once you start becoming aware of them, but you’re not actually doing better. We very often lump all of these things together and call them a placebo effect. No, that’s not the placebo effect. The placebo effect is a very specific effect, where believing in the thing actually makes you feel better.
Robert Wiblin: Yeah. I’d like to talk about a blog post that outlines the evidence that a lot of what we call the placebo effect is just regression to the mean: people come in for treatment when they’re doing particularly badly, and then, on average, they tend to improve anyway.
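A tiny simulation, with made-up numbers, shows how regression to the mean alone can look like improvement: if people only “enroll” when they happen to score worse than their own baseline, their follow-up scores tend to be better even with no treatment at all.

```python
import random

# Minimal simulation (made-up numbers) of regression to the mean:
# symptom scores fluctuate around a personal baseline; people "enroll"
# only on a bad day, then are re-measured later with no treatment.

random.seed(0)

def symptom_score(baseline: float) -> float:
    return baseline + random.gauss(0, 10)   # day-to-day noise

enrolled_at, follow_up = [], []
for _ in range(10_000):
    baseline = random.gauss(50, 5)          # each person's typical severity
    today = symptom_score(baseline)
    if today > 60:                          # only seek help on a bad day
        enrolled_at.append(today)
        follow_up.append(symptom_score(baseline))  # untreated re-measurement

print(sum(enrolled_at) / len(enrolled_at))  # ~66: worse than usual at enrolment
print(sum(follow_up) / len(follow_up))      # ~53: looks like "improvement"
```

The apparent drop from roughly 66 to roughly 53 is produced entirely by selecting people on a bad day; nothing in the simulation treats anyone.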
Spencer Greenberg: Yeah. There was this great study that analyzed the placebo effect versus wait-list controls, looking at studies that happened to have three arms: intervention, placebo and wait-list control. That lets you start to ask, “Well, how big is the actual placebo effect without these other factors?” Yeah.
Robert Wiblin: Yeah. I think you get the biggest actual placebo effect when you just ask people after the thing, “Do you feel better now?” Or, yeah, use subjective judgment.
Spencer Greenberg: I think they found it on subjective judgements that are continuous: on a 1-to-10 scale, how bad do you feel, or how much pain do you feel? That was where they found the strongest placebo effect. It’s still weaker in the study, I think, than a lot of people believe, but they did find it. They didn’t tend to find it on objective measures, like heart rate or that kind of thing. There, they couldn’t really even detect it.
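As a rough illustration of why the third arm matters, here are invented numbers for a hypothetical three-arm trial: the wait-list arm absorbs natural improvement, regression to the mean and reporting effects, so subtracting it isolates the placebo-specific part.

```python
# Invented numbers for a hypothetical three-arm trial, showing how the
# wait-list arm separates the placebo-specific effect from improvement
# that would have happened anyway.

improvement = {           # average drop in symptom score over the study
    "drug":      12.0,
    "placebo":    6.0,
    "wait_list":  4.0,    # natural course + regression to the mean + reporting
}

placebo_effect = improvement["placebo"] - improvement["wait_list"]   # 2.0
drug_specific  = improvement["drug"]    - improvement["placebo"]     # 6.0

print(placebo_effect, drug_specific)
```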
Robert Wiblin: Interesting. Okay, so this was a big diversion from Theranos. Some of your friends thought that it was not a legitimate technology. What updating did they do there, do you know, that gave it away to them?
Spencer Greenberg: One thing is that Theranos kept saying they were going to release their product, and kept not releasing it. This went years and years. Have you heard the phrase vaporware?
Robert Wiblin: Yeah.
Spencer Greenberg: This was one of the biggest vaporware cases for, I think was it 10 years until they got their product to market? Something like that.
Robert Wiblin: Yeah, yeah. Something like that. I mean, I don’t know that a proper product ever really came to market; it was always pretty faulty. Though I’m only partway through the book, so maybe I’ll find out. Okay, so the thing there is that it’s a big claim: they’re claiming to be able to do something that many people think is very hard, and that other companies are not close to being able to do. The chance of them succeeding at it is perhaps the better way to put it. Maybe you already think it’s 10%, 20%.
Then, at the beginning, they claim that they can do it already, or they’re extremely optimistic about it. Maybe you then update in favor; you’re at like a 40%, 50% likelihood that they’ll do it. Then years go by, they don’t really publish papers, and the product is never actually released. You’re like, “Well, now this is falling into a pretty bad reference class of companies that claim to be able to do something, and then just go years without ever releasing it.” Eight out of 10 times, it’s actually because it’s bullshit.
Spencer Greenberg: Right. Actually, this leads to another, somewhat more advanced way of using the Bayes factor. So far we’ve been talking about using it to compare a hypothesis to its negation: the Question of Evidence being, how likely am I to see this evidence if the hypothesis is true relative to if it’s false? Another way to use it is to compare any two hypotheses. You could say, “How likely am I to see this evidence if hypothesis one is true compared to if hypothesis two is true?”
Then the prior there would be the prior odds of hypothesis one versus hypothesis two. You multiply those prior odds by the Bayes factor to compare the two hypotheses. Here, one hypothesis could be that maybe they’re just not very competent; another hypothesis is that they’re fraudulent. You could actually try to compute a Bayes factor for the relative strength of the evidence between those two hypotheses.
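Here is a sketch of that two-hypothesis version with entirely hypothetical numbers: prior odds of “struggling but honest” versus “fraudulent” get multiplied by a Bayes factor for each piece of evidence, where each factor is the probability of that evidence under the first hypothesis divided by its probability under the second.

```python
# Hypothetical numbers only: comparing two hypotheses about a company
# ("struggling but honest" vs "fraudulent") by multiplying prior odds
# by a Bayes factor for each new piece of evidence, as described above.

prior_odds = 5.0   # assume "honest but struggling" starts out 5x as likely as "fraud"

evidence_bayes_factors = {
    # each value is P(evidence | honest-but-struggling) / P(evidence | fraudulent)
    "keeps re-promising an imminent launch": 0.5,
    "published a paper that is weak but looks sincere": 2.0,
    "years pass with no product": 0.8,
}

posterior_odds = prior_odds
for bf in evidence_bayes_factors.values():
    posterior_odds *= bf

print(posterior_odds)                         # 4.0 in favour of "honest but struggling"
print(posterior_odds / (1 + posterior_odds))  # ~0.8, treating these as the only two options
```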
Robert Wiblin: Yeah, okay. Can we just try to do that here? Let’s say that I originally thought there was a 40% chance that they would release a product that was able to do what they claimed within 10 years. Then, five years go by and they’ve released nothing. What would be the Bayes factor here?
Spencer Greenberg: I think if they kept claiming, “Oh the technology is ready. We’re about to release it,” and kept not releasing it, to me, that pushes more in the direction of fraud than in the direction of incompetence. Whereas, on the other hand, if maybe early on, they said, “We’re going to have this done in one year,” and then it didn’t come out in one year, it kept going by, but they didn’t keep reclaiming that it’s …
Robert Wiblin: Oh I see.
Spencer Greenberg: Then maybe that pushes more in … They thought they could do it, but it’s really… not necessarily incompetence, but just really hard. Maybe they’re not up to the challenge.
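Putting rough numbers on the exchange above (all of them hypothetical): start from a 40% prior that the technology is real and will ship as claimed, then apply an assumed Bayes factor for “five years pass with repeated launch promises and no product”.

```python
# Worked version of the example above, with hypothetical numbers.
# Prior: 40% that the technology is real and will ship as claimed.
# Evidence: five years pass with repeated "it's ready" claims and no product.

prior = 0.40
prior_odds = prior / (1 - prior)        # ~0.67 in favour of "it's real"

# Assumed Bayes factor: this evidence seems maybe 5x more likely if the
# technology isn't real than if it is, i.e. 1/5 in favour of "it's real".
bayes_factor = 1 / 5

posterior_odds = prior_odds * bayes_factor
posterior = posterior_odds / (1 + posterior_odds)

print(round(posterior, 2))              # ~0.12: down from 40% to roughly 12%
```

With a Bayes factor of one in five against, the 40% prior falls to roughly 12%; a harsher factor would pull it lower still.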
Robert Wiblin: How confident were your friends that something dodgy was going on?
Spencer Greenberg: Pretty confident some of them, yeah.
Robert Wiblin: Interesting.
Spencer Greenberg: There were interesting message board discussions going on for years where people would say, “I am a person who works in biology. I think this is completely implausible the way they’re claiming they’re doing this.” I’m not an expert in that topic, so I couldn’t really evaluate those claims. There were certainly people making those claims for a long time.
Robert Wiblin: Are there any other interesting updates that we could try to work through for this case?
Spencer Greenberg: One thing that’s kind of interesting is that my understanding is that they didn’t really get Silicon Valley investors in it. I don’t know why that is. I don’t know whether that’s because Silicon Valley investors didn’t value it. They’re like, “I don’t want to take part in this.” Or whether just the network that the CEO had was a different network.
Robert Wiblin: Maybe they just thought it was too far outside their area, the Silicon Valley people: this is too medical.
Spencer Greenberg: They invest in biotech a lot.
Robert Wiblin: Okay.
Spencer Greenberg: It does raise an interesting question, I think.
Robert Wiblin: Yeah, okay. The fact that they were talking to lots of investors, and some of the ones you might have expected to invest didn’t, suggests there were some red flags here, potentially.
Spencer Greenberg: Potentially.
Robert Wiblin: I’m going to let you go. But a listener had a final question for you. Hopefully, by listening to all of this, we’ve made people a little bit more rational or, at least, given them the illusion of being a bit more rational. This listener was wondering about the fact that the more they’ve tried to be rational about the world, and about issues that people have strong feelings about, especially in politics, the more alienated they feel from other people, even though they’re just trying to form accurate beliefs.
I guess their views have become idiosyncratic and unusual. They just don’t feel that much affinity with their family and friends who perhaps still have more mainstream beliefs that they regard as irrational. Yeah. How do you cope with that alienation? Do you experience it yourself?
Spencer Greenberg: One thing that I’ve tried to do is really train myself to have a smoke detector alarm that goes off when certain kinds of bad arguments are made: certain types of rhetorical fallacies, or things that might indicate cognitive biases, or just bad argumentation or statistical fallacies. I think that’s super useful in trying to figure out the way the world is. You have these alarm bells that are like, “That’s not a valid argument. That’s not a valid argument.” But once you’ve trained yourself to do this, it can be frustrating to talk to people who are not using good argumentation, because those bells keep going off.
Robert Wiblin: You can’t interrupt them just constantly. It would be so rude and you would never get anywhere. Yeah.
Spencer Greenberg: Yeah. It wouldn’t be productive. I think a mindset switch, that can be really useful in this case, is just because someone is making not a great argument for a thing doesn’t mean the thing that they’re saying is not true. A bad argument doesn’t make something false.
Robert Wiblin: The fallacy fallacy.
Spencer Greenberg: Yeah. They might still have valuable things to say. When someone’s making an argument to me, I like to try to give them the benefit of the doubt. Even if they have some missteps in the logic, there might still be something there. There’s this phrase “steelmanning”: trying to think, okay, if I really try to strengthen their argument, what does it say? What can I learn from that?
An example of this: most startups fail. If someone comes to you with their startup idea, you could just say, “You’re probably going to fail,” and dismiss it. But then you might miss out on some really great ideas, the ones that are actually really good. I think it can be more helpful to ask, “What’s the best thing about this idea?” rather than, “Are there flaws in the way it’s being presented to me?”
Robert Wiblin: Yeah, yeah. I think that’s extremely helpful. I suppose that’s good if you’re talking with someone who is making some mistakes, but also has a lot of wisdom to share. What about when something is just an absolute epistemic disaster? I guess there, I mean, it’s a mad world out there, but I do just try to laugh at it. You can reframe things if you lower your expectations, because actually trying to be rational, as we’ve described, is so difficult.
There are so many steps you have to go through. I mean, who really has the time to be estimating these Bayes factors for everything? It’s not surprising that we get things wrong. If your expectations are lower, then you can find mistakes potentially quite hilarious and absurd, rather than aggravating.
Spencer Greenberg: I think it’s useful to remember that our brains are not rational machines. We’re never going to be fully rational. But there can be a lot of value in incrementally pushing forward, clearing up our thinking as much as possible, and trying to find our biases. That can help us live better lives. It can help us make better predictions. It can help us help the future, because we can actually think carefully about how we can do more good.
I think it’s worth it. If we remind ourselves that we are all irrational, that can help us have more empathy for others and say, “You know what? Okay, that person is saying silly things. I’ve been in that place before. I’ve said silly things before. Maybe one day they’ll have better arguments for the things they’re saying.”
Robert Wiblin: My guest today has been Spencer Greenberg. Thanks for coming on the podcast, Spencer.
Spencer Greenberg: Thanks so much for having me.
Robert Wiblin: There’s a bunch of interesting links in the blog post attached to this episode – is the placebo effect real? Is power posing legit? Is it a myth that spinach has lots of iron? Or is it a myth that it’s a myth?
You can find out by clicking through to the post about this episode and scrolling down to the section about articles discussed in the episode.
Oh and you should definitely give Spencer’s common misconceptions quiz a go.
The 80,000 Hours Podcast is produced by Keiran Harris.
Thanks for joining – talk to you next week.