Estimation – Part I: How to do it?
Why you need to make estimates
Trying to answer questions about the impact of a career is difficult, and trying to decide between different career options is even harder. If I asked you ‘How many people will benefit from research into anti-malarial vaccination?’ or ‘How many malaria nets would a £1000 donation to the Against Malaria Foundation buy?’, your first answer would probably be that you don’t know. You would then probably try to Google the answer, but in most cases the information that you need is either not easily accessible or would cost you a lot of time and money to find.
Once you realise this, you need to make a choice: do you give up or do you estimate the answer? If you consider a question or a decision important enough to search for data, chances are that giving up is not an option.
In this post we are going to look at how to make an estimate of a quantity you’re uncertain about. In order to do this we will cover the following things:
* Making your estimate
* Is your estimate a good one?
* Are you overconfident?
* Calibration: overcoming your overconfidence
* Case Study: When did Einstein win his Nobel Prize?
* How to use estimates
After going over the above, you will be able to make useful estimates for quantities that relate to your career. A second post will then examine the more complex task of how to compare and combine different estimates.
Making your estimate
Let’s consider a simple question for which the answer is known: in which year did Einstein win the Nobel Prize for Physics?
This question might seem impossible to answer without looking online. We could pick a single year, say 1935, but picking a random year probably won’t give us the correct answer and isn’t very useful if it’s wrong.
When making an estimate it is much better to give a range of values that the answer probably lies within. So we could say that we think Einstein won the Nobel Prize between 1925 and 1945.
But what exactly do we mean by this range? Are we absolutely certain the date lies within it? Better than just citing a range is to cite a range and say how likely you think it is that the answer lies within it. To do this we assign a probability to the range that we have suggested. People often aim for an estimate which has a 90% probability of containing the correct answer. A range of values with a given probability of containing the answer is called a Confidence Interval (CI). We will assume that our range of 1925-1945 is our 90% CI, which means we think there’s a 90% chance that Einstein won the Nobel Prize between these dates.
Is your estimate a good one?
If we look at our 90% CI, we picked it fairly arbitrarily. Do you really think that there is a 90% chance that the answer lies within this range? It turns out that most people are bad at making estimates because they are overconfident, so most of the ranges they choose are too narrow. Fortunately there are some simple tips for improving your ability to pick accurate confidence intervals, and we will now cover these.
Are you overconfident?
Before we look at how to improve our estimates, we need to check what our starting point is. To find out whether or not you are overconfident, write down the lower and upper bounds of your 90% CI for each of the following 10 questions:
- In 1938 a British steam locomotive set a new speed record by going how fast (mph)?
- In what year did Sir Isaac Newton publish the Universal Laws of Gravitation?
- How many inches long is a typical business card?
- The Internet (then called “Arpanet”) was established as a military communications system in what year?
- In what year was William Shakespeare born?
- What is the air distance between New York and Los Angeles (miles)?
- What percentage of a square could be covered by a circle of the same width?
- How old was Charlie Chaplin when he died?
- How many pounds did the first edition of the book ‘How to Measure Anything’ weigh?
- The TV show Gilligan’s Island first aired on what date?
You can find the answers [1] to these questions in the notes at the bottom of this post. If the ranges you gave really are your 90% CIs, around 9 of them should contain the correct answer. If 7 or fewer of them do, as is the case for most people, then you were overconfident. In reality the ranges you gave were too narrow, probably representing 60% or 70% CIs instead.
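To see why a score of 7 or fewer points to overconfidence rather than bad luck, here is a minimal Python sketch. It assumes the ten questions are independent and that each true 90% CI contains the answer with probability 0.9, so the number of hits follows a binomial distribution. The function name `prob_at_most` is our own and is only illustrative.

```python
from math import comb

def prob_at_most(hits, n=10, p=0.9):
    """Probability of getting `hits` or fewer answers inside their ranges,
    assuming n independent questions that each succeed with probability p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(hits + 1))

# Chance that a perfectly calibrated person scores 7 or fewer out of 10.
print(f"{prob_at_most(7):.1%}")
```

Under these assumptions a perfectly calibrated person would score 7 or fewer only about 7% of the time, so a low score is more likely to reflect ranges that are too narrow.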
Calibration: overcoming your overconfidence
The good news is that you can train yourself to improve your estimation of possibilities and to reduce your overconfidence.
Douglas Hubbard [2] has carried out research into how well a number of techniques for improving estimation abilities actually work, and he found that 90% of people can improve their accuracy in just half a day. The four techniques that are most effective when combined are described below, and will then be used in a case study to improve our estimate of when Einstein won his Nobel Prize.
Equivalent bet:
For each estimate, imagine that you are betting $1000 on the answer being within your 90% CI. Now compare this to betting $1000 on a spinner where 90% of the time you win and 10% of the time you lose. Would you prefer to take a spin? If so, your range is too narrow and you need to widen it. If you would prefer to bet on your answer, your range is too wide and you need to narrow it. If you don’t mind whether you bet on your answer or take a spin, then it really is your 90% CI.
Absurdity test:
Start with an absurdly large range, maybe from minus infinity to plus infinity, and then narrow it by ruling out values you know to be highly unlikely or even impossible.
Avoid anchoring:
Anchoring [3] occurs when you think of a single answer to the question and then add an error margin around it; this often leads to ranges which are too narrow. The absurdity test is a good way to counter anchoring; another is to change how you look at your 90% CI. For a 90% CI there is a 10% chance that the answer lies outside your range, and if you split this evenly there is a 5% chance that the answer is above your upper bound and a 5% chance that it is below your lower bound. Treat each bound separately and rephrase the question as ‘Is there a 95% chance that the answer is above my lower bound?’ If you think the chance is lower than 95%, move the lower bound down; if you think it is higher, move it up. You can then repeat this process for the upper bound.
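The split of the 90% CI into two separate bounds can be written out as follows. The symbols are our own notation, not from Hubbard: $x$ is the unknown quantity, $L$ the lower bound and $U$ the upper bound, with the 10% left over assumed to be divided evenly between the two tails.

```latex
\begin{align*}
P(L \le x \le U) &= 0.90 \\
P(x < L) = P(x > U) &= \frac{1 - 0.90}{2} = 0.05 \\
P(x \ge L) = 1 - P(x < L) &= 0.95 \\
P(x \le U) = 1 - P(x > U) &= 0.95
\end{align*}
```

Each bound can therefore be checked on its own with a single 95% question.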
Pros and cons:
Identify two pros and two cons for the range that you have given to help clarify your reasons for making this estimate.
Once you have used these techniques you can make another equivalent bet to check whether your new estimate is your 90% CI.
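The equivalent bet can be sketched in a few lines of Python. This is only an illustration of the reasoning described above: the names `expected_payoff` and `equivalent_bet_advice` are hypothetical, and the $1000 stake and 90% spinner are taken from the text.

```python
STAKE = 1000          # dollars staked on either bet, as in the text
SPINNER_WIN = 0.90    # the spinner pays out 90% of the time

def expected_payoff(win_probability, stake=STAKE):
    """Average winnings from a bet that pays `stake` with the given probability."""
    return win_probability * stake

def equivalent_bet_advice(preference):
    """Translate a stated preference into an adjustment of the 90% CI.

    preference is one of "spinner", "range" or "indifferent".
    """
    advice = {
        "spinner": "widen the range",       # you trust the spinner more than your range
        "range": "narrow the range",        # you trust your range more than 90%
        "indifferent": "keep the range",    # your range really is your 90% CI
    }
    if preference not in advice:
        raise ValueError(f"unknown preference: {preference!r}")
    return advice[preference]

print(expected_payoff(SPINNER_WIN))        # average payoff of the spinner
print(equivalent_bet_advice("spinner"))
```

Preferring the spinner means you believe your range pays out less than $900 on average, which is the same as saying you are less than 90% confident in it.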
The important thing with these techniques is to practise them, testing yourself on further questions [4], assessing yourself as you go and attempting to improve between tests. Over time you should see that your 90% CIs move closer to containing the correct answer 90% of the time.
Once you reach the point at which your 90% CIs are correct 90% of the time you can consider yourself calibrated.
For a more in-depth explanation of how to calibrate yourself, as well as even more questions with which to test yourself, we would recommend Hubbard’s book, How to Measure Anything.
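One way to track your progress is to log each range together with the true answer and compute how often the range contained it. The sketch below is a minimal example; the `hit_rate` helper and the log format are our own assumptions, and the two entries reuse the Einstein ranges from this post purely as sample data.

```python
def hit_rate(results):
    """Fraction of (low, high, truth) entries whose range contains the truth."""
    if not results:
        raise ValueError("need at least one result")
    hits = sum(1 for low, high, truth in results if low <= truth <= high)
    return hits / len(results)

# Sample log: (lower bound, upper bound, true answer)
practice_log = [
    (1925, 1945, 1921),  # first Einstein range: misses the true year
    (1890, 1960, 1921),  # revised Einstein range: contains the true year
]

print(hit_rate(practice_log))
```

With many more entries, a calibrated estimator's hit rate for 90% CIs should settle close to 0.9.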
Case Study: When did Einstein win his Nobel Prize?
At the beginning of this post we initially estimated that Einstein won his Nobel Prize between 1925 and 1945, but we will now apply the calibration techniques to this estimate to check it and if possible improve our 90% CI.
As well as following the flowchart above we also note two additional pros and cons:
* Pro: Einstein was influential in the interwar period, advising Roosevelt to set up the Manhattan Project.
* Pro: Period falls between two points at which we know he wasn’t alive.
* Con: Some of his theories have been tested in manned space flight, so Einstein may have won his Nobel after experimental results were obtained.
* Con: We aren’t sure that he wasn’t still alive and working in the period between 1960 and 1989.
After taking this into account we produce our final 90% CI of 1890-1960. In fact, Einstein won his Nobel Prize in Physics in 1921, very close to the centre of our estimated range.
How to use estimates
We have now covered how to make an estimate for simple problems, as well as four techniques which help to make that estimate well calibrated. In Part II we will focus on how to combine estimates to deal with more complex issues, as well as how to compare two different estimates when making a decision.
Notes and References:
- Answers to calibration questions:
10. Sep 26, 1964 ↩
- Hubbard, D. W. (2010) Calibrated Estimates: How much do you know now? In How to Measure Anything, 2nd Edition, Hoboken: John Wiley & Sons, Inc. ↩
- Fischhoff, B., Phillips, L. D. and Lichtenstein, S. (1982) “Calibration of Probabilities: The State of the Art to 1980,” in Judgment under Uncertainty: Heuristics and Biases, New York: Cambridge University Press ↩
- Further Questions:
1. How many feet tall is the Hoover Dam?
2. How many inches long is a 20-dollar bill?
3. What percentage of aluminium is recycled in the USA?
4. When was Elvis Presley born?
5. What percentage of the atmosphere is oxygen by weight?
6. What is the latitude of New Orleans?
7. In 1913, the U.S. military owned how many aeroplanes?
8. The first European printing press was invented in what year?
9. What percentage of all electricity consumed in U.S. households was used by kitchen appliances in 2001?
10. How many miles tall is Mt. Everest?
11. How long is Iraq’s border with Iran in km?
12. How many miles long is the Nile?
13. In what year was Harvard founded?
14. What is the wingspan (in feet) of a Boeing 747?
15. How many soldiers were in a Roman legion?
16. What is the average temperature of the abyssal zone (where the oceans are more than 6,500 feet deep) in degrees F?
17. How many feet long is the Space Shuttle Orbiter (excluding the external tank)?
18. In what year did Jules Verne publish 20,000 Leagues Under the Sea?
19. How wide is the goal in field hockey (feet)?
20. The Roman Coliseum held how many spectators?
2. 6 3/16ths
16. 39 °F
20. 50,000 ↩