Selection bias is an effect where choosing incorrect participants or data leads to inaccurate predictions or results.
The result of the 1936 US presidential election surprised a lot of people. That was the year Alfred Landon ran against Franklin D. Roosevelt.
Back then, conducting polls was not as easy as it is today. A well-known magazine called Literary Digest had a track record of correctly predicting earlier election results using its polls.
They wanted to get it right again, so they left no stone unturned. The magazine conducted one of the most expensive polls of its time, with over 2.4 million responses. Reaching that many people in the 1930s was a herculean task and cost a ton of money. Nobody had social media or email, remember?
After tabulating all the numbers they gathered, they estimated Landon would win 57% of the votes while Roosevelt would only garner 43%. The news of a resounding victory for Landon spread across the nation.
But guess what the actual results were? Roosevelt not only won, but triumphed with 62% of the votes against Landon’s 38%.
Are you thinking, “Literary Digest must have fudged the data or lied about 2.4 million responses”? No, they didn’t. No foul play was involved.
The problem was the selection process used for picking the participants of the poll. The 2.4 million who responded did not represent the real US population.
Literary Digest had reached out to 10 million citizens drawn from telephone directories, club memberships, and magazine subscriber lists. But in 1936, the country was still recovering from the Great Depression. Most people did not enjoy a comfortable living like today. Telephones, club memberships, and magazine subscriptions were luxuries, so the people the magazine reached were mostly well-to-do or rich.
But that was not the real story of the entire US population, which was around 128 million at that time. A sizable chunk consisted of the poor, the unemployed, and those struggling to make ends meet.
The rich, who were a minority, largely voted for Landon, and the poor, who were the majority, chose Roosevelt.
Such an error, where the wrong choice of participants leads to incorrect predictions, is called selection bias. The poll also suffered from nonresponse bias: only about a quarter of the people contacted responded, and those who respond can differ from those who don't. The higher the percentage of nonresponses, the greater the chance of an error.
In this article, I will cover:
- What is the selection bias?
- The different types of the bias
- Examples of how it can affect you in daily life
- How to avoid erroneous decisions due to the bias
What is the selection bias?
Selection bias is an effect where the wrong choice of data or participants can unintentionally cause incorrect predictions or poor decisions. Even if you have all the right intentions, you need the right information to make the right choice.
Though the bias primarily affects polls and research, from a psychological standpoint it can also influence the decisions you make in real life. It is also called bias by selection and is closely related to information bias.
So what constitutes the right information or participants? If you’re trying to predict a result/behavior or making a decision for 1000 people based on the input of 20, you must pick the 20 such that they represent the entire population as closely as possible.
If that’s confusing, the upcoming examples will explain how the bias works.
You can read about all the cognitive biases of the human mind.
Types of selection bias:
The 1936 US elections uncovered only one type of selection bias. Let us go through the other kinds with relatable examples.
1. Sampling bias
This is a type of selection bias where the data chosen does not truly reflect the entire population. The Literary Digest poll of 1936 falls under the category of sampling bias. Such errors occur when you fail to make a thoughtful choice when picking the data.
[Figure: how participants are chosen under sampling bias]
Let me give you an example. Assume you are a researcher who has to determine if most human beings prefer spicy food. If you choose 1000 people, does that guarantee accurate results? Not at all.
If you pick people from Indian cities, 8 out of 10 people will sing praises of spicy food and how much they relish eating it. If you ask the French, most people will turn red-faced and teary-eyed at the thought of hot flavors. The number of people you choose has little to do with the accuracy of the results.
To perform such a test that comes close to the real answer, you will have to consider different cultures, age brackets, upbringing, and several other factors I am not even aware of.
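To see why the size of a sample cannot make up for where it comes from, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the function names `build_population` and `spicy_rate`, the imaginary "region A", and all the percentages are invented, not real survey data.

```python
import random

def build_population(rng, size=100_000, region_a_share=0.2,
                     likes_spicy_a=0.8, likes_spicy_elsewhere=0.4):
    """Return a list of (lives_in_region_a, likes_spicy) pairs.

    Every share here is a made-up illustration value, not survey data.
    """
    population = []
    for _ in range(size):
        in_region_a = rng.random() < region_a_share
        p = likes_spicy_a if in_region_a else likes_spicy_elsewhere
        population.append((in_region_a, rng.random() < p))
    return population

def spicy_rate(people):
    """Fraction of the given people who like spicy food."""
    return sum(likes for _, likes in people) / len(people)

rng = random.Random(42)  # fixed seed so the run is repeatable
population = build_population(rng)

# Large sample, but drawn only from region A.
region_a = [person for person in population if person[0]]
biased_sample = rng.sample(region_a, 10_000)

# Small sample, drawn at random from everyone.
random_sample = rng.sample(population, 500)

true_rate = spicy_rate(population)
biased_error = abs(spicy_rate(biased_sample) - true_rate)
random_error = abs(spicy_rate(random_sample) - true_rate)

print(f"true rate:             {true_rate:.1%}")
print(f"10,000 from region A:  off by {biased_error:.1%}")
print(f"500 chosen at random:  off by {random_error:.1%}")
```

Under these assumed numbers, the 10,000 people from a single region miss the true figure by a wide margin, while the 500 people picked at random land close to it. The exact values depend on the invented shares, but the pattern is the point: a bigger sample from the wrong pool stays wrong.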
Epidemiology (the branch of medicine that studies how diseases spread and how to control them) pays special attention to such sampling. Incorrect results in medicine can have devastating consequences.
2. Timing and duration
The timing of selection and the duration you perform the test for can also skew the results.
Try going out on the streets during a pandemic and asking people, “Do you fear shaking hands with strangers?” You already know what to expect. But what if you had asked the same question 2 years earlier? The responses would tell a different story altogether.
The timing of your test can lead to a different outcome.
You might feel tempted to stop the research early when you find most of the early information pointing towards one result.
[Figure: the process of coming to an early conclusion]
Let’s assume you are a scientific researcher. You have to determine if you can train monkeys by hurting them every time they disobey you. If 4 out of the first 5 monkeys you test respond to the training, you end the experiment with an affirmative result because you do not want to whip those little monkeys anymore.
Ethical reasons, cost factors, or unforeseen circumstances can force an experiment to stop early. But not running a test long enough can lead to inaccurate results.
Please note: I love animals and would be the first to call an early stop to the exercise of merciless beating.
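A quick calculation shows how little "4 out of the first 5" proves. The sketch below assumes, purely for illustration, that the training has no real effect and each subject responds with a 50% chance, like a coin flip. The helper name `prob_at_least` is made up for this example.

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Assumption: the method is no better than a coin flip (p = 0.5).
early = prob_at_least(4, 5, 0.5)      # 4 or more of the first 5 subjects
longer = prob_at_least(80, 100, 0.5)  # the same 80% rate over 100 subjects

print(f"4+ of 5 by luck alone:    {early:.4f}")
print(f"80+ of 100 by luck alone: {longer:.2e}")
```

With these assumptions, luck alone produces 4 or more responses out of 5 almost one time in five (0.1875), whereas the same 80% rate across 100 subjects is practically impossible by chance. A result that looks convincing after five trials can easily be noise.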
3. Indirect causes
Failing to consider the effect of one situation on another causes distorted results too.
When one disease leads to a subsequent ailment, tests can appear to show strong proof that the treatment of the first disease caused the second. But that isn’t necessarily true. To take a real example from medical science, postmenopausal syndrome increases the chances of endometrial cancer.
If you go by tests alone, you’ll blame the medicine prescribed for postmenopausal syndrome, and the doctor who prescribed it, for the cancer. That would be a failure to consider the influence of other causes behind the problem.
4. Cherry-picked data
On many occasions, you pick data because you want the results to show a desired outcome or to match your belief. Intentionally or unintentionally, you handpick the cases or participants that reinforce your original assumption.
[Figure: the process of cherry-picking data]
If you worry about dying in a flight crash, you hunt for all the airline disasters in the last few years. The stories send a shiver down your spine. Your fear influences you to look at tragic events alone, but you fail to consider a million other flights that landed safe and sound.
Though you did not pick the wrong data on purpose, you broke a sweat for no reason. According to statistics, the chances of dying on the way to the airport due to a road accident are much higher. If you can sit in a car without losing your mind, there is no reason why you should fear flying.
Let’s say you’re trying to find data to prove who is a better politician. When you have a favorite of your own, you tend to find participants who are more inclined towards your choice. For example, you contact certain groups or people from specific locations where the politician’s popularity is higher. You finally publish the results of your research as “77 out of 100 people believe John is the best leader.”
For a naive reader, the research appears real because you have not disclosed your sources. In reality, you have tilted the study behind the scenes such that you arrived at the result you desired.
Intentional selection bias occurs in the following cases:
- when you have to prove a point
- when you have a vested interest towards one outcome
- when you’re stubborn about your opinion
Real-life examples of selection bias:
The examples cited above might suggest that selection bias applies only to research, experiments, surveys, and polls. But you are vulnerable to the effect in your daily life too. Here are a few real-life examples:
1. Asking for feedback:
When you ask for feedback, different factors influence what people tell you.
Whom you ask:
The feedback you receive depends on whom you ask. If you ask your mother to rate your looks, she won’t tell you the truth even if you look like an ugly duckling. Similarly, if you ask your friends what they think about the first book you wrote, most of them will tell you, “It was good.”
How you ask:
The method you use to seek feedback can determine what people tell you. Assume you are the boss of a 50-member team. If you ask the employees to rate your leadership skills in a way that reveals who gave which rating, the results will never reflect reality. The team members will fear the backlash of giving you a poor rating.
What if you asked for feedback on an anonymous survey? Your people are more likely to provide genuine answers.
2. Validating a business idea:
Selection bias is common among entrepreneurs. When you believe in an idea, you assume your business will go on to become the next big thing, shoot you to fame, and add millions of dollars to your bank balance.
When you have to validate your idea in the early stages, you start looking at similar successful businesses. You believe that the data is suggesting you pursue the idea and go all in.
But you do not look at other comparable businesses that failed. You might even dismiss evidence against your idea as invalid or as an exception due to the confirmation bias.
Countless entrepreneurs have failed by cherry-picking research data to match their belief.
3. Taking decisions in groups:
When you make decisions based on what others are doing, you might end up with a false picture of reality.
If your friends have the habit of partying all the time and buying the most expensive brands, you start believing that savings are unnecessary. On the other side, if you live with penny pinchers, you try to save the last cent and lose out on enjoying life.
The behavior of those around you can influence many of your habits like working out, risk-taking, investment methodologies, and more. If you follow a trend based on your surroundings or close acquaintances, you have failed to look at real stories.
How to avoid selection bias
You can reduce the effects of selection bias by choosing diverse data. When you decide on the criteria, spend time evaluating whether you have considered unexpected factors that might influence the results. When the data or participants you pick are genuinely random, you are far more likely to arrive at the correct outcome.
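One common way to keep a sample diverse is stratified sampling: split the population into groups and pick randomly within each group in proportion to its size. The sketch below is only an illustration under assumed data. The function `stratified_sample`, the `income` field, and the 70/20/10 split are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, total, rng):
    """Draw about `total` records, keeping each group's share of the whole."""
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record)

    sample = []
    for members in groups.values():
        # Proportional allocation, with at least one record per group.
        quota = max(1, round(total * len(members) / len(records)))
        sample.extend(rng.sample(members, min(quota, len(members))))
    return sample

# Hypothetical respondents: 70% low income, 20% middle, 10% high.
rng = random.Random(7)
respondents = (
    [{"income": "low"}] * 700
    + [{"income": "middle"}] * 200
    + [{"income": "high"}] * 100
)
sample = stratified_sample(respondents, key=lambda r: r["income"], total=50, rng=rng)

counts = defaultdict(int)
for person in sample:
    counts[person["income"]] += 1
print(dict(counts))  # {'low': 35, 'middle': 10, 'high': 5}
```

The sample of 50 mirrors the 70/20/10 split of the full list, so no group is left out or overrepresented. This only guards against the factors you thought to group by. Anything you did not think of can still skew the result.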
But it is easier said than done. Even with good intentions and careful thought, you can still end up with the wrong result. A lack of awareness can lead to biased selections without your knowledge. In the example of endometrial cancer, someone without medical training would fail to consider the possibility of such indirect causes. You can do little to fix such gaps other than increasing your expertise. But again, that isn’t always feasible.
In some instances, you need to withhold information to ensure accurate results. If you are trying to determine which jam among 10 choices you like the most, a blind test uncovers your most sincere choice. When you know what you’re tasting, your favorite brand or flavor can sway you to pick the usual.
Selection bias influences us in big ways and small. Often, its consequences are insignificant, but at times, you might make a horrendous decision.
You do not have to consider the effect of selection bias for every little choice you make. When you are making a decision that can have tremendous consequences, spend time assessing if the data you’re using is reliable. A simple validation might just help you avoid an irreparable mistake.
Have you made a wrong decision due to the selection bias? Leave a comment.
Maxim Dsouza has spent over a decade experimenting and finding various time management techniques to improve his productivity. He strongly understands the fact that time is a limited commodity and tries to make every second count. He has extensive experience in leadership in startups, small businesses, and large corporations.
He has helped people of different professions and age groups gain clarity on their goals, improve focus, revise their time management skills and develop an awareness of their psychological cognitive biases.