What is bias?
Bias is a disproportionate weight in favour of or against an idea or a thing, usually in a way that is prejudicial, unfair or closed-minded. Bias can be innate or learned. People may develop biases for or against a person, a group, or a belief. In science, bias is a systematic error.
There are several ways in which bias can occur. Examples of bias include:
- Attribution bias
- Halo effect and horn effect
- Self-serving bias
- Status quo bias
- Statistical bias
The list is quite long, and this article does not attempt to cover every bias. We have selected a few that we feel deserve the attention of a data science enthusiast.
For this article, we will focus on statistical bias. Bias in statistics can be defined as the tendency of a statistic to overestimate or underestimate a parameter.
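To make this definition concrete, here is a minimal simulation sketch. It is our own illustration, not a standard recipe: the helper name `estimate_bias`, the sample size, and the number of trials are all assumptions. It compares two ways of estimating a variance from small samples drawn from a population whose true variance is 1. Dividing by n tends to underestimate the true variance, while dividing by n - 1 does not.

```python
import random
import statistics


def estimate_bias(estimator, true_value, sample_size=5, trials=20000, seed=0):
    """Average (estimate - true value) over many simulated samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Draw a small sample from a standard normal population.
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        total += estimator(sample) - true_value
    return total / trials


# The population is standard normal, so the true variance is 1.0.
bias_divide_by_n = estimate_bias(statistics.pvariance, 1.0)          # divides by n
bias_divide_by_n_minus_1 = estimate_bias(statistics.variance, 1.0)   # divides by n - 1

print(f"divide by n:     average error {bias_divide_by_n:+.3f}")
print(f"divide by n - 1: average error {bias_divide_by_n_minus_1:+.3f}")
```

With samples of size 5, theory says the divide-by-n estimator should fall short by about 0.2 on average, while the divide-by-(n - 1) estimator should average out close to zero. A systematic shortfall like this, as opposed to random scatter, is what statisticians mean by bias.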
TYPES OF STATISTICAL BIAS
The list of statistical bias types is too long to cover in a single post. We will delve into the following biases, which you will frequently encounter as a data analyst.
The observer-expectancy effect, also known as the experimenter-expectancy effect, observer effect, or experimenter effect, is a common type of bias. It is a form of reactivity in which a researcher subconsciously projects their expectations onto the research. If the researcher has a pre-determined outcome in mind, they are more likely to ignore data that opposes their hypothesis. For example, if a journalist believes that drug trafficking thrives in a certain area, they are more likely to ignore any data that counters that belief and to amplify any evidence that supports it.
SOCIAL DESIRABILITY BIAS
Social desirability bias is a type of bias where survey respondents give answers that they believe will be viewed favorably by society. It can take the form of over-reporting good behavior or under-reporting undesirable behavior. For example, if you gave questionnaires to people in the transport sector and asked about their drinking habits, drivers would most likely indicate that they do not drink alcohol, since this is what society expects of them.
Reporting bias means that only a selection of results is included in an analysis, so the analysis covers only a fraction of the relevant evidence. This arises when the researcher chooses to leave out results that they feel are not necessary to include in the final report. The report is thus biased since it does not cover all the observations.
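A small simulation can show how selective reporting distorts the picture. This is only a sketch with made-up numbers: the 200 studies, the true effect of zero, and the cut-off of 1.0 for a result being "worth reporting" are all assumptions chosen for illustration.

```python
import random
import statistics

rng = random.Random(3)

# Hypothetical setup: 200 small studies of an effect whose true size is 0.
# Each study's estimate is the true effect plus sampling noise.
true_effect = 0.0
study_estimates = [true_effect + rng.gauss(0, 1) for _ in range(200)]

# Selective report: keep only the studies that showed a clearly positive effect.
reported = [estimate for estimate in study_estimates if estimate > 1.0]

mean_all = statistics.mean(study_estimates)
mean_reported = statistics.mean(reported)

print(f"studies run: {len(study_estimates)}, studies reported: {len(reported)}")
print(f"mean effect across all studies:      {mean_all:+.2f}")
print(f"mean effect across reported studies: {mean_reported:+.2f}")
```

The average across all studies stays near the true effect of zero, while the average across the reported studies is above 1.0 by construction. A reader who sees only the final report would conclude that the effect is real.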
Selection bias is the bias introduced when the individuals or groups selected for a study differ systematically from the population of interest, leading to a systematic error in an association or outcome. It is sometimes referred to as the selection effect. A good example would arise if a researcher seeking to know the economic status of a country used a small urban town as the sample and then generalized the findings to the entire country.
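The urban-town example can be sketched in a few lines of Python. All figures here are hypothetical: we assume a population that is 30% urban and 70% rural, with invented income levels, purely to show the mechanism.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population: 30% urban, 70% rural, with made-up income levels.
urban = [rng.gauss(50_000, 8_000) for _ in range(3_000)]
rural = [rng.gauss(20_000, 5_000) for _ in range(7_000)]
population = urban + rural

true_mean = statistics.mean(population)

# Biased design: survey only urban residents.
urban_only_sample = rng.sample(urban, 500)
# Fairer design: a simple random sample from the whole population.
random_sample = rng.sample(population, 500)

biased_estimate = statistics.mean(urban_only_sample)
fair_estimate = statistics.mean(random_sample)

print(f"true mean income:        {true_mean:,.0f}")
print(f"urban-only estimate:     {biased_estimate:,.0f}")
print(f"random-sample estimate:  {fair_estimate:,.0f}")
```

Both samples have the same size, so the gap is not a matter of collecting more data. The urban-only estimate lands far above the true mean because of who was selected, and surveying more urban residents would not fix it.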
Survivorship bias is a common logical error that distorts our understanding when a visible, successful subgroup is mistaken for the entire group. It is especially common in this era of social media, where people hide their failures and publicize only their successes. For example, if 10 very successful businesspeople admitted to having dropped out of school, this would hide the fact that thousands of others who also dropped out did not become successful. Someone influenced by this bias would be led to conclude that dropping out of school makes people successful.
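Here is a rough sketch of the dropout example in Python. The numbers are invented for illustration: we assume 10,000 people who dropped out of school, each with a 1% chance of becoming very successful.

```python
import random

rng = random.Random(7)

# Hypothetical numbers: 10,000 dropouts, each with a 1% chance of great success.
# True means the person became very successful.
dropouts = [rng.random() < 0.01 for _ in range(10_000)]

# Only the success stories get publicized; the failures stay invisible.
visible = [outcome for outcome in dropouts if outcome]

rate_among_visible = sum(visible) / len(visible)
rate_among_all = sum(dropouts) / len(dropouts)

print(f"success rate among the dropouts we hear about: {rate_among_visible:.0%}")
print(f"success rate among all dropouts:               {rate_among_all:.1%}")
```

Every dropout we hear about is a success, simply because being successful is how they became visible. The rate among all dropouts, which is the number that matters, stays close to the assumed 1%.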
OMITTED VARIABLE BIAS
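Omitted variable bias occurs when a statistical model leaves out one or more relevant variables. The problem is most serious when the omitted variable is related both to the outcome and to a variable that is in the model, because the model then credits the included variable with part of the omitted variable's effect. If a researcher ignores such a critical variable, the resulting model will not reflect the true facts on the ground. In other words, the model will be misleading.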
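The following sketch illustrates the effect with simulated data. Everything in it is an assumption made for illustration: the variable names `x1` and `x2`, the true coefficients 2 and 3, and the strength of the link between the two variables. We fit one model that leaves `x2` out and one that includes it, using plain least squares written out by hand.

```python
import random

rng = random.Random(1)
n = 5_000

# Simulated data: x2 is correlated with x1, and both drive y.
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [0.5 * a + rng.gauss(0, 1) for a in x1]
y = [2.0 * a + 3.0 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]


def mean(values):
    return sum(values) / len(values)


def cov(u, v):
    """Covariance of two equal-length lists (dividing by n)."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)


# Short model: y ~ x1, with x2 omitted.
slope_short = cov(x1, y) / cov(x1, x1)

# Full model: y ~ x1 + x2, solved from the 2x2 normal equations.
s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
s1y, s2y = cov(x1, y), cov(x2, y)
det = s11 * s22 - s12 ** 2
slope_full_x1 = (s22 * s1y - s12 * s2y) / det
slope_full_x2 = (s11 * s2y - s12 * s1y) / det

print(f"x1 coefficient with x2 omitted:  {slope_short:.2f}")
print(f"x1 coefficient with x2 included: {slope_full_x1:.2f}")
print(f"x2 coefficient in full model:    {slope_full_x2:.2f}")
```

In this setup the short model should give `x1` a coefficient near 3.5 rather than the true 2, because it absorbs the part of `x2` that moves together with `x1` (3 times 0.5). Including `x2` brings the estimate back to roughly 2.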
Recall bias is another common type of bias. It is a systematic error caused by differences in how accurately people remember and relay information, and it arises whenever we rely on a person's memory as a source of data. For example, suppose you were invited to a meeting but could not make it. A week later you ask Johny (your best friend and coworker) how the meeting went. The information you receive will depend entirely on his memory, and hence a bias arises.
Funding bias is one of the common types of statistical bias. It refers to the tendency of a study's outcome to support the interests of the organization funding the study. When a person or organization funds another party's research, the funded party may feel that they owe the sponsor a debt, and the researcher may then skew the outcome in the sponsor's favor.
Is there another type of bias that you feel should be on the list? Feel free to let us know in the comments section. Let us have your thoughts on how these biases affect your day-to-day life.