EPID 600; Class 8: Bias
University of Michigan School of Public Health
Bias
Systematic error in the design, conduct or analysis of a study that results in a mistaken estimate of an exposure’s effect on disease
Design: Wrong study design! Wrong sampling strategy!
Conduct: Problems in enrollment of cases, of controls! Loss to follow-up! Poor collection of data!
Analysis: Wrong modeling assumptions! Miscategorization of variables!
Rothman KJ. Epidemiology: An Introduction. Oxford, 2002.
Evaluating bias
1. Why did it occur?
2. What effect does it have on the observed association?
3. What can be done to control for bias in this study and to prevent it in future studies?
Types of (important) bias
1. Selection bias: Error in selection of study participants
2. Information bias: Errors in procedures for gathering relevant information
1. Selection bias
Systematic error in selecting subjects into one or more of the study groups, such as cases and controls, or exposed and unexposed
Study question: Does coffee drinking cause pancreatic cancer?
Selection bias in a case-control study
Cases: patients hospitalized with a diagnosis of pancreatic cancer
Controls: patients hospitalized for other reasons by the same gastroenterologist who had hospitalized the case
Results: found a strong relationship between coffee drinking and pancreatic cancer
What happened?
Persons who do not drink coffee are more likely to be controls
[Figure: 2x2 tables of Coffee (Yes/No) by Cancer (Yes/No) for the POPULATION and the STUDY SAMPLE]
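The mechanism on this slide can be sketched numerically. The following is a minimal illustration in Python, not data from the study: every count and selection probability is hypothetical. It assumes coffee is unrelated to pancreatic cancer in the source population and that coffee-drinking controls are half as likely to be enrolled as non-drinking controls.

```python
from fractions import Fraction


def odds_ratio(a, b, c, d):
    """a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls."""
    return Fraction(a * d) / Fraction(b * c)


# Hypothetical source population: coffee and cancer are unrelated (OR = 1).
pop = {"case_coffee": 60, "case_none": 40, "ctrl_coffee": 600, "ctrl_none": 400}

# Hypothetical selection probabilities: all cases are enrolled, but
# coffee-drinking controls are half as likely to be enrolled as non-drinkers.
select = {
    "case_coffee": Fraction(1),
    "case_none": Fraction(1),
    "ctrl_coffee": Fraction(1, 10),
    "ctrl_none": Fraction(1, 5),
}

# Expected study-sample counts in each cell.
sample = {cell: pop[cell] * select[cell] for cell in pop}

cells = ("case_coffee", "case_none", "ctrl_coffee", "ctrl_none")
true_or = odds_ratio(*(pop[c] for c in cells))
observed_or = odds_ratio(*(sample[c] for c in cells))

print("true OR:", true_or)          # association in the population
print("observed OR:", observed_or)  # association in the selected sample
```

Under these assumed numbers the sample shows a doubling of the odds even though the population shows none, which is the pattern the slide describes.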
Study question: Is there a relation between occupational exposure to asbestos and lung cancer?
Selection bias in a cohort study
Exposed: workers who handle asbestos (100% participation)
Unexposed: workers in other areas of the factory who agree to participate (50% participation)
Results: found NO relationship between asbestos and lung cancer
What happened?
UNEXPOSED workers who participate are those at high risk for lung cancer, so unexposed with disease are overrepresented
[Figure: 2x2 tables of Asbestos (Yes/No) by Cancer (Yes/No) for the POPULATION and the STUDY SAMPLE]
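A small numeric sketch in Python may help here. The slides give only the participation rates (100% and 50%), so all counts below are hypothetical: the population is assumed to have a true risk ratio of 2, and the unexposed volunteers are assumed to be enriched for workers who go on to develop lung cancer.

```python
from fractions import Fraction


def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Risk in the exposed divided by risk in the unexposed."""
    return Fraction(cases_exp, n_exp) / Fraction(cases_unexp, n_unexp)


# Hypothetical population: asbestos doubles the risk of lung cancer.
exposed_n, exposed_cases = 1000, 100      # risk 0.10
unexposed_n, unexposed_cases = 1000, 50   # risk 0.05

true_rr = risk_ratio(exposed_cases, exposed_n, unexposed_cases, unexposed_n)

# Exposed workers: 100% participation, so their counts are unchanged.
# Unexposed workers: 50% participate overall, but (hypothetically) all 50 who
# will develop cancer volunteer and only 450 of the 950 who will not.
part_unexp_cases = 50
part_unexp_noncases = 450
part_unexp_n = part_unexp_cases + part_unexp_noncases  # 500, i.e. 50% of 1000

observed_rr = risk_ratio(exposed_cases, exposed_n, part_unexp_cases, part_unexp_n)

print("true RR:", true_rr)
print("observed RR:", observed_rr)
```

With these assumed numbers the overrepresentation of diseased unexposed workers raises the apparent unexposed risk to match the exposed risk, so a real association disappears.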
2. Information bias
Systematic error in obtaining information regarding subjects in the study
Examples: bias in recall, in collecting data, in interview, in reporting
Study question: Is perinatal infection associated with a risk of congenital malformation?
Information bias in a case-control study: Example 1
Cases: newborns with congenital malformations
Controls: healthy newborns
Results: found a strong relationship between mother’s recall of infection during pregnancy and malformation
What happened?
Recall bias: Parents of children with congenital malformations were more likely to report infection during pregnancy than parents of children without congenital malformations
What happened?
Misclassification of unexposed as exposed is more common in cases than in controls: DIFFERENTIAL MISCLASSIFICATION
[Figure: 2x2 tables of Infection during pregnancy (Yes/No) by Congenital Malformation (Yes/No) for the POPULATION and the STUDY SAMPLE]
What if there is misclassification and it is similar in both cases and controls?
[Figure: 2x2 table of Infection (Yes/No) by Case / Non-Case]
Nondifferential misclassification: Usually biases the estimate of association towards 1 (the null)
“Toward the null”
[Figure: scale of the measure of association marked 3, 2, 1 (“the null”), 0.5, 0]
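The pull toward the null can be checked with expected counts. This is a minimal Python sketch under stated assumptions: the true 2x2 counts (50/50 among cases, 25/100 among controls, true OR = 4) are the illustrative counts these slides use in their numeric examples, while the sensitivity (0.8) and specificity (0.9) of exposure measurement are hypothetical and are applied identically to cases and controls.

```python
from fractions import Fraction


def odds_ratio(a, b, c, d):
    """a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls."""
    return Fraction(a * d) / Fraction(b * c)


def misclassify_exposure(exposed, unexposed, sensitivity, specificity):
    """Expected observed counts when exposure is measured imperfectly."""
    obs_exposed = sensitivity * exposed + (1 - specificity) * unexposed
    obs_unexposed = (1 - sensitivity) * exposed + specificity * unexposed
    return obs_exposed, obs_unexposed


# Hypothetical measurement quality, identical in cases and controls
# (this equality is what makes the misclassification nondifferential).
sens, spec = Fraction(4, 5), Fraction(9, 10)

case_exp, case_unexp = misclassify_exposure(50, 50, sens, spec)
ctrl_exp, ctrl_unexp = misclassify_exposure(25, 100, sens, spec)

true_or = odds_ratio(50, 50, 25, 100)
observed_or = odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp)

print("true OR:", true_or)
print("observed OR:", float(observed_or))
```

The observed odds ratio lands between 1 and the true value. The slide's "usually" matters: this sketch covers only a binary exposure with the same error rates in both groups.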
Study question: Is smoking associated with an increased risk of myocardial infarction (MI)?
Information bias in a case-control study: Example 2
Cases: hospitalized cases of MI in elderly adults
Controls: elderly adults, randomly selected from the community, who have never been hospitalized for MI
Results: found a weak relationship between smoking and MI
What happened?
Many true cases of MI are misclassified as non-cases, and are included in the controls (they were not hospitalized and had no symptoms)
What happened?
Misclassification of cases as controls is similar in smokers and non-smokers: NONDIFFERENTIAL MISCLASSIFICATION
[Figure: 2x2 tables of Smoke (Yes/No) by Myocardial Infarction (Yes/No) for the POPULATION and the STUDY SAMPLE]
Study question: Is use of oral contraceptives (OC) associated with an increased risk of venous thrombophlebitis (blood clots)?
Information bias in a cohort study
Exposed: women who use OC
Unexposed: women who do not use OC
Results: found a strong relationship between OC use and thrombophlebitis
What happened?
Detection bias (also called surveillance bias): Women who are on oral contraceptives are more likely to receive a diagnosis of thrombophlebitis
What happened?
Misclassification of non-disease as disease is different in exposed and unexposed persons: DIFFERENTIAL MISCLASSIFICATION
[Figure: 2x2 tables of OC Use (Yes/No) by Thrombophlebitis (Yes/No) for the POPULATION and the STUDY SAMPLE]
Putting numbers to the differential vs. nondifferential examples, 1
Misclassification of non-disease as disease is different in exposed and unexposed persons: DIFFERENTIAL MISCLASSIFICATION RESULTING IN BIAS AWAY FROM THE NULL

POPULATION
                        OC Use
                        Yes    No
Thrombophlebitis  Yes    50    50
                  No     25   100
REAL OR = (100*50)/(50*25) = 4

STUDY SAMPLE
                        OC Use
                        Yes    No
Thrombophlebitis  Yes    70    50
                  No      5   100
BIASED OR = (100*70)/(50*5) = 28
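The two odds ratios on this slide are cross-product ratios and can be reproduced directly. A minimal Python sketch, assuming the tables are laid out with disease status in rows and OC use in columns (an assumption about the slide layout; it is the orientation under which 20 non-diseased OC users are recorded as diseased while non-users are unchanged):

```python
def odds_ratio(table):
    """Cross-product odds ratio for a 2x2 table laid out as
    [[diseased exposed,     diseased unexposed],
     [non-diseased exposed, non-diseased unexposed]]."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)


population = [[50, 50], [25, 100]]
# In the study sample, 20 non-diseased OC users are recorded as diseased;
# women who do not use OC are classified correctly.
study_sample = [[70, 50], [5, 100]]

real_or = odds_ratio(population)
biased_or = odds_ratio(study_sample)

print("REAL OR:", real_or)
print("BIASED OR:", biased_or)
```

Because the error affects only one exposure group, the estimate moves away from the null rather than toward it.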
Putting numbers to the differential vs. nondifferential examples, 2
Misclassification of unexposed as exposed is more common in cases than in controls: DIFFERENTIAL MISCLASSIFICATION RESULTING IN BIAS AWAY FROM THE NULL

POPULATION
                               Infection during pregnancy
                               Yes    No
Congenital Malformation  Yes    50    50
                         No     25   100
REAL OR = (100*50)/(50*25) = 4

STUDY SAMPLE
                               Infection during pregnancy
                               Yes    No
Congenital Malformation  Yes    75    25
                         No     25   100
BIASED OR = (100*75)/(25*25) = 12
Putting numbers to the differential vs. nondifferential examples, 3
Misclassification of exposed as unexposed is more common in cases than in controls: DIFFERENTIAL MISCLASSIFICATION RESULTING IN BIAS TOWARDS THE NULL

POPULATION
               Exposure
               Yes    No
Disease  Yes    50    50
         No     25   100
REAL OR = (100*50)/(50*25) = 4

STUDY SAMPLE
               Exposure
               Yes    No
Disease  Yes    25    75
         No     25   100
BIASED OR = (100*25)/(25*75) ≈ 1.3
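Examples 2 and 3 differ only in the direction in which cases are moved between exposure categories. The Python sketch below makes that explicit; the helper `shift_cases` is a hypothetical name introduced for illustration, and the counts are the ones shown on the slides.

```python
def odds_ratio(a, b, c, d):
    """a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls."""
    return (a * d) / (b * c)


def shift_cases(exposed_cases, unexposed_cases, moved_to_exposed):
    """Reclassify cases; a negative value moves exposed cases
    into the unexposed category instead."""
    return exposed_cases + moved_to_exposed, unexposed_cases - moved_to_exposed


controls = (25, 100)  # exposed, unexposed; classified correctly in both examples

away = shift_cases(50, 50, +25)    # example 2: 25 unexposed cases recorded as exposed
toward = shift_cases(50, 50, -25)  # example 3: 25 exposed cases recorded as unexposed

or_true = odds_ratio(50, 50, *controls)
or_away = odds_ratio(*away, *controls)
or_toward = odds_ratio(*toward, *controls)

print(or_true, or_away, round(or_toward, 2))
```

The point of the pair is that differential misclassification has no fixed direction: the same kind of error among cases can inflate or shrink the odds ratio depending on which way the cases are moved.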
Putting numbers to the differential vs. nondifferential examples, 4
Misclassification of cases as controls is similar in smokers and non-smokers: NONDIFFERENTIAL MISCLASSIFICATION RESULTING IN BIAS TOWARDS THE NULL

POPULATION
                             Smoke
                             Yes    No
Myocardial Infarction  Yes    50    50
                       No     25   100
REAL OR = (100*50)/(50*25) = 4

STUDY SAMPLE
                             Smoke
                             Yes    No
Myocardial Infarction  Yes    25    25
                       No     50   125
BIASED OR = (125*25)/(25*50) = 2.5
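The study-sample table above can be derived from the population table by one rule. In this Python sketch the detection probability of one half is inferred from the slide's counts (50 true cases become 25 in each smoking group) rather than stated in the source, and undetected cases are assumed to be counted among the controls.

```python
from fractions import Fraction


def odds_ratio(a, b, c, d):
    """a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls."""
    return Fraction(a * d) / Fraction(b * c)


def observe(true_cases, true_noncases, detection):
    """Only a fraction of true cases is detected; the rest are
    counted as non-cases (controls)."""
    found = true_cases * detection
    missed = true_cases - found
    return found, true_noncases + missed


# Same detection probability in smokers and non-smokers (nondifferential).
detect = Fraction(1, 2)

smoker_cases, smoker_controls = observe(50, 25, detect)
nonsmoker_cases, nonsmoker_controls = observe(50, 100, detect)

real_or = odds_ratio(50, 50, 25, 100)
biased_or = odds_ratio(smoker_cases, nonsmoker_cases, smoker_controls, nonsmoker_controls)

print("REAL OR:", real_or)
print("BIASED OR:", float(biased_or))
```

Changing `detect` to any other value below 1, while keeping it equal in both groups, still gives an odds ratio between 1 and 4 for these counts.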
Accuracy of weight/height reports
Obesity is acknowledged as a critical health problem internationally.
Studies often use reported (as opposed to measured) data to estimate the prevalence of overweight and obesity at the population level.
There have been investigations regarding the “truth” of these reported values in adults and adolescents; the validity of parent-reported weight and height was studied by a team in Canada.
Dubois and Girard. Accuracy of maternal reports of pre-schoolers’ weights and heights as estimates of BMI values. Int J Epid. 2007; 36: 132-138.
Height/weight reports
1) Mothers asked to report on height and weight of children aged 4
2) Within 3 months, children’s weight and height were directly measured
3) Investigators examined the prevalence of obesity based on reported values versus prevalence of obesity based on measured values
Height/weight reports
The cohort: 4-year-old children in 2002, who were part of a regional stratified sample of children born in Quebec in 1998
Height/weight report: One caregiver, usually the mother, reported height and weight to an interviewer; the caregiver was not told that subsequent measurement would be taken. Interviewers made sure that mothers recalled these values rather than measuring them on the spot
Height and weight measurement: Within three months of the interview, nutritionists followed a standardized protocol and measured height and weight of children
Height/weight report: is it the same for all?
Is any group of people consistently overreporting BMI of children?
Odds ratios among boys: BMI > 95th percentile
SES       Reported   Measured
Highest   1          1
Middle    1.8        1.7
Lowest    2.2        1.9
Height/weight reports
In this figure, the measured weight is 17 kg for a 51-month-old child who is 1.03 m tall. This child ranks at the 71st percentile if the child is a girl and at the 65th percentile if the child is a boy. If the mother reports the weight as being 2 kg less than the actual value, the child would be classified as being below the 15th percentile.
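The size of the shift in this example is easier to appreciate as BMI values. A minimal Python sketch, using the standard definition of BMI (weight in kg divided by height in metres squared) and the weight and height quoted on the slide; the percentile ranks themselves come from a growth reference that is not reproduced here, so the code does not attempt to compute them.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


height_m = 1.03
measured_bmi = bmi(17.0, height_m)        # measured weight from the slide
reported_bmi = bmi(17.0 - 2.0, height_m)  # mother reports 2 kg less

print(f"measured BMI: {measured_bmi:.2f}")
print(f"reported BMI: {reported_bmi:.2f}")
print(f"difference:   {measured_bmi - reported_bmi:.2f}")
```

A 2 kg reporting error in a child this size changes BMI by almost 2 units, which is consistent with the large drop in percentile rank described on the slide.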
Height/weight report: findings
Heights were reported more accurately than weights (there was no difference in the means of reported vs. measured heights)
A greater proportion of mothers overestimated boys’ weights; a greater proportion of lower SES mothers misreported
12% of the children were classified as overweight based on the reported data; 9% were classified as overweight using measured data
Reported data therefore overestimated the prevalence of overweight in this population by 3 percentage points
Special biases
Non-respondent bias: Persons who do not participate in a particular study may be different from those who do
e.g., in telephone surveys, women are more likely to answer surveys than are men; if the exposure of interest is differentially distributed between women and men, and if gender is associated with the outcome of interest, bias will result
Other special biases
Unmasking (detection signal) bias
Membership bias
Diagnostic suspicion bias
Exposure suspicion bias
Recall bias
Family information bias
Neyman bias
Berkson bias
etc.
Evaluating bias
1. Why did it occur?
2. What effect does it have on the observed association?
3. What can be done to control for bias in this study, and to prevent it in future studies?
Preventing bias
Careful attention to sampling
Minimize non-response
Standardization of measurements
Training and quality control
Blinding