EPID 600; Class 4 Measures of association University of Michigan School of Public Health
Three key dimensions to epidemiologic studies
Measures of association
  Relative measures (relative risks, rates, and odds)
  Absolute measures (risk and rate differences)
Study design
  Observational
    Cohort
    Case-control
    Cross-sectional
  Experimental
    Randomized trial
    Field trials
    Group randomized trials
Units of analysis
  Individual
  Group
Measurement of association
Epidemiologic studies strive to determine the difference in measures of disease occurrence between populations.
Populations are typically considered as “exposed” vs. “unexposed”, and measures of association then seek to define an association between the “exposure” and the disease “outcome” of interest.
Measures of association reflect statistical relations between variables. They are not measures of “effect”, which are unobservable counterfactual contrasts, but they are the best we can do.
[Figure] The world: persons “exposed” and persons “unexposed”
[Figure] The epidemiologic study: persons “exposed” and persons “unexposed”
[Figure] The epidemiologic study: persons “exposed” with disease and persons “unexposed” with disease
Reminder...prevalence (proportion)
$$\text{Prevalence} = \frac{\text{Number of cases}}{\text{Number of persons in population}} \quad \text{at a specified time}$$
Prevalence ratio
$$\text{Prevalence ratio} = \frac{\text{Prevalence}_{\text{exposed}}}{\text{Prevalence}_{\text{unexposed}}}$$
The prevalence ratio is uncommonly used in epidemiology because of the limitations of prevalence (which reflects both incidence and duration of disease) discussed in class 3.
Reminder...risk (incidence proportion)
The probability that a person will develop a given disease
$$\text{Risk} = \frac{\text{Number of new cases of disease}}{\text{Number of persons followed}} \quad \text{over a time period}$$
Relative risk (risk ratio)
The ratio of risks for two populations
$$RR = \frac{R_{\text{exposed}}}{R_{\text{unexposed}}}$$
Ranges from 0 to +∞, has no units
Risk difference
The additional risk among those exposed when compared to those unexposed
$$RD = R_{\text{exposed}} - R_{\text{unexposed}}$$
Ranges from -1 to +1, has no units
Reminder...incidence rate
$$\text{Incidence rate} = \frac{\text{Number of new cases}}{\text{Total time at risk of persons followed}}$$
Relative rate (incidence rate ratio)
The ratio of rates for two populations
$$IRR = \frac{IR_{\text{exposed}}}{IR_{\text{unexposed}}}$$
Ranges from 0 to +∞, has no units
Rate difference
The additional incidence rate comparing those exposed vs. those unexposed
$$IRD = IR_{\text{exposed}} - IR_{\text{unexposed}}$$
Ranges from -∞ to +∞, has units of time$^{-1}$
GI infection: what are the causes?
Bacterial gastrointestinal infections cause considerable morbidity even in industrialized countries.
We’ve figured out that certain microbes produce illness in certain people, but what beyond that?
Who gets those microbes?
What determines who gets symptomatic GI infection?
We start by looking for associations between the illness and factors of interest.
Simonsen, Frisch, and Ethelberg. Socioeconomic Risk Factors for Bacterial Gastrointestinal Infections. Epidemiology. 2008; 19(2):282-290
GI infection and SES: an association?
Little is known about socioeconomic factors affecting the risk of infection in industrialized settings.
A group in Denmark got curious…
What did they do? They linked 3 national registries and followed the entire population of Denmark (5.3 million people) from 1993 to 2004 to track GI infection.
GI infection and SES
RESEARCH PROCESS (with DATA SOURCE)
1. Identify a cohort of interest: Danish Civil Registration System
2. Find information on each individual’s SES: Integrated Database for Longitudinal Labor Market Research
3. Obtain information on their disease patterns: National Registry of Enteric Pathogens
4. Create extended 2x2 tables and do an analysis
GI infection and SES
These data provide evidence that higher SES is associated with Campylobacter infection.

Income            Cases   Person-years (1000s)   Adjusted risk ratio
<100,000          6487    13,490                 0.93
100,000-199,000   9718    21,604                 1.00
200,000-299,999   5507    11,051                 1.10
300,000-399,999   1190    2165                   1.28
>400,000          639     1068                   1.51

We compare the risk of each income bracket to the median bracket (the reference category, with a ratio of 1.00).
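As a rough check on the income table, crude incidence rates can be computed directly from the cases and person-years columns. The sketch below assumes the person-years column is in thousands, as labeled; the function and variable names are illustrative only. These crude ratios are unadjusted, so they need not match the adjusted ratios reported by the authors.

```python
# Cases and person-years (in 1000s) copied from the income table.
brackets = [
    ("<100,000", 6487, 13490),
    ("100,000-199,000", 9718, 21604),
    ("200,000-299,999", 5507, 11051),
    ("300,000-399,999", 1190, 2165),
    (">400,000", 639, 1068),
]
REFERENCE = "100,000-199,000"  # the median bracket used as reference


def crude_rate(cases, person_years_thousands):
    """Crude incidence rate per 1000 person-years."""
    return cases / person_years_thousands


def crude_rate_ratios(rows, reference):
    """Crude rate ratio of each bracket relative to the reference bracket."""
    rates = {label: crude_rate(cases, py) for label, cases, py in rows}
    ref_rate = rates[reference]
    return {label: rate / ref_rate for label, rate in rates.items()}


ratios = crude_rate_ratios(brackets, REFERENCE)
for label, ratio in ratios.items():
    print(f"{label:>16}: crude rate ratio = {ratio:.2f}")
```

The crude ratios rise with income in the upper brackets, in the same direction as the adjusted estimates in the table.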
Reminder...odds
$$\text{odds} = \frac{p}{1 - p}$$
where p is the probability, or risk
Relative odds (odds ratio)
$$OR = \frac{\text{odds}_{\text{exposed}}}{\text{odds}_{\text{unexposed}}} = \frac{p_{\text{exposed}} / (1 - p_{\text{exposed}})}{p_{\text{unexposed}} / (1 - p_{\text{unexposed}})}$$
Absolute vs. relative scales
The two types of effect measures we have articulated here are on an absolute scale (i.e., subtraction) and on a relative scale (i.e., division).
In epidemiology we may be interested in both.
Absolute differences tell us the increase (or decrease) in effect.
Relative differences tell us the relative increase or decrease in effect comparing one quantity to another.
Absence of an effect in the absolute scale
If there is no effect on an absolute scale, the Risk Difference (RD) or the Rate Difference (IRD) is equal to 0.
That is, there is no increased risk or increased rate of disease among exposed compared to unexposed.
Therefore, on an absolute scale, the “null” is 0.
The relative effect on a relative scale
The relative effect is equivalent to the proportional change in absolute effect among exposed compared to unexposed (e.g., if the original amount is x and the new amount is y, the proportional increase is (y − x)/x).
$$\text{Relative effect} = \frac{\text{Risk difference}}{\text{Risk in unexposed}}$$
Therefore...
$$\text{Relative effect} = \frac{R_{\text{exposed}} - R_{\text{unexposed}}}{R_{\text{unexposed}}} = \frac{R_{\text{exposed}}}{R_{\text{unexposed}}} - \frac{R_{\text{unexposed}}}{R_{\text{unexposed}}} = RR - 1$$
Implications
When we talk about greater population risk of a particular outcome among exposed compared to unexposed, we should be using RR-1, not RR.
Typically, we present RR.
So, if RR=3, relative effect = 3-1 = 2.
So, if RR=3 we say there is a 200% increase in risk of disease among exposed compared to unexposed.
So, NO EFFECT is 0, i.e., RR-1=0, i.e., RR=1.
RR=1 is then the “null”.
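The RR-to-percent-change wording can be sketched in a few lines. The helper names below (`relative_effect`, `describe`) are hypothetical and chosen only for illustration; the RR values other than 3 are made up.

```python
def relative_effect(rr):
    """Relative effect = RR - 1 (proportional change in risk)."""
    return rr - 1.0


def describe(rr):
    """Phrase an RR as a percent change among exposed vs. unexposed."""
    pct = relative_effect(rr) * 100
    if pct > 0:
        return f"RR={rr}: {pct:.0f}% increase in risk"
    if pct < 0:
        return f"RR={rr}: {-pct:.0f}% decrease in risk"
    return f"RR={rr}: no difference in risk (the null)"


# RR=3 is the value used on the slide; 1 and 0.75 are illustrative.
for rr in (3, 1, 0.75):
    print(describe(rr))
```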
Key way to see through this All these formulas are related to one another in relatively simple ways that rest on understanding (not memorizing) what they mean and where they come from
Reminder...risk and incidence rate
Risk = Incidence rate x time....therefore...
$$RR = \frac{R_{\text{exposed}}}{R_{\text{unexposed}}} = \frac{IR_{\text{exposed}} \times \text{time}}{IR_{\text{unexposed}} \times \text{time}} = \frac{IR_{\text{exposed}}}{IR_{\text{unexposed}}} = IRR$$
if....the time period is sufficiently comparable among the exposed group and the unexposed group; typically this is the case if the time period is short
remember...we had said that R = IR*t when R is low
therefore...RR is a reasonable approximation for IRR when risk is low and the time period of observation is short
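The approximation R = IR*t, and with it RR ≈ IRR, can be checked numerically. The sketch below assumes a constant incidence rate in a closed cohort, in which case risk over time t is 1 − exp(−IR × t); that exponential formula is a standard result brought in here for illustration and is not stated on the slides. The rates and follow-up times are hypothetical.

```python
import math


def risk_exact(rate, t):
    """Risk over time t, assuming a constant incidence rate (assumption)."""
    return 1.0 - math.exp(-rate * t)


def risk_approx(rate, t):
    """Slide approximation: R = IR * t, reasonable when R is low."""
    return rate * t


# Hypothetical rates (per person-year) for exposed and unexposed groups.
ir_exposed, ir_unexposed = 0.02, 0.01
irr = ir_exposed / ir_unexposed

for t in (1, 10, 50):
    rr = risk_exact(ir_exposed, t) / risk_exact(ir_unexposed, t)
    print(f"t={t:>2} years: RR={rr:.3f} vs IRR={irr:.3f}")
```

With short follow-up the two ratios nearly coincide; as follow-up lengthens and risk grows, RR drifts toward 1 while IRR stays fixed.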
Epidemiologic confusion
Sometimes epidemiologists use the term “relative risk” to refer to either the risk ratio or the incidence rate ratio, assuming the two are equivalent.
This is obviously wrong; please do not do that.
[Figure] The world: persons “exposed” and persons “unexposed”
[Figure] The epidemiologic study: persons “exposed” and persons “unexposed”
[Figure] The epidemiologic study: persons “exposed” with disease and persons “unexposed” with disease
The “2x2” table

              Disease   No disease   Total
Exposed       a         b            a+b
Not exposed   c         d            c+d
Total         a+c       b+d          a+b+c+d
Relative risk, i.e., risk ratio
$$R_{\text{exposed}} = \frac{a}{a+b}, \qquad R_{\text{unexposed}} = \frac{c}{c+d}$$
$$RR = \frac{a/(a+b)}{c/(c+d)}$$
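Relative odds, i.e., odds ratio
$$\text{Odds}_{\text{exposed}} = \frac{P_{\text{exposed}}}{1 - P_{\text{exposed}}} = \frac{a/(a+b)}{1 - a/(a+b)} = \frac{a/(a+b)}{(a+b-a)/(a+b)} = \frac{a/(a+b)}{b/(a+b)} = \frac{a}{b}$$
$$\text{Odds}_{\text{unexposed}} = \frac{P_{\text{unexposed}}}{1 - P_{\text{unexposed}}} = \frac{c/(c+d)}{1 - c/(c+d)} = \frac{c/(c+d)}{(c+d-c)/(c+d)} = \frac{c/(c+d)}{d/(c+d)} = \frac{c}{d}$$
$$OR = \frac{\text{Odds}_{\text{exposed}}}{\text{Odds}_{\text{unexposed}}} = \frac{a/b}{c/d} = \frac{a \times d}{b \times c}$$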
Example
In a particular study, out of 100 exposed persons, 20 develop disease; out of 200 unexposed, 25 develop disease.
Fill in the 2x2 table in steps: first the row totals (100 and 200), then the diseased counts (a = 20, c = 25), then the non-diseased counts by subtraction (b = 80, d = 175).

              Disease   No disease   Total
Exposed       20        80           100
Not exposed   25        175          200
Total         45        255          300

$$RR = \frac{20/100}{25/200} = 1.60$$
$$OR = \frac{20 \times 175}{25 \times 80} = 1.75$$
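The same arithmetic can be written as small functions of the 2x2 cell counts. This is a minimal sketch; the function names are illustrative, and the risk difference is added only to show the absolute-scale counterpart for the same table.

```python
def risk_ratio(a, b, c, d):
    """RR = [a / (a + b)] / [c / (c + d)]."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """OR = (a * d) / (b * c)."""
    return (a * d) / (b * c)


def risk_difference(a, b, c, d):
    """RD = a / (a + b) - c / (c + d)."""
    return a / (a + b) - c / (c + d)


# Cell counts from the example: 20 of 100 exposed and 25 of 200 unexposed.
a, b, c, d = 20, 80, 25, 175

print(f"RR = {risk_ratio(a, b, c, d):.2f}")       # RR = 1.60
print(f"OR = {odds_ratio(a, b, c, d):.2f}")       # OR = 1.75
print(f"RD = {risk_difference(a, b, c, d):.3f}")  # RD = 0.075
```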
Going back to an example
[Figure: follow-up of 10 persons (P1 to P10) across 20 time units (T1 to T20); TT is the total time contributed by each person]

Person   TT
P1       14
P2       20
P3       11
P4       11
P5       20
P6       20
P7       10
P8       20
P9       2
P10      10
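Cohort approach
$$IRR = \frac{IR_{\text{exposed}}}{IR_{\text{unexposed}}} = \frac{2/(14+20+10+2)}{1/(20+11+11+20+20+10)} = \frac{2/46}{1/92} = 4.0$$
$$RR = \frac{R_{\text{exposed}}}{R_{\text{unexposed}}} = \frac{2/4}{1/6} = 3.0$$
$$OR = \frac{p_{\text{exposed}}/(1 - p_{\text{exposed}})}{p_{\text{unexposed}}/(1 - p_{\text{unexposed}})} = \frac{0.5/(1 - 0.5)}{0.167/(1 - 0.167)} = 5.0$$
also can be calculated as
$$OR = \frac{2 \times 5}{1 \times 2} = 5.0$$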
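The three cohort measures can be reproduced from the person-time and case counts used in the calculation. The sketch below takes the groupings straight from the slide arithmetic (4 exposed persons with 2 cases, 6 unexposed persons with 1 case); the variable names are illustrative.

```python
# Person-time contributed by each person, grouped as in the slide calculation.
exposed_time = [14, 20, 10, 2]
unexposed_time = [20, 11, 11, 20, 20, 10]
exposed_cases, unexposed_cases = 2, 1

# Incidence rate ratio: cases divided by total person-time in each group.
ir_exposed = exposed_cases / sum(exposed_time)        # 2 / 46
ir_unexposed = unexposed_cases / sum(unexposed_time)  # 1 / 92
irr = ir_exposed / ir_unexposed

# Risk ratio: cases divided by number of persons in each group.
risk_exposed = exposed_cases / len(exposed_time)        # 2 / 4
risk_unexposed = unexposed_cases / len(unexposed_time)  # 1 / 6
rr = risk_exposed / risk_unexposed

# Odds ratio: odds = p / (1 - p) in each group.
odds_exposed = risk_exposed / (1 - risk_exposed)
odds_unexposed = risk_unexposed / (1 - risk_unexposed)
odds_ratio = odds_exposed / odds_unexposed

print(f"IRR = {irr:.1f}, RR = {rr:.1f}, OR = {odds_ratio:.1f}")
```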
Notes
As in this example, OR is greater than RR when OR and RR are > 1.
OR approximates RR when disease is rare (<1% typically).
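Why? (first premise)
Using the 2x2 table cells a, b, c, d:
$$RR = \frac{a/(a+b)}{c/(c+d)}$$
$\frac{a}{a+b}$ is always < $\frac{a}{b}$
$\frac{c}{c+d}$ is always < $\frac{c}{d}$
and if $\frac{a/(a+b)}{c/(c+d)} > 1$, then $\frac{a}{a+b} > \frac{c}{c+d}$, and
$$\frac{a/b}{c/d} > \frac{a/(a+b)}{c/(c+d)}$$
that is, OR > RR. This follows because OR/RR = (1 − R_unexposed)/(1 − R_exposed), which is greater than 1 whenever R_exposed > R_unexposed.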
Why? (second premise)
Using the 2x2 table cells a, b, c, d:
$$RR = \frac{a/(a+b)}{c/(c+d)}$$
if disease is rare, then $a + b \cong b$ and $c + d \cong d$
$$\text{therefore, } RR \cong \frac{a/b}{c/d} = \frac{a \times d}{b \times c} = OR$$
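Both premises can be seen numerically by holding RR fixed and letting the disease become more common. The sketch below uses hypothetical baseline risks and a hypothetical RR of 2; the function name is illustrative.

```python
def rr_and_or(risk_exposed, risk_unexposed):
    """Return (RR, OR) for a pair of risks."""
    rr = risk_exposed / risk_unexposed
    odds_exposed = risk_exposed / (1 - risk_exposed)
    odds_unexposed = risk_unexposed / (1 - risk_unexposed)
    return rr, odds_exposed / odds_unexposed


# Hypothetical baseline risks in the unexposed; exposed risk is twice as high.
for baseline in (0.001, 0.01, 0.1, 0.3):
    rr, odds_ratio = rr_and_or(2 * baseline, baseline)
    print(f"risk in unexposed = {baseline}: RR = {rr:.2f}, OR = {odds_ratio:.2f}")
```

With a baseline risk of 0.1% the OR is almost identical to the RR; at 30% the OR is well above the RR, although the RR itself is unchanged.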
Some notes about terminology...
For OR, RR, and IRR, if the value is > 1 we typically say that there is a “positive association”, 1 is no association, and < 1 is a “negative association”.
Of course, interpretation fully depends on what is “exposed” and what is “non-exposed”.
Remember...the “null” is 1 for relative measures of association and 0 for absolute measures; hence “away from” or “towards” the null.
The “2x2” table involving time

              Disease   Time
Exposed       a         T1
Not exposed   c         T0
Total         a+c       T1+T0
Incidence rate ratio
$$IR_{\text{exposed}} = \frac{a}{T_1}, \qquad IR_{\text{unexposed}} = \frac{c}{T_0}$$
$$IRR = \frac{a/T_1}{c/T_0}$$
Example
In a particular study, 20 smokers developed heart disease over 10,000 PY of follow-up, and 25 nonsmokers developed heart disease over 20,000 PY of follow-up.

              Disease   Time
Exposed       20        10,000
Not exposed   25        20,000
Total         45        30,000

$$IRR = \frac{20/10{,}000}{25/20{,}000} = 1.6$$
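A minimal sketch of the same calculation follows, with the rate difference added as the absolute-scale counterpart for the same data. The function name is illustrative.

```python
def incidence_rate(cases, person_time):
    """Incidence rate = new cases / total person-time at risk."""
    return cases / person_time


# Counts from the example: smokers are the exposed group.
ir_smokers = incidence_rate(20, 10_000)     # per person-year
ir_nonsmokers = incidence_rate(25, 20_000)  # per person-year

irr = ir_smokers / ir_nonsmokers
ird = ir_smokers - ir_nonsmokers

print(f"IRR = {irr:.1f}")                         # IRR = 1.6
print(f"IRD = {ird * 10_000:.1f} per 10,000 PY")  # IRD = 7.5 per 10,000 PY
```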
Attributable fraction among exposed
Proportion of the disease burden among exposed people that is due to the exposure
$$AF_{\text{exposed}} = \frac{R_{\text{exposed}} - R_{\text{unexposed}}}{R_{\text{exposed}}}$$
And...
$$AF_{\text{exposed}} = \frac{R_{\text{exposed}} - R_{\text{unexposed}}}{R_{\text{exposed}}} = \frac{R_{\text{exposed}}}{R_{\text{exposed}}} - \frac{R_{\text{unexposed}}}{R_{\text{exposed}}} = 1 - \frac{1}{RR} = \frac{RR - 1}{RR}$$
so....if RD is the R among exposed minus the R among unexposed, then dividing RD by the R among exposed gives the proportion of effect among exposed that is due to exposure
this is often interpreted as the proportion of disease cases among exposed that would be removed if there were no longer any exposure
note that among exposed, we do NOT remove ALL of the effect, even if exposure is no longer there
WHY?....clearly “exposure” is not the only cause
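The two forms of the attributable fraction among exposed can be checked against each other. The sketch below reuses the risks from the deck's 2x2 example (20 of 100 exposed and 25 of 200 unexposed) purely as an illustration; the function names are hypothetical.

```python
def af_exposed_from_risks(risk_exposed, risk_unexposed):
    """AF among exposed = (R_exposed - R_unexposed) / R_exposed."""
    return (risk_exposed - risk_unexposed) / risk_exposed


def af_exposed_from_rr(rr):
    """AF among exposed = (RR - 1) / RR."""
    return (rr - 1) / rr


risk_exposed = 20 / 100
risk_unexposed = 25 / 200
rr = risk_exposed / risk_unexposed

af_a = af_exposed_from_risks(risk_exposed, risk_unexposed)
af_b = af_exposed_from_rr(rr)
print(f"AF among exposed = {af_a:.3f} (from risks), {af_b:.3f} (from RR)")
```

Both routes give 0.375: under the usual interpretation, 37.5% of cases among the exposed would be removed without the exposure, and the remaining 62.5% would still occur.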
Attributable fraction in population
Proportion of the disease burden among the whole population that is due to the exposure
$$AF_{\text{population}} = \frac{R_{\text{population}} - R_{\text{unexposed}}}{R_{\text{population}}}$$
so....if subtracting the R among unexposed from the overall population R gives us the effect, then dividing this by the R among the population gives the proportion of effect among the population that is due to exposure
this is often interpreted as the proportion of disease cases in the population that would be removed if there were no longer any exposure
And...
$$AF_{\text{population}} = \frac{p \times (RR - 1)}{p \times (RR - 1) + 1}$$
where p is the prevalence of exposure in the whole population
so...if the population attributable fraction is 20%, then if exposure is removed, we would expect that disease would be reduced by 20% in the population
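The risk-based and prevalence-based forms of the population attributable fraction agree because the population risk is a weighted average, R_population = p × R_exposed + (1 − p) × R_unexposed. The sketch below checks this numerically, treating the deck's 2x2 example (100 exposed, 200 unexposed) as if it were the whole population, which is an assumption made only for illustration; the function names are hypothetical.

```python
def paf_from_risks(risk_population, risk_unexposed):
    """AF in population = (R_population - R_unexposed) / R_population."""
    return (risk_population - risk_unexposed) / risk_population


def paf_from_rr(p_exposed, rr):
    """AF in population = p(RR - 1) / [p(RR - 1) + 1]."""
    excess = p_exposed * (rr - 1)
    return excess / (excess + 1)


# Counts from the 2x2 example, treated here as the whole population.
exposed_cases, exposed_total = 20, 100
unexposed_cases, unexposed_total = 25, 200

p_exposed = exposed_total / (exposed_total + unexposed_total)
risk_exposed = exposed_cases / exposed_total
risk_unexposed = unexposed_cases / unexposed_total
risk_population = (exposed_cases + unexposed_cases) / (exposed_total + unexposed_total)
rr = risk_exposed / risk_unexposed

paf_a = paf_from_risks(risk_population, risk_unexposed)
paf_b = paf_from_rr(p_exposed, rr)
print(f"AF in population = {paf_a:.3f} (from risks), {paf_b:.3f} (from p and RR)")
```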