Pharmacoeconomics & Health Outcomes
Stats Overview (Seminar Survival Series….) Leon E. Cosler, R.Ph., Ph.D. Associate Professor of Pharmacoeconomics Albany College of Pharmacy
Statistical Road Map
1. Descriptive statistics (overview)
   • the "middle" and the variation
   • DI notes… > everything you need to know!
2. Inferential statistics
   • hypothesis testing and errors
3. Sample size calculations
4. Techniques specific to economic studies
Research Methods: ECHO model
1. Clinical
   • "Intermediate" clinical indicators versus long-term clinical outcomes
   • Analytical methods differ

   Dx         Intermediate Clinical Indicator           Clinical Outcome (long term)
   --------   ---------------------------------------   ----------------------------------
   HTN        Improved control of BP;                   Fewer deaths due to MI
              improved control of serum cholesterol
   Diabetes   Improved control of blood glucose;        Reduced incidence of ESRD,
              improved Hgb A1c                          neuropathies, etc.
   HIV        Improved CD4 counts;                      Reduced incidence of OIs;
              reduced viral load counts                 improved survival times
Research Methods: ECHO model
2. Economic
   - all relevant cost categories
   - indirect costs include ability to work
3. Humanistic
   - Health-related quality of life (HRQOL)
   - Patient satisfaction
   - (Valuing these is difficult & contentious)
   • more on this later…
Scales of Measurement

• Four levels of measurement

  Discrete variables:
  - Nominal variable → no implied rank or order.
    • Example: presence or absence of a disease
  - Ordinal variable → an implied order or rank.
    • Example: pain scale (categories)

  Continuous variables:
  - Interval variable → defined units of measurement.
    • There is an equal distance, or interval, between values.
    • Example: temperature
  - Ratio variable → defined units of measurement.
    • Same as interval, but with an absolute zero.
    • Example: number of cigarettes smoked per day

• The type of variable is a determining factor in selecting the appropriate statistical test.
Descriptive Statistics

• What does the data look like?
  - Where is the middle?
  - How spread out is it?

Measure of central tendency

• Graphing the data is often a good first step
• Several measures of "central tendency" (a short sketch follows):
  1. Arithmetic mean = $\bar{x} = \frac{\sum x_i}{N}$
  2. Median = 50% of observations above & below
  3. Mode = most commonly occurring value
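As a quick illustration of these three measures, here is a minimal Python sketch using only the standard library; the expenditure values are made up for this example:

```python
from statistics import mean, median, mode

# Hypothetical annual drug expenditures for 7 patients ($)
costs = [1200, 1300, 1300, 1500, 1800, 2500, 9000]

print(mean(costs))    # arithmetic mean = sum(x_i) / N  (about $2,657)
print(median(costs))  # 50% of observations above & below ($1,500)
print(mode(costs))    # most commonly occurring value ($1,300)
```

Note that the single large value pulls the mean well above the median, which previews the skewness issue discussed in the next slides.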
Measure of central tendency

• With a "perfect" normal distribution, mean = median = mode!

[Figure: normal distribution curve — source: www.snr.missouri.edu/natr211/examples/sample1.png]
Measure of central tendency

• Nothing's normal about the normal distribution
• Many types of data aren't normal
• Clue: if the mean & median are different…
  • you have a problem…
  • we have already seen an example of this
Income Distribution in the U.S.

[Bar chart: US Household Income, 2004. Percent of US households by income bracket, from < $25,000 up to $100,000+. Median = $44,389; Mean = $70,402.]

Source: http://pubdb3.census.gov/macro/032005/faminc/new07_000.htm
A "skewed" distribution

[Figure: right-skewed distribution with the mode, median, and mean marked; mode < median < mean]
Measures of Variability

• Range
  - Difference between the highest and lowest data values
  - Ordinal, interval, and ratio data

• Inter-quartile range
  - Data values between the 25th and 75th percentiles
    • Directly related to the median
  - Less likely to be affected by extreme values in the data
  - Ordinal, interval, and ratio data

• Standard deviation
  - Measure of the average amount by which each observation in a series of data points differs from the mean
Variation: How spread out is the data?

• Variation: e.g., the standard deviation

  $sd = \sqrt{\dfrac{N\sum x_i^{2} - \left(\sum x_i\right)^{2}}{N(N-1)}}$

• Ex: a study reports expenditures as mean ± sd: $9,105 ± $16,415 (a short numerical sketch follows)
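A minimal sketch of the computational formula above on hypothetical cost data; `statistics.stdev` returns the same sample (N−1) standard deviation, so it serves as a check:

```python
from math import sqrt
from statistics import stdev

# Hypothetical expenditures ($) for 6 patients
x = [2500, 4100, 5300, 7800, 9100, 25800]
N = len(x)

# Computational formula: sqrt( (N*sum(x^2) - (sum(x))^2) / (N*(N-1)) )
sd = sqrt((N * sum(v * v for v in x) - sum(x) ** 2) / (N * (N - 1)))

print(round(sd, 2))        # manual computational formula
print(round(stdev(x), 2))  # library sample standard deviation (same value)
```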
Variation

[Figure: normal curve — source: www.hemweb.com/library/images/curve.gif]
Inferential Statistics

• Examine associations between variables
  - does the intervention make a difference?
  - a statistically significant difference?
  - contrast to clinical significance…

• Start with the 'null hypothesis'
  - there is no difference!
  - designated H0
  - decision based on probabilities
  - sometimes we guess wrong…
Hypothesis Testing: Inferential Statistics

                                          "Reality"
  I decide:                   There is NO difference;      There IS a difference;
                              Ho is true                   Ho is false
  --------------------------  ---------------------------  ---------------------------
  There IS a difference;      Type I error;                Correct decision
  reject the Ho               prob. = alpha ("p")
  There is NO difference;     Correct decision             Type II error;
  do not reject the Ho                                     prob. = beta
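To show how this decision plays out in practice, here is a minimal sketch using SciPy and made-up blood-pressure data (the group sizes, effect size, and seed are all illustrative): the p-value from a two-sample t-test is compared against alpha to decide whether to reject H0.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical systolic BP reductions (mmHg) in treatment vs. control groups
treatment = rng.normal(loc=12, scale=8, size=40)
control = rng.normal(loc=8, scale=8, size=40)

alpha = 0.05
t_stat, p_value = stats.ttest_ind(treatment, control)

if p_value < alpha:
    print(f"p = {p_value:.3f} < {alpha}: reject H0 (declare a difference)")
else:
    print(f"p = {p_value:.3f} >= {alpha}: do not reject H0")
```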
HYPOTHESIS TESTING: THE MEANING OF ALPHA

• Alpha (α): the probability of making a type I error
  - accepting the research hypothesis when it is incorrect (a false-positive result)

• The probability of a type I error is usually set to 0.05

• When a statistically significant difference is found between treatment groups at a significance level of 0.05 (P = 0.05), there is a 1 in 20 probability that it was a chance finding
HYPOTHESIS TESTING: THE MEANING OF BETA

• The probability of making a type II error = beta

• Type II errors are related to the power of the study

• Power is the ability to detect a difference if a difference actually exists
  - Power = 1 – beta

• By convention, β should be less than 0.20, and ideally less than 0.10, to minimize the chance of making a type II error
Graphical relationship of α & β errors
Sample Size Calculations

• One of the most important areas to critique when evaluating clinical studies is sample size

• Investigators should state how the sample size was determined for the study

• There is no magic number
  - the sample size was adequate if the number of patients who complete the study meets or exceeds the investigators' initial sample size calculation presented in the methods
Statistical Methods: Sample Sizes

• Sample size calculations: you will need
  - what level of "alpha" will you accept?
  - what level of power do you want?
  - what does your data look like?
    > the standard deviation?
    > a minimum difference to be able to detect
• There are tables already prepared
Statistical Methods: Sample Sizes

• for differences in means:

  $n = \dfrac{2\,sd^{2}\,Z_{\alpha/2}^{2}}{D^{2}}$

• for differences in proportions:

  $n = \dfrac{1}{4}\left(\dfrac{Z_{\alpha/2}}{d}\right)^{2}$
Detecting effects of different sizes
Statistical Methods: Sample Sizes

$n = \dfrac{1}{4}\left(\dfrac{Z_{\alpha/2}}{d}\right)^{2}$

• Ex: You want to be 95% sure of detecting a difference in cure rates between 2 drugs of at least 5%
  - alpha = 0.05, so alpha/2 = 0.025
  - Z(0.025) = 1.96

  $n = \dfrac{1}{4}\left(\dfrac{1.96}{0.05}\right)^{2} = 384.16$, or 385 Pts.
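The same arithmetic, reproduced as a minimal Python sketch; SciPy is used only to look up Z(0.025) = 1.96, and the variable names are mine:

```python
from math import ceil
from scipy.stats import norm

alpha = 0.05          # two-sided significance level
d = 0.05              # smallest difference in cure rates worth detecting

z = norm.ppf(1 - alpha / 2)     # Z(alpha/2) = 1.96
n = 0.25 * (z / d) ** 2         # n = (1/4) * (Z(alpha/2) / d)^2

print(z)         # 1.959964...
print(n)         # about 384.2
print(ceil(n))   # round up -> 385 patients
```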
Inappropriate Sample Size Risks

Sample size too small:
- False-negative results (type II error)
- Sample may not represent the population
- Overestimation of treatment effects

Sample size too large:
- Results may lack clinical significance
Descriptors of statistical significance

P VALUES

• Nothing magic about p < 0.05
  • controversies in interpretation…
• Statistical significance does not imply that the new treatment offers a real clinical advantage
• Statistical significance is related to sample size
• Confidence intervals can help the clinician assess clinical significance
Confidence Interval Interpretation:

• Range of values that includes the true value for the population parameter being measured
• 95% or 99% confidence interval
  - For differences: the CI should not include "0"
  - For hazard or risk ratios: the CI shouldn't include "1"
• Assist in making decisions concerning the clinical relevance of study data (a short sketch follows)
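A minimal sketch of that decision rule: an approximate 95% CI for a difference in cure proportions is computed from hypothetical counts (the trial numbers are made up) and checked against 0.

```python
from math import sqrt
from scipy.stats import norm

# Hypothetical trial: cures / patients in each arm
cured_a, n_a = 80, 100
cured_b, n_b = 65, 100

p_a, p_b = cured_a / n_a, cured_b / n_b
diff = p_a - p_b

# Normal-approximation standard error of the difference in proportions
se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
z = norm.ppf(0.975)                      # 1.96 for a 95% CI

lower, upper = diff - z * se, diff + z * se
print(f"difference = {diff:.2f}, 95% CI ({lower:.2f}, {upper:.2f})")

# For a difference, a CI that excludes 0 supports statistical significance
print("statistically significant" if lower > 0 or upper < 0 else "not significant")
```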
Advantages of Confidence Intervals

• Confidence intervals remind readers of data variability

• Often more accurately reflect the purpose of the study

• Confidence intervals make clear the role of sample size
  - Reflect the magnitude of the difference
  - Clinical vs. statistical significance
Overview of Statistical Tests

  Type of Data        Two Independent       Two Related Samples      Three or More            Three or More Related
                      Samples               (Paired/Matched)         Independent Samples      Samples (Paired/Matched)
  ------------------  --------------------  -----------------------  -----------------------  ------------------------
  Nominal             Chi-square            Chi-square (McNemar)     Chi-square               Chi-square (McNemar)
  Ordinal             Mann-Whitney U        Sign test;               Kruskal-Wallis           Friedman
                                            Wilcoxon signed ranks
  Interval or ratio   Parametric: t-test    Parametric:              Parametric: ANOVA        Parametric: repeated-
                      Non-parametric:       paired t-test            Non-parametric:          measures ANOVA
                      Mann-Whitney U        Non-parametric:          Kruskal-Wallis           Non-parametric:
                                            Wilcoxon signed ranks                             Friedman
Overview of Statistical Tests

Two Independent Samples
• Continuous (interval or ratio) outcome and nominal (binary) grouping
  - Parametric: t-test (Student's t-test) to compare means
  - Non-parametric: Mann-Whitney U to compare medians
• Nominal (binary) and nominal (binary)
  - Chi-square test to compare frequencies/proportions
  - Relative risk / odds ratio
• Ordinal outcome and nominal (binary) grouping
  - Mann-Whitney U test to compare medians

Two Related Samples
• Continuous (interval or ratio) outcome and nominal (binary) grouping
  - Parametric: paired t-test to compare means
  - Non-parametric: Wilcoxon signed ranks to compare medians
• Nominal (binary) and nominal (binary)
  - Chi-square test (McNemar) to compare frequencies/proportions
• Ordinal outcome and nominal (binary) grouping
  - Wilcoxon signed ranks to compare medians
Overview of Statistical Tests

Three or More Independent Samples
• Continuous (interval or ratio) outcome and nominal (>2 categories) grouping
  - Parametric: analysis of variance (ANOVA) to compare means
  - Non-parametric: Kruskal-Wallis to compare medians
• Nominal (binary) and nominal (>2 categories)
  - Chi-square test to compare frequencies/proportions
• Ordinal outcome and nominal (>2 categories) grouping
  - Kruskal-Wallis to compare medians

Three or More Related Samples
• Continuous (interval or ratio) outcome and nominal (>2 categories) grouping
  - Parametric: repeated-measures ANOVA to compare means
  - Non-parametric: Friedman to compare medians
• Nominal (binary) and nominal (>2 categories)
  - Chi-square test (McNemar) to compare frequencies/proportions
• Ordinal outcome and nominal (>2 categories) grouping
  - Friedman to compare medians
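To make a couple of cells from the tables above concrete, here is a minimal SciPy sketch on made-up data for two independent samples: a chi-square test for two binary variables and a Mann-Whitney U test for an ordinal outcome across two groups. The counts and pain scores are illustrative only.

```python
from scipy.stats import chi2_contingency, mannwhitneyu

# Nominal vs. nominal (binary): 2x2 table of cured / not cured by drug
table = [[40, 10],   # drug A: cured, not cured
         [28, 22]]   # drug B: cured, not cured
chi2, p_chi, dof, expected = chi2_contingency(table)
print(f"chi-square p = {p_chi:.3f}")

# Ordinal vs. nominal (binary): pain scores (0-10) in two independent groups
pain_a = [2, 3, 3, 4, 5, 5, 6]
pain_b = [4, 5, 6, 6, 7, 8, 8]
u_stat, p_mw = mannwhitneyu(pain_a, pain_b)
print(f"Mann-Whitney U p = {p_mw:.3f}")
```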
Data Analysis Issues
Types of Blinding

• Single-blind: either subjects or investigators are unaware of the assignment of subjects to active or control groups

• Double-blind: both subjects and investigators are unaware of the assignment of subjects to active or control groups

• Triple-blind: both subjects and investigators are unaware of the assignment of subjects to active or control groups; another group involved with interpretation of the data is also unaware of subject assignment
INTENTION TO TREAT ANALYSIS

• Data for all subjects who qualify are analyzed…
  - expecting some Pts will not finish…

• Imputation of outcomes:
  - Pt's last measurement (last observation carried forward, LOCF; see the sketch below)
  - Average score or measurement for the entire group
  - Multivariate analysis to predict the most likely outcome

• Significant loss to follow-up negatively affects an intention-to-treat analysis.
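A minimal pandas sketch of the LOCF idea mentioned above, using made-up HbA1c visit data: each patient's missing later measurements are filled with that patient's own last recorded value before the final-visit mean is computed.

```python
import numpy as np
import pandas as pd

# Hypothetical HbA1c measurements; NaN = visit missed after dropout
df = pd.DataFrame({
    "patient": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "visit":   [1, 2, 3, 1, 2, 3, 1, 2, 3],
    "hba1c":   [8.1, 7.6, 7.2, 9.0, 8.4, np.nan, 8.5, np.nan, np.nan],
})

# LOCF: carry each patient's last observed value forward
df["hba1c_locf"] = df.groupby("patient")["hba1c"].ffill()

print(df)
print("mean at final visit (LOCF):",
      df.loc[df["visit"] == 3, "hba1c_locf"].mean())
```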
PER PROTOCOL ANALYSIS

• Only Pts who finish the protocol are analyzed…

• Problems with the "per protocol efficacy" analysis:
  - The estimate of treatment effect is likely to be flawed (i.e., overestimated), since reasons for non-adherence to protocol may be related to patients' prognosis
  - Empirical evidence suggests that participants who adhere tend to do better than those who do not, even after adjustment for all known prognostic factors, and irrespective of assignment to treatment or placebo
  - Excluding non-adherent participants from the analysis leaves those who may be destined to have a better outcome, and destroys the unbiased comparison afforded by randomization
Confounding Variables

Methods of Coping with Confounding

• Design phase
  - Specification
  - Matching

• Analysis phase
  - Stratification
  - Multivariate adjustment
Statistical Methods: Linear Regression

$Y_i = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_n X_n + e$

[Figure: plot of the dependent variable "Y" against the independent variable(s) "X"]
Statistical Methods: Linear Regression

• Key assumptions:
  • the "independent" variables really are independent!
  • the relationship is linear, not curved
  • the variables are normally distributed

• Key output (see the sketch below):
  • "Beta weights" or parameter estimates
  • Confidence intervals
  • R² value: % of variation explained (higher is better)

• Sub-types: logistic regression
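A minimal sketch of an ordinary least squares fit on simulated data, using statsmodels; the variable names, coefficients, and noise level are all invented for illustration. It prints the same kinds of output named above: parameter estimates ("beta weights"), confidence intervals, and R².

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200

# Simulated predictors and a cost outcome: Y = 5000 + 3000*X1 - 1500*X2 + noise
x1 = rng.binomial(1, 0.5, n)          # e.g., drug exposure (0/1)
x2 = rng.normal(0, 1, n)              # e.g., a severity score
y = 5000 + 3000 * x1 - 1500 * x2 + rng.normal(0, 2000, n)

X = sm.add_constant(np.column_stack([x1, x2]))
fit = sm.OLS(y, X).fit()

print(fit.params)       # beta weights (intercept, X1, X2)
print(fit.conf_int())   # 95% confidence intervals
print(fit.rsquared)     # proportion of variation explained
```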
Statistical Methods: Linear Regression

Regression output predicting total cost for 1-year treatment with Flolan (epoprostenol). PARTIAL OUTPUT!

  Variable                    Parameter Estimate ($)   Sig.       95% CI
  --------------------------  -----------------------  ---------  ------------------
  Intercept                   11,644                   p < 0.10   (-392, 23680)
  Epoprostenol                5,159                    p < 0.05   (959, 9359)
  NYHA class IV               1,176                    ns         (-3434, 5786)
  Employed                    -12,363                  ns         (-28110, 3384)
  Disabled                    -15,047                  p < 0.05   (-26740, -3354)
  Retired                     -11,572                  p < 0.05   (-23128, -16)
  Rales at entry              -364                     ns         (-4992, 4264)
  Creatinine > 2.0 at entry   2,813                    ns         (-1499, 7125)
  Diabetes at entry           755                      ns         (-4084, 5594)

R² = 0.13

Source: Schulman et al. Results of the Economic Evaluation of the F.I.R.S.T. International Journal of Technology Assessment, 12:4 (1996), 698-713.
Techniques for Economic Data

• Economic data are frequently skewed

• Transform the data
  • use log10(costs)
  • then use parametric tests

• Use 'special' techniques
  • non-parametric tests (e.g., Mann-Whitney)
  • 'bootstrapping'
    - taking multiple samples from the data
    - complex process; yields decent results
  (a short sketch of both approaches follows the figures below)
[Figures: total cost of asthma discharges; LOG(total cost)]
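A minimal sketch of the two approaches named above on simulated skewed cost data (the lognormal parameters, group sizes, and number of resamples are illustrative): a log10 transform followed by a parametric test, and a simple nonparametric bootstrap of the difference in mean costs.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Simulated right-skewed costs (lognormal), two treatment groups
costs_a = rng.lognormal(mean=9.0, sigma=1.0, size=150)
costs_b = rng.lognormal(mean=9.3, sigma=1.0, size=150)

# Option 1: transform, then use a parametric test on log10(costs)
t_stat, p_log = stats.ttest_ind(np.log10(costs_a), np.log10(costs_b))
print(f"t-test on log10(costs): p = {p_log:.3f}")

# Option 2: bootstrap the difference in mean cost (resample with replacement)
boot_diffs = [
    rng.choice(costs_b, size=costs_b.size, replace=True).mean()
    - rng.choice(costs_a, size=costs_a.size, replace=True).mean()
    for _ in range(2000)
]
lower, upper = np.percentile(boot_diffs, [2.5, 97.5])
print(f"bootstrap 95% CI for difference in mean cost: ({lower:.0f}, {upper:.0f})")
```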
Things to Remember

• The wrong test can give the wrong results!
• Statistical significance ≠ clinical significance
• Significant associations may not always reflect a cause-and-effect relationship
That’s all for today… !