Gathering Useful Data
Principal Idea: Knowing how the data were generated is one of the key ingredients for translating data intelligently.
Description or Decision? Using Data Wisely
• Descriptive Statistics: using numerical and graphical summaries to characterize a data set.
• Inferential Statistics: using sample information to make conclusions about a broader range of individuals than just those observed.
The Fundamental Rule for Using Data for Inference
Available data can be used to make inferences about a much larger group if the data can be considered to be representative with regard to the question(s) of interest.
Example of Representative Sample: Do First Ladies Represent Other Women?
Past First Ladies are not likely to be representative of other American women, nor even future First Ladies, on the question of age at death, since medical, social, and political conditions keep changing in ways that may affect their health.
Example of Representative Sample: Do Penn State Students Represent Other College Students?
• If the question of interest is the average handspan of females in the college age range: yes.
• If the question of interest is the fastest speed at which they have ever driven a car: no, since Penn State is in a rural area with open spaces, county roads, and little traffic.
Populations, Samples, and Simple Random Samples
Population: the larger group of units about which inferences are to be made.
Sample: the smaller group of units actually measured.
Simple Random Sample: every conceivable group of units of the required size from the population has the same chance to be the selected sample. This helps ensure the sample data will be representative of the population, but such a sample can be difficult to obtain.
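The definition of a simple random sample can be sketched in a few lines of Python. This is only an illustration: the population of 1,000 ID numbers, the sample size of 10, and the helper name `simple_random_sample` are all assumptions made for the example, not part of the slides.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw a simple random sample of n units without replacement.

    random.Random.sample gives every possible group of n units
    the same chance of being the selected sample.
    """
    rng = random.Random(seed)  # the seed is only for reproducibility
    return rng.sample(population, n)

# Hypothetical population: ID numbers for 1,000 units.
population = list(range(1, 1001))

# The sample is the smaller group of units that would actually be measured.
sample = simple_random_sample(population, 10, seed=42)
print(sample)
```

In practice the hard part is not the draw itself but obtaining a complete list of the population to draw from, which is why simple random samples can be difficult to obtain.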
Types of Research Studies: Observational vs. Experimental
Observational Study: Researchers observe or question participants about opinions, behaviors, or outcomes. Participants are not asked to do anything differently. Two special cases: sample surveys and case-control studies.
Experiment: Researchers manipulate something and measure the effect of the manipulation on some outcome of interest. Randomized experiments: participants are randomly assigned to participate in one condition (called a treatment) or another. Sometimes an experiment cannot be conducted because of practical or ethical issues.
Who is Measured: Units, Subjects, Participants
Unit: a single individual or object being measured. In an experiment, it is called an experimental unit. When units are people, they are often called subjects or participants.
Roles Played by Variables – Measured or Not
An explanatory variable (or independent variable) is one that may explain or may cause differences in a response variable (or outcome or dependent variable).
A confounding variable is a variable that affects the response variable and also is related to the explanatory variable.
A potential confounding variable not measured in the study is called a lurking variable.
Example: Confounding Variables Lurk behind Lower Blood Pressure?
People who attended church regularly had lower blood pressure than those who stayed home. Possible confounding variables:
• Amount of social support
• Health status
• Age
• Attitude toward life
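A toy calculation may help show how a confounding variable can produce this kind of pattern. The records below are made up purely for illustration (they are not data from any study), and health status is picked as the confounder only because it appears in the list above. In this sketch, attendance has no effect on blood pressure at all, yet attenders still look better overall because more of them are in good health.

```python
from statistics import mean

# Made-up records for illustration only:
# (attends_regularly, health_status, systolic_bp)
records = (
    [(True, "good", 120)] * 8 + [(True, "poor", 140)] * 2
    + [(False, "good", 120)] * 2 + [(False, "poor", 140)] * 8
)

def mean_bp(rows, attends, health=None):
    """Mean blood pressure for one attendance group, optionally within one health status."""
    values = [bp for a, h, bp in rows
              if a == attends and (health is None or h == health)]
    return mean(values)

# Ignoring health status, non-attenders have higher average blood pressure.
overall_gap = mean_bp(records, False) - mean_bp(records, True)

# Comparing like with like (same health status), the gap vanishes.
gap_by_health = {
    h: mean_bp(records, False, h) - mean_bp(records, True, h)
    for h in ("good", "poor")
}

print(overall_gap, gap_by_health)
```

Here the whole 12-point overall gap comes from health status, which affects blood pressure and is also related to attendance. If health status had not been recorded, it would be a lurking variable.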
Designing a Good Experiment
Randomized experiments: often allow us to determine cause and effect.
Random assignment: used to make the groups approximately equal in all respects except for the explanatory variable.
Who Participates in Randomized Experiments?
Participants in randomized experiments are often volunteers. Remember the Fundamental Rule: Available data can be used to make inferences about a much larger group if the data can be considered to be representative with regard to the question(s) of interest.
Randomization: The Crucial Element
Randomizing the Type of Treatment: Randomly assigning the treatments to the experimental units keeps the researchers from making assignments favorable to their hypotheses and also helps protect against hidden or unknown biases.
Randomizing the Order of Treatments: If all treatments are applied to each unit, randomization should be used to determine the order in which they are applied.
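Both kinds of randomization can be sketched in Python. This is a minimal illustration under stated assumptions: the twelve unit labels, the three treatment names, and the helper functions `assign_treatments` and `randomize_order` are hypothetical and chosen only for the example.

```python
import random

def assign_treatments(units, treatments, seed=None):
    """Randomize the type of treatment: shuffle the units, then deal
    them out to the treatments so group sizes differ by at most one."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

def randomize_order(units, treatments, seed=None):
    """Randomize the order of treatments: every unit receives all
    treatments, each unit in its own random order."""
    rng = random.Random(seed)
    return {unit: rng.sample(treatments, len(treatments)) for unit in units}

# Hypothetical experimental units and treatments.
units = [f"unit{i}" for i in range(1, 13)]
treatments = ["A", "B", "C"]

groups = assign_treatments(units, treatments, seed=1)
orders = randomize_order(units, treatments, seed=1)
print(groups)
print(orders)
```

Because the computer, not the researcher, decides who gets which treatment (or which order), assignments cannot be steered toward a favored hypothesis.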
Case Study: Kids and Weight Lifting
Is weight training good for children? If so, is it better to lift heavy weights for few repetitions or moderate weights more times?
Randomized experiment involving 43 young volunteers. Three groups:
1 = heavy load
2 = moderate load
3 = control group
“Leg extension strength significantly increased in both exercise groups compared with that in the control subjects.” (Faigenbaum et al., 1999, p. e5)
Control Groups, Placebos, and Blinding
Control Groups: Treated identically in all respects except that they do not receive the active treatment. Sometimes they receive a dummy treatment or a standard/existing treatment.
Placebo: Looks like the real drug but has no active ingredient. Placebo effect = people respond to placebos.
Blinding: Single-blind = participants do not know which treatment they have received. Double-blind = neither the participant nor the researcher making measurements knows who had which treatment.
Double Dummy: Each group is given two “treatments”:
Group 1 = real treatment 1 and placebo treatment 2
Group 2 = placebo treatment 1 and real treatment 2
Designing a Good Observational Study
Disadvantage: it is more difficult to establish causal links.
Advantage: participants are more likely to be measured in their natural setting.
Types of Observational Studies
Retrospective: Participants are asked to recall past events.
Prospective: Participants are followed into the future and events are recorded.
Case-Control Studies: “Cases” who have a particular attribute or condition are compared to “controls” who do not, to see how they differ on an explanatory variable of interest. Advantages: efficiency, and reduction of potential confounding variables through careful choice of “controls”.
Difficulties and Disasters in Experiments and Observational Studies
Confounding Variables and the Implication of Causation in Observational Studies: A big misinterpretation is reporting a cause-and-effect relationship based on an observational study. If randomization is not used, there is no way to separate the role of confounding variables from the role of explanatory variables in producing the outcome variable.
Extending Results Inappropriately: Many studies use convenience samples or volunteers. Researchers need to assess whether the results can be extended to any larger group for the question of interest.
Difficulties and Disasters in Experiments and Observational Studies
Interacting Variables: A second variable can interact with the explanatory variable in its relationship with the outcome variable. Results should be reported taking the interaction into account.
Example: Interaction in Case Study 3.3. The difference between the nicotine and placebo patches is greater when there are no smokers in the home than when there are smokers in the home.
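One way to see what "interaction" means numerically is as a difference of differences. The quit rates below are made up for illustration and are NOT the figures from Case Study 3.3. They are chosen only so that the patch effect is larger in homes without smokers, matching the direction described above.

```python
# Made-up quit rates (percent) for illustration only;
# these are NOT the results reported in Case Study 3.3.
quit_rate = {
    ("no smokers at home", "nicotine"): 50,
    ("no smokers at home", "placebo"): 20,
    ("smokers at home", "nicotine"): 30,
    ("smokers at home", "placebo"): 20,
}

def patch_effect(home):
    """Nicotine-minus-placebo difference in quit rate for one home setting."""
    return quit_rate[(home, "nicotine")] - quit_rate[(home, "placebo")]

# The effect of the explanatory variable (patch type), reported separately
# for each level of the second variable (smokers in the home).
effects = {home: patch_effect(home)
           for home in ("no smokers at home", "smokers at home")}

# A nonzero difference of differences signals an interaction.
interaction = effects["no smokers at home"] - effects["smokers at home"]
print(effects, interaction)
```

Reporting a single pooled patch effect would hide the fact that it differs between the two kinds of homes, which is why results should be reported taking the interaction into account.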
Difficulties and Disasters in Experiments and Observational Studies
Hawthorne and Experimenter Bias: Hawthorne effect = participants in an experiment respond differently than they otherwise would, just because they are in the experiment. Many treatments have a higher success rate in clinical trials than in actual practice. Experimenter effects = recording data to match the desired outcome, treating subjects differently, etc. Most can be overcome by blinding and control groups.
Difficulties and Disasters in Experiments and Observational Studies
Ecological Validity and Generalizability: When variables have been removed from their natural setting and are measured in the laboratory or in some other artificial setting, the results may not reflect the impact of the variable in the real world.
Difficulties and Disasters in Experiments and Observational Studies
Relying on Memory or Secondhand Sources
• Can be a problem in retrospective observational studies.
• Try to use authoritative sources such as medical records rather than rely on memory.
• If possible, use prospective observational studies.