CORRELATIONAL RESEARCH NAJAH NADIAH BT. ALIAS M20082000089
RELATIONSHIP STUDIES
CORRELATIONAL RESEARCH
PREDICTION STUDIES
Correlational Research
Correlational research determines whether, and to what degree, variables are related.
The purposes of correlational research are:
i) to determine relationships between variables;
ii) to use those relationships to make predictions. Variables found not to be highly related to a criterion such as achievement are dropped from further examination, while variables that are highly related to it may be examined in causal-comparative or experimental studies to determine the nature of the relationship;
iii) to determine whether and how a set of variables is related.
Definition of Correlational Research
A limitation of correlational research is that it cannot establish cause and effect between variables.
The degree of relationship is expressed as a correlation coefficient.
If a relationship exists between two variables, scores within a certain range on one variable are associated with scores within a certain range on the other variable. E.g., there is a relationship between intelligence & academic achievement: persons who score highly on intelligence tend to have high grade point averages.
The Major Steps Involved in the Basic Correlational Research Process
• Problem selection
• Participant & instrument selection
• Design & procedure
• Data analysis & interpretation
1. Problem selection
• Variables to be correlated are selected on the basis of some rationale.
• The relationship to be investigated should be a logical one, suggested by theory or derived from experience.
• A theoretical or experiential basis is what makes the results meaningful to interpret.
• Avoid the shotgun or fishing approach, which is very inefficient and produces results that are difficult to interpret.
2. Participant & instrument selection
• The minimum acceptable sample size for a correlational study is 30 participants.
• Several factors influence the sample size needed.
• The sample can be smaller (but not fewer than 30) if the instruments measuring the variables have high validity & reliability.
• If validity & reliability are low, a larger sample is needed.
3. Design & procedure
• Collect data on two or more variables for each subject.
• Two or more scores are obtained for each member of the sample.
• The paired scores are then correlated.
• The result is expressed as a correlation coefficient.
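The procedure above (paired scores in, one coefficient out) can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `pearson_r` and the scores are hypothetical, and the sample is far smaller than the 30 participants recommended above so that the arithmetic stays readable.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation for two lists of paired scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: one (intelligence score, grade point average) pair per participant.
intelligence = [95, 100, 105, 110, 120, 130]
gpa = [2.4, 2.9, 2.8, 3.2, 3.5, 3.9]

r = pearson_r(intelligence, gpa)
print(f"r = {r:.2f}")
```

The sign of the result gives the direction of the relationship and its absolute value gives the size.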
4. Data Analysis and Interpretation
• A correlation coefficient indicates the size and direction of a relationship.
• It is a decimal number ranging from -1.00 through 0.00 to +1.00.
• A coefficient near +1.00 has a high size and a positive direction. E.g., a person with a high score on one variable is likely to have a high score on the other variable; an increase on one variable is associated with an increase on the other.
• A coefficient near 0.00 indicates that the variables are not related.
• A coefficient near -1.00 has a high size and a negative direction: a person with a high score on one variable is likely to have a low score on the other variable.
• Correlations near +1.00 and near -1.00 represent the same size of relationship.
• The (+) and (-) signs represent different directions of relationship.
Correlation Coefficient
-1.00 (strong negative) ... 0.00 (no relationship) ... +1.00 (strong positive)
Scatter plots (y against x): A Positive Correlation; A Negative Correlation; No Correlation (two panels)
How to interpret a correlation coefficient?
• A coefficient below ±0.35 indicates a low relationship or none.
• A coefficient between ±0.35 and ±0.65 indicates a moderate relationship.
• A coefficient higher than ±0.65 indicates a high relationship.
• For prediction, a coefficient much below ±0.50 is generally useless.
• Coefficients of ±0.60 or ±0.70 are usually considered adequate for group prediction purposes.
• Coefficients of ±0.80 and above are adequate for individual prediction purposes.
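The rules of thumb above can be written as a small lookup. This is only a sketch: the function names are hypothetical. The guidelines do not say how to treat coefficients between ±0.50 and ±0.60, so the 0.60 cut-off used here for group prediction is an assumption.

```python
def describe_strength(r):
    """Label the size of a relationship using the cut-offs listed above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1.00 and +1.00")
    size = abs(r)  # the sign only carries direction, not size
    if size < 0.35:
        return "low or not related"
    if size <= 0.65:
        return "moderately related"
    return "highly related"

def prediction_use(r):
    """Label the usefulness for prediction (0.60 cut-off is an assumption)."""
    size = abs(r)
    if size >= 0.80:
        return "individual prediction"
    if size >= 0.60:
        return "group prediction"
    return "not adequate for prediction by these guidelines"

for r in (0.90, -0.65, 0.35, -0.10):
    print(f"r = {r:+.2f}: {describe_strength(r)}; {prediction_use(r)}")
```

Because both functions work on the absolute value, r = -0.90 and r = +0.90 receive the same labels, which mirrors the point that the sign changes only the direction.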
Examples of Scatter Plots (panels): r = .90, r = .65, r = -.90, r = -.75, r = .35, r = -.50, r = .00, r = -.10
Common Variance
• Common variance is the extent to which variables vary in a systematic manner.
• It is interpreted as the percentage of variance in the criterion variable explained by the predictor variable.
• The squared correlation coefficient indicates the amount of common or shared variation between the variables.
• The higher the shared variation, the higher the correlation.
• E.g., a correlation coefficient of ±0.80 indicates that (0.80)² = 0.64, or 64%, of the variance in the criterion can be explained by the predictor.
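As a quick check of the arithmetic, the squared coefficient can be tabulated for a few values. This is a minimal sketch with a hypothetical helper name; the coefficients are illustrative, not results from any study.

```python
def common_variance(r):
    """Proportion of variance shared by two variables: the squared coefficient."""
    return r ** 2

for r in (0.80, -0.80, 0.65, 0.35):
    share = common_variance(r)
    print(f"r = {r:+.2f} -> shared variance = {share:.2f}")
```

Squaring removes the sign, so a coefficient of -0.80 implies the same shared variance as +0.80. Shared variance also falls off quickly as the coefficient shrinks, which is one reason low coefficients are of little use for prediction.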
Statistical Significance?
Statistical significance refers to:
- whether the obtained coefficient is really different from zero, and
- whether it reflects a true relationship rather than a chance one.
Statistical significance depends on the sample size:
- With small samples, a higher correlation is required for significance.
- With large samples, a lower correlation is sufficient for significance.
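The dependence on sample size can be illustrated by testing the same coefficient at several sample sizes. The sketch below uses the Fisher z approximation to obtain a two-tailed p-value. This technique is swapped in for convenience and is not taken from the slides; an exact test would use the t distribution. The function name and the values of r and n are hypothetical.

```python
import math
from statistics import NormalDist

def approx_p_value(r, n):
    """Approximate two-tailed p-value for H0: population correlation = 0.

    Uses the Fisher z transformation: atanh(r) * sqrt(n - 3) is treated
    as a standard normal deviate. This is an approximation.
    """
    if n <= 3:
        raise ValueError("the approximation needs more than 3 paired scores")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and +1")
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))

# The same moderate coefficient evaluated at three sample sizes.
for n in (10, 30, 100):
    print(f"r = 0.35, n = {n:3d}: p is about {approx_p_value(0.35, n):.3f}")
```

With the coefficient held constant, the p-value shrinks as the sample grows, so a coefficient that is not significant in a small sample can be significant in a large one.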
Circle Diagrams Illustrating Relationships Among Variables
RELATIONSHIP STUDIES
Purposes of relationship studies:
• Suggest subsequent examination using causal-comparative and experimental studies to determine whether there is a causal connection between the variables.
• Control for variables related to the dependent variable in experimental studies.
Steps involved in conducting a relationship study:
• Data collection
• Data analysis & interpretation
Data Collection
Identify the variables to be correlated. E.g., if you were interested in factors related to self-concept, you might identify variables such as academic achievement & socioeconomic status.
Avoid the 'shotgun' approach: it raises the possibility of erroneous relationships and creates issues in determining statistical significance.
The population must be one for which data on each of the identified variables can be collected.
Compute the appropriate correlation coefficient.
Data Analysis and Interpretation
There are many types of correlation, distinguished mainly by the type of data being correlated.
The most commonly used is the product-moment correlation coefficient, the Pearson r, which is used when both variables are continuous (ratio or interval data).
The Spearman rho correlation is used when ordinal data (ranks) are being correlated.
Although the Pearson r is more precise, the Spearman rho is much easier to compute with a small number of participants (fewer than 30).
The phi coefficient is used for variables that can only be expressed as a categorical dichotomy, such as gender (labelled 1 = male; 2 = female).
If a relationship is suspected of being curvilinear, then an eta coefficient is appropriate.
Types of Correlation Coefficients
Coefficient | Variable 1 | Variable 2
Pearson | continuous (e.g., scores, ages) | interval
Spearman | rank-ordered | rank-ordered
Biserial | artificial dichotomy | continuous
Point biserial | true dichotomy | continuous
Tetrachoric | artificial dichotomy | artificial dichotomy
Phi | true dichotomy | true dichotomy
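To make the Pearson and Spearman rows concrete: one common way to obtain the Spearman rho is to convert each variable to ranks and then compute the Pearson r on the ranks. The sketch below follows that route; the function names and the data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation for paired scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(scores):
    """Rank scores from 1 (lowest) upward; tied scores share their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with the score at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rho: the Pearson r computed on the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y always rises with x, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(f"Pearson r = {pearson_r(x, y):.3f}, Spearman rho = {spearman_rho(x, y):.3f}")
```

Because rho depends only on the ordering of the scores, it reaches its maximum whenever one variable consistently rises with the other, even if the raw scores do not fall on a straight line.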
Linear and Curvilinear Relationships
If a relationship is linear, plotting the scores of the two variables produces points that fall along a straight line.
If a relationship is curvilinear, an increase in one variable is associated with a corresponding increase in the other variable up to a point; beyond that point, further increases in the first variable are associated with corresponding decreases in the other variable.
Plots: Linear relationship; Curvilinear relationship
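The reason a curvilinear relationship needs its own coefficient can be seen by applying the Pearson r to data that rise and then fall. In the sketch below the data are hypothetical and deliberately symmetric; `pearson_r` is an illustrative helper, not a library function.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation for paired scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores: y rises with x up to x = 3, then falls again.
x = [0, 1, 2, 3, 4, 5, 6]
y = [9 - (value - 3) ** 2 for value in x]  # 0, 5, 8, 9, 8, 5, 0

r_curved = pearson_r(x, y)
r_rising_half = pearson_r(x[:4], y[:4])  # only the rising part of the curve

print(f"Pearson r over the whole range: {r_curved:.2f}")
print(f"Pearson r over the rising half: {r_rising_half:.2f}")
```

Here y is completely determined by x, yet the Pearson r over the whole range is zero because the rising and falling halves cancel. A linear coefficient can therefore badly understate a curvilinear relationship.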
Factors That Influence Correlations
1. Sample size: The larger the sample, the more dependable the correlation.
Subgroups:
- The relationship may differ between subgroups, e.g., between females & males.
- When subgroups are lumped together and correlated, differential relationships may be obscured.
- Analyzing subgroups separately reduces the size of each sample; if you want to study subgroups, select a larger sample and use stratified sampling to ensure similar numbers in each subgroup.
2. Variation: Greater variation in scores tends to give a stronger correlation, and restricted variation a weaker one.
3. Attenuation: Correlation coefficients are lower when the instruments being used have low reliability. A correction for attenuation is available, but it should not be used in prediction studies, since predictions must be based on existing measures.
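The slides do not give the correction itself. The sketch below uses the standard formula: the observed correlation divided by the square root of the product of the two reliability coefficients. The function name and the numbers are hypothetical.

```python
import math

def correct_for_attenuation(r_xy, reliability_x, reliability_y):
    """Estimate the correlation that would be observed with perfectly reliable measures.

    r_xy          : observed correlation between the two measures
    reliability_x : reliability coefficient of the first instrument (0 < value <= 1)
    reliability_y : reliability coefficient of the second instrument (0 < value <= 1)
    """
    for value in (reliability_x, reliability_y):
        if not 0.0 < value <= 1.0:
            raise ValueError("reliability coefficients must be in the interval (0, 1]")
    return r_xy / math.sqrt(reliability_x * reliability_y)

# Hypothetical case: observed r = .40 with reliabilities of .64 and .81.
observed = 0.40
corrected = correct_for_attenuation(observed, 0.64, 0.81)
print(f"observed r = {observed:.2f}, corrected r = {corrected:.2f}")
```

The corrected value describes what perfectly reliable instruments might show. Prediction studies have to work with the instruments as they are, so they should use the observed coefficient.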
PREDICTION STUDIES
If two variables are highly related, scores on one variable can be used to predict scores on the other variable.
The purposes of prediction studies:
- to facilitate decisions about individuals, or to aid in the selection of individuals;
- to test variables believed to be good predictors;
- to determine the predictive validity of measuring instruments.
The variable used to predict is called the predictor.
The variable that is predicted is called the criterion.
More than one predictor variable can be used to make predictions.
If several predictor variables each correlate well with a criterion, then a prediction based on a combination of those variables will be more accurate.
Data Analysis and Interpretation
There are two types of prediction studies:
- single predictor variable studies
- multiple predictor studies
Single prediction studies use one predictor and one criterion:
Y = a + bX
where
Y = the predicted score on the criterion variable
X = an individual's score on the predictor variable
a = a constant calculated from the scores of all participants
b = the coefficient indicating the contribution of the predictor to the criterion
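One way to obtain a and b is ordinary least squares. The sketch below assumes that method, which the slides do not name, and uses hypothetical data and function names.

```python
def fit_line(x, y):
    """Least-squares estimates of a and b in Y = a + bX."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two scores")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx             # contribution of the predictor to the criterion
    a = mean_y - b * mean_x   # constant calculated from all participants' scores
    return a, b

def predict(a, b, x_new):
    """Predicted criterion score for one individual's predictor score."""
    return a + b * x_new

# Hypothetical data: entrance-test scores (predictor) and later course marks (criterion).
entrance = [40, 50, 60, 70, 80]
marks = [52, 58, 61, 70, 74]

a, b = fit_line(entrance, marks)
print(f"Y = {a:.2f} + {b:.2f}X; predicted mark for X = 65: {predict(a, b, 65):.1f}")
```

The constant and the coefficient are estimated once from the whole sample and then applied to each new individual's predictor score.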
Multiple prediction studies use multiple predictors:
Y = a + b1X1 + b2X2 + … + bnXn
where
Y = the predicted score on the criterion variable
X1 … Xn = an individual's scores on each of the n predictor variables
a = a constant calculated from the scores of all participants
b1 … bn = the coefficients indicating the contribution of each predictor to the criterion
Prediction studies often result in a multiple regression equation.
A multiple regression equation combines all the variables that individually predict the criterion in order to make a more accurate prediction.
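For the special case of two predictors, the least-squares coefficients can be written out directly. The sketch below assumes ordinary least squares and uses hypothetical names and data. With more predictors one would normally rely on a statistics package rather than hand-written formulas.

```python
def fit_two_predictors(x1, x2, y):
    """Least-squares estimates of a, b1 and b2 in Y = a + b1*X1 + b2*X2."""
    n = len(y)
    if not (len(x1) == len(x2) == n) or n < 3:
        raise ValueError("need at least three complete sets of scores")
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]  # deviations from the mean of predictor 1
    d2 = [v - m2 for v in x2]  # deviations from the mean of predictor 2
    dy = [v - my for v in y]   # deviations from the mean of the criterion
    s11 = sum(u * u for u in d1)
    s22 = sum(u * u for u in d2)
    s12 = sum(u * v for u, v in zip(d1, d2))
    s1y = sum(u * v for u, v in zip(d1, dy))
    s2y = sum(u * v for u, v in zip(d2, dy))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("the two predictors are perfectly collinear")
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    a = my - b1 * m1 - b2 * m2
    return a, b1, b2

# Hypothetical data constructed so that Y = 1 + 2*X1 + 3*X2 exactly.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [1 + 2 * p + 3 * q for p, q in zip(x1, x2)]

a, b1, b2 = fit_two_predictors(x1, x2, y)
print(f"Y = {a:.2f} + {b1:.2f}*X1 + {b2:.2f}*X2")
```

Each predictor receives its own coefficient, estimated while taking the other predictor into account.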
Factors that reduce the accuracy of prediction studies:
- unreliable predictor or criterion variables;
- the length of time between measurement of the predictor and of the criterion: the longer the interval, the lower the prediction accuracy.
Predicted scores should be interpreted as intervals, not as single numbers.
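One rough way to turn a predicted score into an interval is to add and subtract a multiple of the standard error of estimate. The sketch below assumes a single predictor, a least-squares fit, and a multiplier of 1.96 for an approximately 95% band. It ignores the uncertainty in a and b themselves, so it is an approximation only; the names and data are hypothetical.

```python
import math

def fit_line(x, y):
    """Least-squares estimates of a and b in Y = a + bX."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def standard_error_of_estimate(x, y, a, b):
    """Typical size of the prediction errors: sqrt(sum of squared residuals / (n - 2))."""
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    return math.sqrt(sum(e * e for e in residuals) / (len(x) - 2))

def predicted_interval(x_new, a, b, se_est, z=1.96):
    """Approximate interval around the predicted score (z = 1.96 is an assumption)."""
    center = a + b * x_new
    return center - z * se_est, center + z * se_est

# Hypothetical paired scores on a predictor (x) and a criterion (y).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_line(x, y)
se_est = standard_error_of_estimate(x, y, a, b)
low, high = predicted_interval(3, a, b, se_est)
print(f"predicted score for X = 3: {a + b * 3:.2f}, interval {low:.2f} to {high:.2f}")
```

The weaker the relationship between predictor and criterion, the larger the standard error of estimate and the wider the interval.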
The Major Difference Between Data Collection Procedures in Prediction Studies and Relationship Studies
Relationship Study | Prediction Study
Relationship studies develop insight into the relationships among several variables. | Prediction studies examine the predictive relationships between or among variables.
Data on all variables are collected within a relatively short period of time. | Predictor variables are measured some period of time before the criterion variable is measured.
Other Correlation Analyses
1. Discriminant Analysis: Quite similar to multiple regression, but the criterion variable is categorical, not continuous. Typically used to predict group membership.
2. Path Analysis: Studies relationships and patterns among a number of variables, yielding a diagram showing the direct or indirect relationships between the variables.
3. Structural Equation Modeling: A sophisticated form of path analysis that provides greater theoretical validity and statistical precision in clarifying the direct or indirect interrelationships among variables relative to a given variable.
4. Canonical Correlation: An extension of multiple regression analysis that produces a correlation based on a group of predictor variables and a group of criterion variables.
5. Factor Analysis: A statistical method for making sense of a large number of variables. Approach: group a larger number of variables into a smaller number of clusters; derive factors by finding groups of variables that correlate highly with each other but only weakly with other variables; use the factors as variables.