Set-2

2. What is the difference between correlation and regression? What do you understand by rank correlation? When do we use rank correlation and when do we use the Pearsonian correlation coefficient? Fit a linear regression line to the following data:

X: 12 15 18 20 27 34 28 48
Y: 123 150 158 170 180 184 176 130

ANSWER:

Difference between Correlation and Regression

• The correlation coefficient, r, measures the strength of bivariate association.
• The regression line is a prediction equation that estimates the values of y for any given x.
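The regression line asked for in the question can be obtained by ordinary least squares. The following is a minimal sketch in plain Python, assuming the usual formulas b = Sxy / Sxx for the slope and a = mean(Y) - b * mean(X) for the intercept; the function name `fit_line` is an illustrative choice and not part of the original answer.

```python
# Data from the question.
X = [12, 15, 18, 20, 27, 34, 28, 48]
Y = [123, 150, 158, 170, 180, 184, 176, 130]


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b).

    Hypothetical helper for illustration only.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b


a, b = fit_line(X, Y)
print(f"Y = {a:.2f} + {b:.3f} X")
```

For this data the sketch gives approximately Y = 154.66 + 0.167 X, so the predicted Y at X = 30 is about 159.7.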
Correlation (often measured as a correlation coefficient) indicates the strength and direction of a linear relationship between two random variables. It is a measure of association between two variables, and the variables are not designated as dependent or independent.

When two social, physical, or biological phenomena increase or decrease proportionately and simultaneously because of identical external factors, the phenomena are positively correlated; under the same conditions, if one increases in the same proportion that the other decreases, the two phenomena are negatively correlated. Investigators calculate the degree of correlation by applying a coefficient of correlation to data concerning the two phenomena. The most common correlation coefficient is expressed as

$$ r = \frac{\sum xy}{N \sigma_x \sigma_y}, \qquad \sigma_x = \sqrt{\frac{\sum x^2}{N}}, \quad \sigma_y = \sqrt{\frac{\sum y^2}{N}} $$

in which x is the deviation of one variable from its mean, y is the deviation of the other variable from its mean, and N is the total number of cases in the series. A perfect positive correlation between the two variables results in a coefficient of +1, a perfect negative correlation in a coefficient of -1, and a total absence of correlation in a coefficient of 0. Intermediate values between 0 and +1 or -1 are interpreted by degree of correlation. Thus, 0.89 indicates high positive correlation, -0.76 high negative correlation, and 0.13 low positive correlation.

Simple regression is used to examine the relationship between one dependent and one independent variable. After performing an analysis, the regression statistics can be used to predict the dependent variable when the independent variable is known. Regression goes beyond correlation by adding prediction capabilities.

In statistics, Spearman's rank correlation coefficient, or Spearman's rho, named after Charles Spearman and often denoted by the Greek letter ρ (rho) or r_s, is a non-parametric measure of correlation: it assesses how well an arbitrary monotonic function could describe the relationship between two variables, without making any other assumptions about the particular nature of that relationship. Certain other measures of correlation are parametric in the sense of being based on possible relationships of a parameterised form, such as a linear relationship.

The Spearman rank correlation coefficient can be used to give an R-estimate, and is a measure of monotone association that is used when the distribution of the data makes Pearson's correlation coefficient undesirable or misleading. The Spearman rank correlation coefficient is defined by

$$ r' = 1 - \frac{6 \sum d^2}{N (N^2 - 1)} $$

where $d$ is the difference in statistical rank of corresponding variables, and is an approximation to the exact correlation coefficient

$$ r = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}} $$

computed from the original data. Because it uses ranks, the Spearman rank correlation coefficient is much easier to compute.
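
To make the contrast between the two coefficients concrete, the sketch below computes both for the data in the question. It is an illustration under stated assumptions, not the original answer's working: the helper names `pearson_r`, `ranks`, and `spearman_rho` are hypothetical, and the simple ranking used here assumes there are no tied values (true for this data set).

```python
import math

# Data from the question.
X = [12, 15, 18, 20, 27, 34, 28, 48]
Y = [123, 150, 158, 170, 180, 184, 176, 130]


def pearson_r(xs, ys):
    """Pearson coefficient: sum(xy) / sqrt(sum(x^2) * sum(y^2)) on deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sxy = sum(p * q for p, q in zip(dx, dy))
    return sxy / math.sqrt(sum(p * p for p in dx) * sum(q * q for q in dy))


def ranks(values):
    """Rank each value from 1 (smallest). Assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(xs, ys):
    """Spearman coefficient: 1 - 6 * sum(d^2) / (N * (N^2 - 1))."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


print(f"Pearson r    = {pearson_r(X, Y):.3f}")
print(f"Spearman rho = {spearman_rho(X, Y):.3f}")
```

On this data the sum of squared rank differences is 44, giving a Spearman coefficient of about 0.476, while the Pearson coefficient is only about 0.085. The gap comes largely from the last pair (48, 130), which breaks the otherwise rising trend and pulls the Pearson value down more than the rank-based one.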