Cautions About Correlation And Regression

  • November 2019
Cautions about Correlation and Regression | SHUBLEKA    

  • The square of the correlation, r², is the fraction of the variation in the y values that is explained by the least-squares regression of y on x.
  • Data transformation: applying a function such as the logarithm to the data can simplify statistical analysis.
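The statement about r² can be checked numerically. The sketch below uses plain Python and made-up data: it fits the least-squares line by hand and compares the squared correlation with the fraction of variation explained, computed as 1 − (sum of squared residuals) / (total variation of y about its mean). The function names and the data are illustrative assumptions, not part of the original notes.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def least_squares(x, y):
    """Return (slope, intercept) of the least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def correlation(x, y):
    """Return the correlation coefficient r between x and y."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sqrt(sxx * syy)

# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2.3, 3.6, 6.4, 7.5, 10.4, 11.6]

slope, intercept = least_squares(x, y)
predicted = [intercept + slope * xi for xi in x]

# Total variation of y about its mean, and the part the line leaves unexplained.
total_variation = sum((yi - mean(y)) ** 2 for yi in y)
unexplained = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
fraction_explained = 1 - unexplained / total_variation

r = correlation(x, y)
print(f"r^2                = {r ** 2:.6f}")
print(f"fraction explained = {fraction_explained:.6f}")
```

The two printed values agree up to floating-point rounding, which is what the identity predicts for a least-squares line of y on x.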



Residual = Observed − Predicted = y − ŷ

Geometrically: the vertical distance from each point to the least-squares regression line (positive for points above the line, negative for points below).

  • Examining residuals helps assess how well the line describes the data.
  • Special property: the mean of the least-squares residuals is always zero.
  • Residual plot = a scatterplot of the regression residuals against the explanatory variable.
  • Use residual plots to assess the fit of a regression line.
  • If the regression line captures the overall pattern of the data, there should be no pattern in the residuals.
  • Look for striking individual points as well as for an overall pattern.
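As a rough illustration of these points, the following plain-Python sketch fits a least-squares line to deliberately curved, made-up data (y = x²), computes the residuals, and prints a crude text version of a residual plot. The helper name `least_squares` and the data are assumptions chosen for the example.

```python
def least_squares(x, y):
    """Return (slope, intercept) of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

# Made-up curved data (y = x^2), so a straight line is a poor description.
x = [0, 1, 2, 3, 4, 5, 6]
y = [xi ** 2 for xi in x]

slope, intercept = least_squares(x, y)

# Residual = observed - predicted, one per data point.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
mean_residual = sum(residuals) / len(residuals)

print(f"fitted line: y-hat = {intercept:.2f} + {slope:.2f} x")
print(f"mean residual = {mean_residual:.2e}")

# Crude text residual plot: residual against the explanatory variable.
for xi, res in zip(x, residuals):
    side = "+" if res > 0 else "-" if res < 0 else "0"
    print(f"x = {xi}: residual = {res:+6.2f}  {side * round(abs(res))}")
```

The residuals average to zero, as they must for any least-squares line, yet they are positive at both ends and negative in the middle. That curved pattern is the signal that a straight line misses the overall shape of the data.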

Outliers:

  • Outlier = a point that lies outside the overall pattern.
  • Outliers in the x-direction can have a strong influence on the position of the regression line.
  • Outliers in the y-direction have large residuals.

Influential points:

  • A point is influential if removing it substantially changes the regression line. Outliers in the x-direction are often influential points.
  • Demonstration: Correlation and Regression Applet

Cautions:

  • Correlation measures only linear association, and fitting a straight line makes sense only when the overall pattern is linear. Always plot the data before calculating.
  • Extrapolation (predicting outside the range of x values used to fit the line) often produces unreliable predictions.
  • Correlation and least-squares regression are not resistant. Always plot the data and look for potentially influential points.
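The lack of resistance can be seen in a small experiment. The sketch below, in plain Python with made-up numbers, fits the least-squares line to five points that follow a clear linear pattern and then refits after adding one hypothetical outlier in the x-direction. The data, the outlier, and the helper name are assumptions for illustration only.

```python
def least_squares(x, y):
    """Return (slope, intercept) of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

# Made-up data: five points with a clear linear pattern ...
x = [1, 2, 3, 4, 5]
y = [1.1, 1.9, 3.2, 3.9, 5.1]
# ... plus one hypothetical point far out in the x-direction.
outlier = (15, 2.0)

slope_without, intercept_without = least_squares(x, y)

x_all = x + [outlier[0]]
y_all = y + [outlier[1]]
slope_with, intercept_with = least_squares(x_all, y_all)

# Residuals from the line fitted to all six points.
residuals_with = [yi - (intercept_with + slope_with * xi)
                  for xi, yi in zip(x_all, y_all)]

print(f"slope without the outlier: {slope_without:.3f}")
print(f"slope with the outlier:    {slope_with:.3f}")
print(f"residual of the outlier:   {residuals_with[-1]:+.3f}")
print(f"largest |residual|:        {max(abs(r) for r in residuals_with):.3f}")
```

In this example the single added point pulls the line almost flat, so it is influential. Its own residual is not the largest one, which is one reason to plot the data itself and not rely on residuals alone when looking for influential points.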

 


  • Lurking variable = a variable that is not among the explanatory or response variables in a study and yet may influence the interpretation of the relationships among those variables.
  • Association does not imply causation.
  • A correlation based on averages is usually higher than one based on data for individuals.
  • A correlation based on data with a restricted range is often lower than it would be if we could observe the full range of the variables.
  • Demonstration: TI-83/84 residual plot (L3 = Y1(L1), L4 = L2 − L3)
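The cautions about averages and restricted range can be illustrated with a small constructed data set. The sketch below is plain Python; the data (y equal to x plus alternating noise of +2 and −2), the cutoff values for the restricted range, and the pairing used to form averages are all assumptions made up for the example.

```python
from math import sqrt

def correlation(x, y):
    """Return the correlation coefficient r between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sqrt(sxx * syy)

# Made-up data for 12 individuals: y = x plus alternating noise of +2 and -2.
x = list(range(1, 13))
y = [xi + (2 if xi % 2 == 1 else -2) for xi in x]

r_full = correlation(x, y)

# Restricted range: keep only the individuals with 5 <= x <= 8.
kept = [(xi, yi) for xi, yi in zip(x, y) if 5 <= xi <= 8]
r_restricted = correlation([p[0] for p in kept], [p[1] for p in kept])

# Averages: replace each consecutive pair of individuals by its mean.
x_avg = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
y_avg = [(y[i] + y[i + 1]) / 2 for i in range(0, len(y), 2)]
r_averages = correlation(x_avg, y_avg)

print(f"r for all individuals: {r_full:.3f}")
print(f"r for restricted range: {r_restricted:.3f}")
print(f"r for pair averages:    {r_averages:.3f}")
```

Here averaging cancels the individual noise, so the correlation of the averages is higher than the correlation for individuals, while restricting x to a narrow band leaves mostly noise and a much weaker correlation.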

 
