Term Paper


ALTERNATIVE METHOD OF COMPUTING CORRELATION COEFFICIENT USING THE COMPUTATIONAL VERSION OF THE PEARSON PRODUCT-MOMENT CORRELATION COEFFICIENT FORMULA

A Term Paper Presented to: Dr. Lucila Fineza-Tibigar (Professor)

In Partial Fulfillment of the Requirements of the Course Statistics Applied to Educational Research II (EdAd 600)

Tryon R. Gabriel

April 2005

Background of the Study

The Pearson Product-Moment Correlation Coefficient is the most widely used measure of correlation or association. It is named after Karl Pearson, who developed the correlational method to do agricultural research. The "product-moment" part of the name comes from the way in which it is calculated: by summing the products of the deviations of the scores from their means. The symbol for the correlation coefficient is a lowercase r, and it is described in textbooks as the sum of the products of the Z-scores for the two variables divided by the number of scores:

$$ r = \frac{\sum z_X z_Y}{N} $$

If we substitute the formulas for the Z-scores into this formula, we get the following formula for the Pearson Product-Moment Correlation Coefficient, which we will use as the definitional formula:

$$ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N \, s_X \, s_Y} $$

The numerator of this formula says that we sum up the products of the deviations of a subject's X score from the mean of the X’s and the deviation of the subject's Y score from the mean of the Y’s. This summation of the product of the deviation scores is divided by the number of subjects times the standard deviation of the X variable times the standard deviation of the Y variable.
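The two descriptions above, the averaged Z-score products and the deviation-score form, can be checked against each other numerically. The following is a minimal sketch in Python and is not part of the original paper: the function name `pearson_r_definitional` and the five-point data set are illustrative assumptions, and the standard deviations are assumed to use the divisor N so that they agree with a Z-score form that divides by N.

```python
import math

def pearson_r_definitional(xs, ys):
    """Return r computed two ways: from Z-scores and from deviation scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Standard deviations with divisor N (assumption), so that the
    # Z-score products can be averaged over N as described in the text.
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)

    # Form 1: sum of the products of the Z-scores, divided by N.
    r_from_z = sum(((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
                   for x, y in zip(xs, ys)) / n
    # Form 2: sum of the products of the deviations over N * s_X * s_Y.
    r_from_deviations = sum((x - mean_x) * (y - mean_y)
                            for x, y in zip(xs, ys)) / (n * sd_x * sd_y)
    return r_from_z, r_from_deviations

# Hypothetical illustrative scores, not taken from the paper.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r_z, r_dev = pearson_r_definitional(xs, ys)
print(r_z, r_dev)
```

Both values should agree up to floating-point rounding, which is what makes either form usable as the definition.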

You can see that it is fairly tedious to calculate the correlation coefficient using the definitional formula. In practice we use another formula that is mathematically identical but much easier to use: the computational, or raw-score, formula for the correlation coefficient. The computational formula for the Pearsonian r is

$$ r = \frac{N \sum XY - \sum X \sum Y}{\sqrt{\left[ N \sum X^2 - \left( \sum X \right)^2 \right] \left[ N \sum Y^2 - \left( \sum Y \right)^2 \right]}} $$
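As a rough illustration of the raw-score formula, the sketch below computes the five sums it needs and then r. It is written in Python as an assumption of this edit, not as the author's procedure; the names `raw_score_sums` and `pearson_r_raw` are hypothetical, and the data are the twelve (X, Y) pairs tabulated in the Findings section of this paper.

```python
import math

# The twelve (X, Y) pairs tabulated in the Findings section.
X = [26, 42, 37, 82, 66, 44, 24, 39, 55, 61, 77, 58]
Y = [37, 90, 48, 90, 88, 100, 95, 120, 95, 76, 89, 100]

def raw_score_sums(xs, ys):
    """Collect N and the five sums used by the computational formula."""
    return {
        "n": len(xs),
        "sum_x": sum(xs),
        "sum_y": sum(ys),
        "sum_x2": sum(x * x for x in xs),
        "sum_y2": sum(y * y for y in ys),
        "sum_xy": sum(x * y for x, y in zip(xs, ys)),
    }

def pearson_r_raw(xs, ys):
    """Pearson r from the raw-score (computational) formula."""
    s = raw_score_sums(xs, ys)
    numerator = s["n"] * s["sum_xy"] - s["sum_x"] * s["sum_y"]
    denominator = math.sqrt(
        (s["n"] * s["sum_x2"] - s["sum_x"] ** 2)
        * (s["n"] * s["sum_y2"] - s["sum_y"] ** 2)
    )
    return numerator / denominator

sums = raw_score_sums(X, Y)
r = pearson_r_raw(X, Y)
print(sums)
print(f"r = {r:.4f}")  # the paper reports r = 0.264201335
```

Only sums of scores, squares, and cross-products are needed, which is why this form suits manual computation better than the definitional one.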

To properly interpret the correlation coefficient, one must understand the basic properties of r:

  • The value of r measures the strength of the linear relationship between X and Y and always lies between -1 and +1.
  • The closer r is to either -1 or +1, the stronger the linear relationship between X and Y. In fact, points that fall exactly on a straight line have a correlation of +1 if the line has positive slope and -1 if the line has negative slope.
  • If r is zero, then X and Y are not linearly related. They may be related, but the relationship is not a straight line.
  • The value of r does not change when the units of measurement are changed.

It is still computationally difficult to find the correlation coefficient, especially if we are dealing with a large number of subjects. In practice we would probably use a computer to calculate the correlation coefficient.

The aim of this paper is to present a modified method of computing the correlation coefficient using the computational version of the Pearson Product-Moment Correlation Coefficient formula. As mentioned above, the data obtained for each variable may be too large to handle comfortably in manual computation. In the absence of a computer, this difficulty could lead to computational errors, giving results that greatly affect decision making. In this paper, the author presents a method of reducing this difficulty by subtracting an assumed mean from the values of each variable.
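Before turning to the method itself, the basic properties of r listed earlier can be checked numerically. The sketch below is in Python and is only illustrative: the helper name `pearson_r` and every data set in it (including the heights and weights) are hypothetical and do not come from the paper.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from the raw-score (computational) formula."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return (n * sxy - sx * sy) / math.sqrt(
        (n * sxx - sx ** 2) * (n * syy - sy ** 2)
    )

xs = [1, 2, 3, 4, 5]
r_up = pearson_r(xs, [2 * x + 1 for x in xs])     # points on a rising line
r_down = pearson_r(xs, [10 - 3 * x for x in xs])  # points on a falling line

# A perfect but non-linear relationship: y = x^2 on a symmetric range.
sym = [-2, -1, 0, 1, 2]
r_curve = pearson_r(sym, [x * x for x in sym])

# Change of units (inches to centimetres) on made-up measurements.
heights_in = [60, 62, 65, 70, 72]
weights = [110, 120, 140, 155, 180]
r_inches = pearson_r(heights_in, weights)
r_cm = pearson_r([2.54 * h for h in heights_in], weights)

print(r_up, r_down, r_curve)
print(r_inches, r_cm)
```

The curved example shows why a zero r rules out only a straight-line relationship: Y is completely determined by X there, yet r is zero.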

Statement of the Problem

The purpose of this paper is to present, and to determine the validity of, an alternative method of computing the correlation coefficient using the computational version of the Pearson Product-Moment Correlation Coefficient formula. Specifically, this paper seeks to answer the question: Is there a difference in the computed correlation coefficient when an assumed mean for a given variable is subtracted from its values?

Procedure

To determine the validity of the alternative method, the author considers all the possible cases in which an assumed mean for a given variable (say, X or Y) is subtracted from its values. The cases are the following: (i) an assumed mean subtracted from the values of X alone; (ii) an assumed mean subtracted from the values of Y alone; and (iii) corresponding assumed means subtracted from the values of both X and Y. For each case, the correlation coefficient is computed using the computational version of the Pearson Product-Moment Correlation Coefficient formula.
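The three cases can be sketched in a few lines of Python. This is an illustrative assumption of how the procedure might be automated, not the author's own computation: the helper names `pearson_r_raw` and `shift` are hypothetical, while the data and the assumed means (40 for X, 80 for Y) are those reported in the Findings section.

```python
import math

# Data and assumed means as reported in the Findings section.
X = [26, 42, 37, 82, 66, 44, 24, 39, 55, 61, 77, 58]
Y = [37, 90, 48, 90, 88, 100, 95, 120, 95, 76, 89, 100]
ASSUMED_MEAN_X = 40
ASSUMED_MEAN_Y = 80

def pearson_r_raw(xs, ys):
    """Pearson r from the raw-score (computational) formula."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return (n * sxy - sx * sy) / math.sqrt(
        (n * sxx - sx ** 2) * (n * syy - sy ** 2)
    )

def shift(values, assumed_mean):
    """Subtract the assumed mean from every value."""
    return [v - assumed_mean for v in values]

cases = {
    "original": (X, Y),
    "(i) X shifted": (shift(X, ASSUMED_MEAN_X), Y),
    "(ii) Y shifted": (X, shift(Y, ASSUMED_MEAN_Y)),
    "(iii) both shifted": (shift(X, ASSUMED_MEAN_X), shift(Y, ASSUMED_MEAN_Y)),
}
results = {name: pearson_r_raw(xs, ys) for name, (xs, ys) in cases.items()}
for name, r in results.items():
    print(f"{name:20s} r = {r:.4f}")
```

If the method is valid, all four lines should print the same value of r.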

Findings

The following is the result of the usual method of computing the correlation coefficient between the variables X and Y using the computational version of the Pearson Product-Moment Correlation Coefficient formula.

         X      Y      X²      Y²      XY
        26     37     676    1369     962
        42     90    1764    8100    3780
        37     48    1369    2304    1776
        82     90    6724    8100    7380
        66     88    4356    7744    5808
        44    100    1936   10000    4400
        24     95     576    9025    2280
        39    120    1521   14400    4680
        55     95    3025    9025    5225
        61     76    3721    5776    4636
        77     89    5929    7921    6853
        58    100    3364   10000    5800
Σ =                 34961   93764   53580
r = 0.264201335

The table above gives a correlation coefficient of r = 0.264201335. The squares and products involved are very large and difficult to handle in manual computation. This table is presented as the baseline against which the results for the three cases described in the Procedure are compared.

1. Assumed mean subtracted from the values of X alone:

         X      Y      X²      Y²      XY
       -14     37     196    1369    -518
         2     90       4    8100     180
        -3     48       9    2304    -144
        42     90    1764    8100    3780
        26     88     676    7744    2288
         4    100      16   10000     400
       -16     95     256    9025   -1520
        -1    120       1   14400    -120
        15     95     225    9025    1425
        21     76     441    5776    1596
        37     89    1369    7921    3293
        18    100     324   10000    1800
Σ =                  5281   93764   12460
r = 0.264201335

The above result shows that subtracting the assumed mean (= 40) from the values of X still yields the same correlation coefficient. Notice also that the values of X and X² become smaller than the original values shown in the first table and are easier to handle in manual computation.

2. Assumed mean subtracted from the values of Y alone:

         X      Y      X²      Y²      XY
        26    -43     676    1849   -1118
        42     10    1764     100     420
        37    -32    1369    1024   -1184
        82     10    6724     100     820
        66      8    4356      64     528
        44     20    1936     400     880
        24     15     576     225     360
        39     40    1521    1600    1560
        55     15    3025     225     825
        61     -4    3721      16    -244
        77      9    5929      81     693
        58     20    3364     400    1160
Σ =                 34961    6084    4700
r = 0.264201335

The above result shows that subtracting the assumed mean (= 80) from the values of Y still yields the same correlation coefficient. Notice also that the values of Y and Y² become smaller than the original values shown in the first table and are easier to handle in manual computation.

3. Corresponding assumed means for X and Y subtracted from their values:

         X      Y      X²      Y²      XY
       -14    -43     196    1849     602
         2     10       4     100      20
        -3    -32       9    1024      96
        42     10    1764     100     420
        26      8     676      64     208
         4     20      16     400      80
       -16     15     256     225    -240
        -1     40       1    1600     -40
        15     15     225     225     225
        21     -4     441      16     -84
        37      9    1369      81     333
        18     20     324     400     360
Σ =                  5281    6084    1980
r = 0.264201335

The above result shows that subtracting the corresponding assumed means from the values of X and Y still yields the same correlation coefficient. Notice also that the values of X, X², Y, and Y² become smaller than the original values shown in the first table, and again they are easier to handle in manual computation.
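The agreement among the four tables can be traced by substituting their totals into the computational formula. The worked lines below are a sketch added for illustration. The totals of the X and Y columns are not shown in the tables, so the values used here (611 and 1028 for the original scores, 131 and 68 for the shifted scores) were computed from the tabulated entries, with N = 12.

```latex
\begin{align*}
\text{Original:}\quad r &= \frac{12(53580) - (611)(1028)}
  {\sqrt{\left[12(34961) - 611^2\right]\left[12(93764) - 1028^2\right]}}
  = \frac{14852}{\sqrt{(46211)(68384)}} \approx 0.2642 \\[4pt]
\text{Case 1:}\quad r &= \frac{12(12460) - (131)(1028)}
  {\sqrt{\left[12(5281) - 131^2\right]\left[12(93764) - 1028^2\right]}}
  = \frac{14852}{\sqrt{(46211)(68384)}} \approx 0.2642 \\[4pt]
\text{Case 2:}\quad r &= \frac{12(4700) - (611)(68)}
  {\sqrt{\left[12(34961) - 611^2\right]\left[12(6084) - 68^2\right]}}
  = \frac{14852}{\sqrt{(46211)(68384)}} \approx 0.2642 \\[4pt]
\text{Case 3:}\quad r &= \frac{12(1980) - (131)(68)}
  {\sqrt{\left[12(5281) - 131^2\right]\left[12(6084) - 68^2\right]}}
  = \frac{14852}{\sqrt{(46211)(68384)}} \approx 0.2642
\end{align*}
```

In every case the numerator and both bracketed terms reduce to the same three integers, so the results are identical and not merely close.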

Conclusion

On the basis of the above results, the author of this paper infers that subtracting an assumed mean from the values of the variables X and Y does not alter the result of the computation of the correlation coefficient using the computational version of the Pearson Product-Moment Correlation Coefficient formula.
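The numerical result above rests on a single data set, but a short algebraic argument suggests that it holds in general. The derivation below is a sketch added for illustration and is not part of the original paper; the symbols a and b for the assumed means and the primed variables are notation introduced here.

```latex
\begin{align*}
\text{Let } X' &= X - a, \qquad Y' = Y - b. \\[4pt]
N \sum X'Y' - \sum X' \sum Y'
  &= N \sum (X - a)(Y - b) - \left( \sum X - Na \right)\left( \sum Y - Nb \right) \\
  &= \left( N \sum XY - Nb \sum X - Na \sum Y + N^2 ab \right)
   - \left( \sum X \sum Y - Nb \sum X - Na \sum Y + N^2 ab \right) \\
  &= N \sum XY - \sum X \sum Y \\[6pt]
N \sum X'^2 - \left( \sum X' \right)^2
  &= \left( N \sum X^2 - 2Na \sum X + N^2 a^2 \right)
   - \left( \left( \sum X \right)^2 - 2Na \sum X + N^2 a^2 \right) \\
  &= N \sum X^2 - \left( \sum X \right)^2
\end{align*}
```

The same steps apply to Y with b in place of a. The numerator and both factors under the square root are therefore unchanged, and so is r, for any choice of a and b.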

Recommendation

1. In view of the above satisfactory result, the author of this paper recommends subtracting an assumed mean from the values of each variable when computing the correlation coefficient with the computational version of the Pearson Product-Moment Correlation Coefficient formula. Doing so greatly reduces the magnitude of the numbers involved, making them easier to handle in manual computation.

2. If the assumed mean does not sufficiently reduce the size of the numbers, the author also recommends dividing the numbers by a multiple of ten before computing the correlation coefficient with the computational version of the Pearson Product-Moment Correlation Coefficient formula.
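The second recommendation can be checked in the same way as the first. The sketch below, in Python, subtracts the assumed means and then divides by 10 before computing r. It is an illustration under stated assumptions, not the author's computation: the helper names `pearson_r_raw` and `code_scores` are hypothetical, the divisor 10 is chosen only as an example, and the divisor is assumed to be positive.

```python
import math

# Data and assumed means as reported in the Findings section.
X = [26, 42, 37, 82, 66, 44, 24, 39, 55, 61, 77, 58]
Y = [37, 90, 48, 90, 88, 100, 95, 120, 95, 76, 89, 100]

def pearson_r_raw(xs, ys):
    """Pearson r from the raw-score (computational) formula."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return (n * sxy - sx * sy) / math.sqrt(
        (n * sxx - sx ** 2) * (n * syy - sy ** 2)
    )

def code_scores(values, assumed_mean=0, divisor=1):
    """Subtract an assumed mean, then divide by a positive constant."""
    if divisor <= 0:
        raise ValueError("divisor is assumed to be positive")
    return [(v - assumed_mean) / divisor for v in values]

r_original = pearson_r_raw(X, Y)
r_coded = pearson_r_raw(code_scores(X, 40, 10), code_scores(Y, 80, 10))
print(f"original r = {r_original:.4f}")
print(f"coded r    = {r_coded:.4f}")
```

Dividing by a positive constant multiplies the numerator and the denominator of the formula by the same factor, so r is unchanged apart from floating-point rounding.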

Reference

Kitchens, L. J. (1998). Exploring Statistics: A Modern Introduction to Data Analysis and Inference, 2nd ed. Ca. 93950: Brooks/Cole Publishing Co.

Bernstein, S. & Bernstein, R. (1999). Schaum’s Outline of Theory and Problems of Elements of Statistics I: Descriptive Statistics and Probability, International ed. Singapore: McGraw-Hill Book Co.

Bernstein, S. & Bernstein, R. (1999). Schaum’s Outline of Theory and Problems of Elements of Statistics II: Inferential Statistics, International ed. Singapore: McGraw-Hill Book Co.

Dougherty, E. R. (1990). Probability and Statistics for the Engineering, Computing, and Physical Sciences. New Jersey 07632: Prentice-Hall, Inc.
