REGRESSION ANALYSIS
INTRODUCTION
The term regression was originally introduced in statistics by Sir Francis Galton in 1886 in his research paper ‘Regression towards Mediocrity in Hereditary Stature’. He reached the following conclusions:
- Tall fathers had tall sons and short fathers had short-statured sons.
- The mean height of the sons of a group of tall fathers was less than that of the fathers, and the mean height of the sons of a group of short-statured fathers was greater than that of the fathers.
Definition :
Regression is the measure of the average relationship between two or more variables in terms of original units of data.
Utility :
Regression analysis is a statistical method used in fields where two or more correlated series show a tendency to go back towards the general average. It is most widely used in economics and business, where management uses it as a control tool and as an aid to decision-making. It can also be used in other fields, such as the natural, physical and social sciences. A good estimate can be obtained only if the series are correlated. The analysis can also be extended to more than two series.
Some functional highlights
It helps us to estimate the dependent variable with the help of the independent variables.
It helps us to measure the error involved in using the regression lines as the basis for estimation.
We can obtain a measure of the degree of association or correlation that exists between the two variables, i.e. the dependent variable and the independent variable.
Types of Regression
[Diagram: Regression; Simple Regression (dependent variable, independent variable)]
Regression Lines
The lines of best fit drawn to show the mutual relationship between the X and Y variables are known as regression lines. For two variables we have two regression lines, one representing the regression of X on Y and the other the regression of Y on X. The line representing the regression of X on Y treats Y as the independent variable and X as the dependent variable; it gives the best estimate of X for a given value of Y. In the same way, the second line represents the regression of Y on X and gives the best estimate of Y for a given value of X.
Functions of Regression Lines
I. Best Estimate
II. Extent and Direction of Correlation
- Positive Correlation
- Negative Correlation
- Perfect Correlation
- Absence of Correlation
- Limited Correlation
Regression Equations
Regression equations are the algebraic form of the regression lines. They are also known as estimating equations. As there are two regression lines, there are two regression equations.
Regression Equation of X on Y : This is used to describe variation in the value of X for the given changes in Y.
Regression Equation of Y on X : This is used to describe variation in the value of Y for the given changes in X.
Regression Equation of X on Y
The regression equation of X on Y is written in the form X = a + bY. From this equation we obtain the best estimate of X for a given value of Y. Plotting the estimated values of X against the known values of Y gives the regression line of X on Y. To determine the values of a and b, the following two normal equations are solved simultaneously:
$$\sum X = Na + b\sum Y$$
$$\sum XY = a\sum Y + b\sum Y^2$$
Regression Equation of Y on X
The regression equation of Y on X is written in the form Y = a + bX. From this equation we obtain the best estimate of Y for a given value of X. Plotting the estimated values of Y against the known values of X gives the regression line of Y on X. To determine the values of a and b, the following two normal equations are solved simultaneously:
$$\sum Y = Na + b\sum X$$
$$\sum XY = a\sum X + b\sum X^2$$
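For reference, solving the two normal equations of Y on X simultaneously gives closed-form expressions for a and b. This is a standard algebraic step rather than something stated in the slides; the X on Y case follows by swapping the roles of X and Y.

```latex
\begin{aligned}
a &= \frac{\sum Y - b\sum X}{N}
  && \text{(from the first normal equation)} \\
N\sum XY &= \sum X \sum Y - b\left(\sum X\right)^2 + Nb\sum X^2
  && \text{(substitute } a \text{ into the second, multiply by } N\text{)} \\
b &= \frac{N\sum XY - \sum X \sum Y}{N\sum X^2 - \left(\sum X\right)^2},
\qquad a = \bar{Y} - b\bar{X}
\end{aligned}
```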
Example
Calculate the regression equations of X on Y and Y on X from the following data :

X:  2  3  4  5  6
Y:  3  4  5  6  7

X     Y     X²    Y²    XY
2     3     4     9     6
3     4     9     16    12
4     5     16    25    20
5     6     25    36    30
6     7     36    49    42
∑X = 20   ∑Y = 25   ∑X² = 90   ∑Y² = 135   ∑XY = 110
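The normal equations can be solved numerically for this data set. The following is a minimal Python sketch, not part of the original slides; the helper name `solve_normal_equations` is hypothetical, and the sums are computed directly from the X and Y values.

```python
def solve_normal_equations(dep, ind):
    """Return (a, b) for the line dep = a + b * ind, from the two normal equations."""
    n = len(dep)
    s_dep = sum(dep)
    s_ind = sum(ind)
    s_prod = sum(d * i for d, i in zip(dep, ind))
    s_ind_sq = sum(i * i for i in ind)
    # Normal equations:
    #   sum(dep)       = n * a        + b * sum(ind)
    #   sum(dep * ind) = a * sum(ind) + b * sum(ind^2)
    b = (n * s_prod - s_dep * s_ind) / (n * s_ind_sq - s_ind ** 2)
    a = (s_dep - b * s_ind) / n
    return a, b


X = [2, 3, 4, 5, 6]
Y = [3, 4, 5, 6, 7]

a_xy, b_xy = solve_normal_equations(X, Y)  # X on Y: X = a + bY
a_yx, b_yx = solve_normal_equations(Y, X)  # Y on X: Y = a + bX

print(f"X on Y: X = {a_xy:.2f} + {b_xy:.2f} Y")
print(f"Y on X: Y = {a_yx:.2f} + {b_yx:.2f} X")
```

Because Y = X + 1 for every pair in this data, the two lines coincide: X = -1 + Y and Y = 1 + X.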
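Calculations Based on Arithmetic Mean
Regression equation of X on Y :
$$(X - \bar{X}) = r\,\frac{\sigma_x}{\sigma_y}\,(Y - \bar{Y})$$
Regression equation of Y on X :
$$(Y - \bar{Y}) = r\,\frac{\sigma_y}{\sigma_x}\,(X - \bar{X})$$
where
$\bar{X}$ = the actual mean of the X series
$\bar{Y}$ = the actual mean of the Y series
$r$ = the coefficient of correlation between the X and Y series
$\sigma_x$ = standard deviation of the X series
$\sigma_y$ = standard deviation of the Y series

$r\,\sigma_x/\sigma_y$ is called the regression coefficient of X on Y, $b_{xy}$. With $x = X - \bar{X}$ and $y = Y - \bar{Y}$:
$$b_{xy} = r\,\frac{\sigma_x}{\sigma_y} = \frac{\sum xy}{n\,\sigma_x \sigma_y} \times \frac{\sigma_x}{\sigma_y} = \frac{\sum xy}{n\,\sigma_y^2} = \frac{\sum xy}{n \times \frac{\sum y^2}{n}} = \frac{\sum xy}{\sum y^2}$$
$r\,\sigma_y/\sigma_x$ is called the regression coefficient of Y on X, $b_{yx}$. Its value, $\sum xy / \sum x^2$, is obtained in the same manner as for the regression equation of X on Y.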
Example
X:  3  5  7  9  11
Y:  6  7  9  8  10
Solution :
X     x = (X - X̄)   x²    Y     y = (Y - Ȳ)   y²    xy
3     -4            16    6     -2            4     +8
5     -2            4     7     -1            1     +2
7     0             0     9     +1            1     0
9     +2            4     8     0             0     0
11    +4            16    10    +2            4     +8
∑X = 35   ∑x = 0   ∑x² = 40   ∑Y = 40   ∑y = 0   ∑y² = 10   ∑xy = 18
Mean of X = 7, Mean of Y = 8
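Regression equation of X on Y :
$$(X - \bar{X}) = r\,\frac{\sigma_x}{\sigma_y}\,(Y - \bar{Y}) \quad\text{or}\quad (X - \bar{X}) = \frac{\sum xy}{\sum y^2}\,(Y - \bar{Y})$$
$$X - 7 = \frac{18}{10}(Y - 8) \;\Rightarrow\; X - 7 = 1.8(Y - 8) \;\Rightarrow\; X = 1.8Y - 7.4$$
Regression equation of Y on X :
$$(Y - \bar{Y}) = r\,\frac{\sigma_y}{\sigma_x}\,(X - \bar{X}) \quad\text{or}\quad (Y - \bar{Y}) = \frac{\sum xy}{\sum x^2}\,(X - \bar{X})$$
$$Y - 8 = \frac{18}{40}(X - 7) \;\Rightarrow\; Y - 8 = 0.45(X - 7) \;\Rightarrow\; Y - 8 = 0.45X - 3.15 \;\Rightarrow\; Y = 0.45X + 4.85$$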
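The deviation-from-mean method can be checked numerically. The sketch below is an illustration in plain Python (the variable names are my own, not from the slides); it recomputes ∑xy, ∑x² and ∑y² from the data X = 3, 5, 7, 9, 11 and Y = 6, 7, 9, 8, 10, then derives both lines using the fact that each passes through the point of means.

```python
X = [3, 5, 7, 9, 11]
Y = [6, 7, 9, 8, 10]
n = len(X)

mean_x = sum(X) / n
mean_y = sum(Y) / n

# Deviations from the actual means
x = [xi - mean_x for xi in X]
y = [yi - mean_y for yi in Y]

sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

b_xy = sum_xy / sum_y2  # regression coefficient of X on Y
b_yx = sum_xy / sum_x2  # regression coefficient of Y on X

# Intercepts follow from each line passing through (mean_x, mean_y)
a_xy = mean_x - b_xy * mean_y
a_yx = mean_y - b_yx * mean_x

print(f"X on Y: X = {b_xy:.2f} Y + ({a_xy:.2f})")
print(f"Y on X: Y = {b_yx:.2f} X + ({a_yx:.2f})")
```

A quick sanity check on any pair of regression lines is that substituting the mean of one variable returns the mean of the other.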
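Calculations Based on Assumed Mean
Regression equation of Y on X :
$$(Y - \bar{Y}) = b_{yx}\,(X - \bar{X}), \qquad b_{yx} = \frac{N\sum xy - \sum x \sum y}{N\sum x^2 - (\sum x)^2}$$
Regression equation of X on Y :
$$(X - \bar{X}) = b_{xy}\,(Y - \bar{Y}), \qquad b_{xy} = \frac{N\sum xy - \sum x \sum y}{N\sum y^2 - (\sum y)^2}$$
Here x and y are the deviations of X and Y from their assumed means.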
Example
Height of the fathers in inches:  62  64  66  67  68  68  69  71  72  73
Height of the sons in inches:     63  62  65  67  67  70  70  67  68  71
Solution :
X      x = (X - 65)   x²     Y      y = (Y - 65)   y²     xy
62     -3             9      63     -2             4      6
64     -1             1      62     -3             9      3
66     1              1      65     0              0      0
67     2              4      67     2              4      4
68     3              9      67     2              4      6
68     3              9      70     5              25     15
69     4              16     70     5              25     20
71     6              36     67     2              4      12
72     7              49     68     3              9      21
73     8              64     71     6              36     48
Total  ∑x = 30        ∑x² = 198     ∑y = 20        ∑y² = 120   ∑xy = 135
$$\bar{X} = A + \frac{\sum x}{N} = 65 + \frac{30}{10} = 65 + 3 = 68$$
$$\bar{Y} = A + \frac{\sum y}{N} = 65 + \frac{20}{10} = 65 + 2 = 67$$
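Regression equation of X on Y:
$$(X - \bar{X}) = r\,\frac{\sigma_x}{\sigma_y}\,(Y - \bar{Y}) = b_{xy}\,(Y - \bar{Y}), \qquad b_{xy} = \frac{N\sum xy - \sum x \sum y}{N\sum y^2 - (\sum y)^2}$$
Dividing the numerator and denominator of $b_{xy}$ by N and substituting:
$$(X - 68) = \frac{135 - \frac{30 \times 20}{10}}{120 - \frac{(20)^2}{10}}\,(Y - 67)$$
$$(X - 68) = \frac{75}{80}\,(Y - 67)$$
$$X = 68 - 62.8 + 0.94Y$$
$$\therefore X = 5.2 + 0.94Y$$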
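Regression equation of Y on X:
$$(Y - \bar{Y}) = r\,\frac{\sigma_y}{\sigma_x}\,(X - \bar{X}) = b_{yx}\,(X - \bar{X}), \qquad b_{yx} = \frac{N\sum xy - \sum x \sum y}{N\sum x^2 - (\sum x)^2}$$
Dividing the numerator and denominator of $b_{yx}$ by N and substituting:
$$(Y - 67) = \frac{135 - \frac{30 \times 20}{10}}{198 - \frac{(30)^2}{10}}\,(X - 68)$$
$$(Y - 67) = \frac{75}{108}\,(X - 68)$$
Taking 75/108 ≈ 0.7:
$$Y = 67 - 47.6 + 0.7X$$
$$\therefore Y = 19.4 + 0.7X$$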
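The assumed-mean calculation can be reproduced in a few lines of Python. This is a sketch under my own naming (the function `coefficients` is hypothetical, not from the slides). It also illustrates that the coefficients do not depend on which assumed mean is chosen, since shifting the origin changes neither the numerator nor the denominators.

```python
fathers = [62, 64, 66, 67, 68, 68, 69, 71, 72, 73]  # X
sons = [63, 62, 65, 67, 67, 70, 70, 67, 68, 71]     # Y


def coefficients(X, Y, A_x, A_y):
    """Return (b_xy, b_yx, mean_x, mean_y) using deviations from assumed means."""
    n = len(X)
    x = [v - A_x for v in X]
    y = [v - A_y for v in Y]
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sx2 = sum(a * a for a in x)
    sy2 = sum(b * b for b in y)
    num = n * sxy - sx * sy
    b_xy = num / (n * sy2 - sy ** 2)  # X on Y
    b_yx = num / (n * sx2 - sx ** 2)  # Y on X
    return b_xy, b_yx, A_x + sx / n, A_y + sy / n


b_xy, b_yx, mean_x, mean_y = coefficients(fathers, sons, 65, 65)
# A different assumed mean (here 0) should give the same coefficients
b_xy0, b_yx0, _, _ = coefficients(fathers, sons, 0, 0)

print(f"X on Y: X = {mean_x - b_xy * mean_y:.2f} + {b_xy:.4f} Y")
print(f"Y on X: Y = {mean_y - b_yx * mean_x:.2f} + {b_yx:.4f} X")
```

Carrying the unrounded slope 75/108 ≈ 0.6944 through the Y on X equation gives an intercept of about 19.78, slightly different from the 19.4 obtained when the slope is first rounded to 0.7.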
Regression Equations in Grouped Frequency Distribution
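Regression Equation of X on Y :
$$(X - \bar{X}) = b_{xy}\,(Y - \bar{Y})$$
$$(X - \bar{X}) = \frac{\sum fxy - \frac{\sum fx \cdot \sum fy}{N}}{\sum fy^2 - \frac{(\sum fy)^2}{N}} \times \frac{i_x}{i_y}\,(Y - \bar{Y})$$
Regression Equation of Y on X :
$$(Y - \bar{Y}) = b_{yx}\,(X - \bar{X})$$
$$(Y - \bar{Y}) = \frac{\sum fxy - \frac{\sum fx \cdot \sum fy}{N}}{\sum fx^2 - \frac{(\sum fx)^2}{N}} \times \frac{i_y}{i_x}\,(X - \bar{X})$$
Here x and y are the step deviations from the assumed means, and $i_x$ and $i_y$ are the class intervals of X and Y.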
Example
                    Weight in lbs.
Height in inches    80-90    90-100    100-110    110-120    Total
50-55               2        10        8          -          20
55-60               4        15        5          1          25
60-65               2        10        15         8          35
65-70               2        5         2          11         20
Total               10       40        30         20         100
Taking height as X and weight as Y:
$$\bar{X} = A + \frac{\sum fx}{N} \times i = 62.5 + \frac{-45}{100} \times 5 = 60.25$$
$$\bar{Y} = A + \frac{\sum fy}{N} \times i = 95 + \frac{60}{100} \times 10 = 101$$
Regression Equation of X on Y:
$$X - \bar{X} = b_{xy}\,(Y - \bar{Y})$$
$$(X - 60.25) = 0.202\,(Y - 101)$$
$$(X - 60.25) = 0.202Y - 0.202 \times 101$$
$$X = 60.25 - 20.402 + 0.202Y$$
$$\therefore X = 39.848 + 0.202Y$$
Regression Equation of Y on X:
$$(Y - \bar{Y}) = b_{yx}\,(X - \bar{X})$$
$$(Y - 101) = 0.649\,(X - 60.25)$$
$$(Y - 101) = 0.649X - 0.649 \times 60.25$$
$$Y = 101 - 39.102 + 0.649X$$
$$\therefore Y = 61.898 + 0.649X$$
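The coefficients 0.202 and 0.649 used above can be reproduced from the two-way frequency table. The following Python sketch is illustrative only. It assumes class mid-points stand in for each class, treats the empty "-" cell as a frequency of 0, and uses the assumed means 62.5 and 95 that appear in the mean calculations.

```python
# Rows = height classes (X), columns = weight classes (Y)
freq = [
    [2, 10, 8, 0],    # 50-55 (the "-" cell is taken as 0)
    [4, 15, 5, 1],    # 55-60
    [2, 10, 15, 8],   # 60-65
    [2, 5, 2, 11],    # 65-70
]
x_mid = [52.5, 57.5, 62.5, 67.5]  # mid-points of the height classes
y_mid = [85, 95, 105, 115]        # mid-points of the weight classes
A_x, i_x = 62.5, 5                # assumed mean and class interval for X
A_y, i_y = 95, 10                 # assumed mean and class interval for Y

# Step deviations
x = [(m - A_x) / i_x for m in x_mid]  # -2, -1, 0, 1
y = [(m - A_y) / i_y for m in y_mid]  # -1, 0, 1, 2

N = sum(sum(row) for row in freq)
fx = [sum(row) for row in freq]        # row totals
fy = [sum(col) for col in zip(*freq)]  # column totals

s_fx = sum(f * d for f, d in zip(fx, x))
s_fy = sum(f * d for f, d in zip(fy, y))
s_fx2 = sum(f * d * d for f, d in zip(fx, x))
s_fy2 = sum(f * d * d for f, d in zip(fy, y))
s_fxy = sum(freq[r][c] * x[r] * y[c]
            for r in range(len(x)) for c in range(len(y)))

num = s_fxy - s_fx * s_fy / N
b_xy = num / (s_fy2 - s_fy ** 2 / N) * (i_x / i_y)  # X on Y
b_yx = num / (s_fx2 - s_fx ** 2 / N) * (i_y / i_x)  # Y on X

mean_x = A_x + s_fx / N * i_x
mean_y = A_y + s_fy / N * i_y

print(f"mean X = {mean_x}, mean Y = {mean_y}")
print(f"b_xy = {b_xy:.3f}, b_yx = {b_yx:.3f}")
```

The intermediate sums are ∑fx = -45, ∑fy = 60, ∑fx² = 125, ∑fy² = 120 and ∑fxy = 7, which agree with the -45 and 60 used in the mean calculations.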
Regression Coefficients
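Regression Coefficient of X on Y :
$$b_{xy} = r\,\frac{\sigma_x}{\sigma_y}$$
Using deviations from the actual means:
$$b_{xy} = \frac{\sum xy}{\sum y^2}$$
Using deviations from assumed means:
$$b_{xy} = \frac{\sum xy - \frac{\sum x \cdot \sum y}{N}}{\sum y^2 - \frac{(\sum y)^2}{N}} \quad\text{or}\quad b_{xy} = \frac{N\sum xy - \sum x \sum y}{N\sum y^2 - (\sum y)^2}$$
Using step deviations:
$$b_{xy} = \frac{\sum xy - \frac{\sum x \cdot \sum y}{N}}{\sum y^2 - \frac{(\sum y)^2}{N}} \times \frac{i_x}{i_y}$$
Regression Coefficient of Y on X :
$$b_{yx} = r\,\frac{\sigma_y}{\sigma_x}$$
Using deviations from the actual means:
$$b_{yx} = \frac{\sum xy}{\sum x^2}$$
Using deviations from assumed means:
$$b_{yx} = \frac{\sum xy - \frac{\sum x \cdot \sum y}{N}}{\sum x^2 - \frac{(\sum x)^2}{N}} \quad\text{or}\quad b_{yx} = \frac{N\sum xy - \sum x \sum y}{N\sum x^2 - (\sum x)^2}$$
Using step deviations:
$$b_{yx} = \frac{\sum xy - \frac{\sum x \cdot \sum y}{N}}{\sum x^2 - \frac{(\sum x)^2}{N}} \times \frac{i_y}{i_x}$$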
Example : Find the regression coefficients :
X:  1  2  3  4  5
Y:  2  5  3  8  7
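Solution :
X      x = (X - 2)   x²     Y      y = (Y - 4)   y²     xy
1      -1            1      2      -2            4      2
2      0             0      5      +1            1      0
3      +1            1      3      -1            1      -1
4      +2            4      8      +4            16     8
5      +3            9      7      +3            9      9
Total  ∑x = +5       ∑x² = 15      ∑y = +5       ∑y² = 31    ∑xy = 18

Means of X and Y :
$$\bar{X} = A + \frac{\sum x}{N} = 2 + \frac{5}{5} = 2 + 1 = 3$$
$$\bar{Y} = A + \frac{\sum y}{N} = 4 + \frac{5}{5} = 4 + 1 = 5$$
Regression Coefficient of X on Y :
$$b_{xy} = \frac{\sum xy - \frac{\sum x \cdot \sum y}{N}}{\sum y^2 - \frac{(\sum y)^2}{N}} = \frac{18 - \frac{5 \times 5}{5}}{31 - \frac{(5)^2}{5}} = \frac{13}{26} = 0.5$$
$$\therefore b_{xy} = +0.5$$
Regression Coefficient of Y on X :
$$b_{yx} = \frac{\sum xy - \frac{\sum x \cdot \sum y}{N}}{\sum x^2 - \frac{(\sum x)^2}{N}} = \frac{18 - \frac{5 \times 5}{5}}{15 - \frac{(5)^2}{5}} = \frac{13}{10} = 1.3$$
$$\therefore b_{yx} = +1.3$$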
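Since $b_{xy} = r\,\sigma_x/\sigma_y$ and $b_{yx} = r\,\sigma_y/\sigma_x$, their product is $r^2$, so the two regression coefficients also give the extent and direction of correlation. The slides do not work this step out; the Python sketch below is an illustrative check on the same data, comparing r obtained from the coefficients with r computed directly.

```python
import math

X = [1, 2, 3, 4, 5]
Y = [2, 5, 3, 8, 7]
n = len(X)

mean_x = sum(X) / n
mean_y = sum(Y) / n

# Sums of squares and products of deviations from the actual means
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(X, Y))
sxx = sum((a - mean_x) ** 2 for a in X)
syy = sum((b - mean_y) ** 2 for b in Y)

b_xy = sxy / syy  # regression coefficient of X on Y
b_yx = sxy / sxx  # regression coefficient of Y on X

# r is the geometric mean of the two coefficients and carries their common sign
r_from_b = math.copysign(math.sqrt(b_xy * b_yx), b_xy)
r_direct = sxy / math.sqrt(sxx * syy)

print(f"b_xy = {b_xy:.2f}, b_yx = {b_yx:.2f}")
print(f"r from coefficients = {r_from_b:.4f}, r computed directly = {r_direct:.4f}")
```

Both coefficients are positive here, so the correlation is positive; the two routes to r agree at roughly 0.806.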
Thank You