Multiple Regression
Dr. Rohit Vishal Kumar Reader, Department of Marketing Xavier Institute of Social Service PO Box No 7, Purulia Road Ranchi – 834001, Jharkhand, India Email:
[email protected]
Types of Regression Models
[Diagram: Regression Models split into Simple (1 explanatory variable) and Multiple (2+ explanatory variables); each branch is either Linear or Non-Linear]
Regression Modeling Steps
1. Specify the model and estimate all unknown parameters
2. Evaluate the model
3. Use the model for prediction & estimation
Model Specification
• Decide on the dependent variable
• List all potential independent variables
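Linear Multiple Regression Model
1. The relationship between 1 dependent & 2 or more independent variables is a linear function:

$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i$

• $Y_i$: dependent (response) variable
• $X_{1i}, \ldots, X_{ki}$: independent (explanatory) variables
• $\beta_0$: population Y-intercept
• $\beta_1, \ldots, \beta_k$: population slopes
• $\varepsilon_i$: random error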
Linear Regression Assumptions (Extremely Important)
• Mean of the distribution of error is 0
• Distribution of error has constant variance
• Distribution of error is normal
• Errors are independent
Parameter Estimation
• Step 1: Gather data for all the independent and dependent variables
• Step 2: Estimate the parameters using the least squares method
Estimating the Parameters
• Do it manually:
  – Requires manipulating very large matrices
  – $B = (X'X)^{-1} X'Y$
• Use software:
  – MS Excel can handle 15 independent variables
  – Statistical software has no such limit
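The manual route can be sketched in a few lines of NumPy. This is only an illustration, assuming NumPy is available: the function name `fit_ols` and the data are made up for the example, and the code solves the normal equations (X'X)B = X'Y directly instead of forming the inverse, which gives the same estimates for a full-rank design matrix.

```python
import numpy as np

def fit_ols(X, y):
    """Return least squares estimates [b0, b1, ..., bk] for y = b0 + b1*x1 + ... + bk*xk."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(y)), X])
    # Solve (X'X) b = X'y; equivalent to b = (X'X)^-1 X'y when X'X is invertible.
    return np.linalg.solve(design.T @ design, design.T @ y)

# Made-up data (assumption): y is generated exactly as 1 + 2*x1 + 3*x2, with no error term.
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in X]
b = fit_ols(X, y)  # estimates for the intercept and the two slopes
```

Because the illustrative data contain no random error, the estimates recover the generating coefficients up to floating-point rounding. With real data they would only approximate the population values.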
Interpretation of Estimated Coefficients
1. Slope ($\hat{\beta}_k$)
  – Estimated average change in Y for a 1-unit increase in $X_k$, holding all other variables constant
  – Example: if $\hat{\beta}_1 = 0.13$, then Y is expected to increase by 0.13 for each 1-unit increase in $X_1$, given that $X_2, X_3, \ldots, X_k$ are held constant
Interpretation of Estimated Coefficients
2. Constant ($\hat{\beta}_0$)
  – The estimated value of Y when all X variables are equal to 0
  – Also known as the “autonomous value” of Y
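Both interpretations can be checked numerically. The sketch below uses hypothetical estimates (an intercept of 2.0 and slopes of 0.13 and -0.5, chosen only for illustration) and a helper named `predict` that is not part of the original slides.

```python
def predict(b, x):
    """Fitted value b0 + b1*x1 + ... + bk*xk for one observation x = [x1, ..., xk]."""
    return b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))

# Hypothetical estimates (assumption): b0 = 2.0, b1 = 0.13, b2 = -0.5
b = [2.0, 0.13, -0.5]

base = predict(b, [10, 4])
bumped = predict(b, [11, 4])   # X1 raised by one unit, X2 held constant
change = bumped - base         # equals b1 up to floating-point rounding

at_zero = predict(b, [0, 0])   # all X variables at 0 gives the constant b0
```

The difference between the two predictions depends only on the slope of the variable that moved, which is what "holding all other variables constant" means in practice.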
Evaluating Multiple Regression Models
• Examine variation measures
• Test significance of the overall model, portions of the overall model, and individual coefficients
• Other things that need to be checked:
  – Check the conditions of a multiple linear regression model using residuals
  – Assess multicollinearity among the independent variables
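One common way to assess multicollinearity is the variance inflation factor (VIF), which the slides do not name; it is used here as an assumption about how the check might be done. Each predictor is regressed on the others, and VIF = 1 / (1 − R²) of that auxiliary regression. The function name `vif` and the data are illustrative, and the sketch assumes NumPy is available and that no predictor is perfectly collinear with the others.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (n observations by k predictors)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    factors = []
    for j in range(k):
        target = X[:, j]
        # Regress predictor j on an intercept plus all the other predictors.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        factors.append(1.0 / (1.0 - r2))
    return factors

# Made-up data (assumption): the second column is nearly a copy of the first.
X = [
    [1, 1.1, 3],
    [2, 1.9, 1],
    [3, 3.2, 2],
    [4, 3.9, 2],
    [5, 5.1, 1],
    [6, 5.8, 3],
]
factors = vif(X)
```

A frequently quoted rule of thumb treats a VIF above about 10 as a sign of troublesome multicollinearity; the first two columns of the illustrative data exceed that by a wide margin.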
Variation Measures 1
• Coefficient of Multiple Determination
• Proportion of variation in Y ‘explained’ by all X variables taken together

$R^2 = \dfrac{\text{Explained variation}}{\text{Total variation}} = \dfrac{SS_{yy} - SSE}{SS_{yy}} = 1 - \dfrac{SSE}{SS_{yy}}$
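A small numeric sketch of the coefficient of multiple determination follows. The observed and fitted values are made up for arithmetic convenience and do not come from an actual least squares fit; the function name `r_squared` is also an assumption.

```python
def r_squared(y, y_hat):
    """R^2 = 1 - SSE / SS_yy for observed values y and fitted values y_hat."""
    mean_y = sum(y) / len(y)
    ss_yy = sum((yi - mean_y) ** 2 for yi in y)            # total variation
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained variation
    return 1 - sse / ss_yy

# Made-up values (assumption): SS_yy = 20 and SSE = 1, so R^2 = 1 - 1/20.
y = [2.0, 4.0, 6.0, 8.0]
y_hat = [2.5, 3.5, 6.5, 7.5]
r2 = r_squared(y, y_hat)
```

Here 95% of the variation in Y around its mean is accounted for by the fitted values.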
Variation Measures 2
• Adjusted R²
• R² never decreases when a new X variable is added to the model (a disadvantage when comparing models)
• Solution: Adjusted R²
  – Each additional variable reduces adjusted R², unless SSE goes down enough to compensate

$R_a^2 = 1 - \left(\dfrac{n - 1}{n - (k + 1)}\right)\dfrac{SSE}{SS_{yy}} \le 1 - \dfrac{SSE}{SS_{yy}} = R^2$
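The penalty for extra variables can be seen with a short calculation. The sums of squares, sample size, and function name below are hypothetical, chosen so that adding a third variable lowers SSE only slightly.

```python
def adjusted_r_squared(sse, ss_yy, n, k):
    """Adjusted R^2 = 1 - [(n - 1) / (n - (k + 1))] * SSE / SS_yy."""
    return 1 - (n - 1) / (n - (k + 1)) * sse / ss_yy

# Hypothetical figures (assumption): n = 20 observations, SS_yy = 100.
n, ss_yy = 20, 100.0

# Model A: k = 2 variables, SSE = 30. Model B: k = 3 variables, SSE = 29.5.
r2_a = 1 - 30.0 / ss_yy
r2_b = 1 - 29.5 / ss_yy
adj_a = adjusted_r_squared(30.0, ss_yy, n, k=2)
adj_b = adjusted_r_squared(29.5, ss_yy, n, k=3)
```

Plain R² rises from model A to model B, but adjusted R² falls, because the small drop in SSE does not make up for the lost degree of freedom.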
Testing Overall Significance
1. Tests if there is a linear relationship between all X variables together & Y
2. Hypotheses
  – $H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$
    • No linear relationship
  – $H_a$: At least one coefficient is not 0
    • At least one X variable linearly affects Y
3. Uses the F test statistic

$F = \dfrac{(SS_{yy} - SSE)/k}{SSE/(n - k - 1)} = \dfrac{SSR/k}{SSE/(n - (k + 1))} = \dfrac{R^2/k}{(1 - R^2)/(n - (k + 1))} \sim F_{k,\, n-k-1}$ under $H_0$, where $SSR = SS_{yy} - SSE$
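The two forms of the F statistic can be checked against each other with a short calculation. The inputs and function names are hypothetical; only the statistic itself is computed, not a p-value.

```python
def f_from_sums_of_squares(sse, ss_yy, n, k):
    """F = [(SS_yy - SSE) / k] / [SSE / (n - k - 1)]."""
    return ((ss_yy - sse) / k) / (sse / (n - k - 1))

def f_from_r_squared(r2, n, k):
    """F = [R^2 / k] / [(1 - R^2) / (n - k - 1)]."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

# Hypothetical figures (assumption): n = 20, k = 2, SS_yy = 100, SSE = 30.
n, k = 20, 2
ss_yy, sse = 100.0, 30.0
r2 = 1 - sse / ss_yy

f_ss = f_from_sums_of_squares(sse, ss_yy, n, k)
f_r2 = f_from_r_squared(r2, n, k)  # same value, computed from R^2
```

Both routes give F = 35 / (30/17), roughly 19.83, with k = 2 and n − k − 1 = 17 degrees of freedom. The value would then be compared with a critical value from an F table. If SciPy is available, `scipy.stats.f.sf(f_ss, k, n - k - 1)` gives the corresponding upper-tail probability.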