Multiple Linear Regression Model: Estimation (31 pages)
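Slide 1: Multiple Regression Analysis
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$
Part 1. Estimation

Slide 2: Parallels with Simple Regression
- $\beta_0$ is still the intercept.
- $\beta_1$ to $\beta_k$ are all called slope parameters.
- $u$ is still the error term (or disturbance).
- We still need a zero conditional mean assumption, so now assume that $E(u \mid x_1, x_2, \dots, x_k) = 0$.
- We are still minimizing the sum of squared residuals, so we have $k+1$ first order conditions.

Slide 3
$\hat{y} = \hat\beta_0 + \hat\beta_1 x_1 + \hat\beta_2 x_2 + \dots + \hat\beta_k x_k$
is the OLS regression line, or the sample regression function (SRF). $\hat\beta_0$ is the OLS intercept estimate and $\hat\beta_1, \dots, \hat\beta_k$ are the OLS slope estimates. We still use ordinary least squares to get the estimates:
$\min_{\hat\beta_0, \dots, \hat\beta_k} \sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \dots - \hat\beta_k x_{ik})^2$

Slide 4: OLS First Order Conditions
This minimization problem can be solved using multivariable calculus. This leads to $k+1$ linear equations in the $k+1$ unknowns $\hat\beta_0, \hat\beta_1, \dots, \hat\beta_k$:
$\sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \dots - \hat\beta_k x_{ik}) = 0$
$\sum_{i=1}^{n} x_{ij} (y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \dots - \hat\beta_k x_{ik}) = 0, \quad j = 1, \dots, k$

Slide 5: A Fitted or Predicted Value
For observation $i$, the fitted value is
$\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_{i1} + \hat\beta_2 x_{i2} + \dots + \hat\beta_k x_{ik}$
The residual for observation $i$ is defined as in the simple regression case:
$\hat{u}_i = y_i - \hat{y}_i$
The properties of the fitted values and residuals follow from the first order conditions: the residuals sum to zero, $\sum_{i=1}^{n} \hat{u}_i = 0$, and each explanatory variable has zero sample covariance with the residuals, $\sum_{i=1}^{n} x_{ij} \hat{u}_i = 0$ for $j = 1, \dots, k$.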
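A minimal numerical sketch of the estimation steps so far, assuming NumPy is available. It solves the $k+1$ first order conditions in matrix form, $X'X\hat\beta = X'y$, and then checks that the residuals sum to zero and are orthogonal to each regressor. The function name `ols_normal_equations` and the simulated data (sample size, coefficients, seed) are hypothetical choices made only for illustration; they do not come from the slides.

```python
import numpy as np

def ols_normal_equations(X, y):
    """Solve the k+1 first order conditions X'X b = X'y.

    X is an (n, k) array of regressors WITHOUT a constant column;
    the intercept column is added inside this function.
    Returns (beta_hat, fitted, residuals).
    """
    n = X.shape[0]
    X1 = np.column_stack([np.ones(n), X])        # prepend the intercept column
    beta_hat = np.linalg.solve(X1.T @ X1, X1.T @ y)
    fitted = X1 @ beta_hat                       # y_hat_i for every observation
    residuals = y - fitted                       # u_hat_i = y_i - y_hat_i
    return beta_hat, fitted, residuals

# Simulated data: hypothetical values, chosen only for illustration.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
u = rng.normal(size=n)
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + u

beta_hat, fitted, residuals = ols_normal_equations(X, y)
print("estimates (intercept, slope 1, slope 2):", beta_hat)
print("sum of residuals:", residuals.sum())             # zero up to rounding
print("regressors' times residuals:", X.T @ residuals)  # zero up to rounding
```

Solving the normal equations directly mirrors the first order conditions; for ill-conditioned data a least-squares routine such as `np.linalg.lstsq` is usually the numerically safer choice.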
The point $(\bar{x}_1, \bar{x}_2, \dots, \bar{x}_k, \bar{y})$ is always on the OLS regression line:
$\bar{y} = \hat\beta_0 + \hat\beta_1 \bar{x}_1 + \hat\beta_2 \bar{x}_2 + \dots + \hat\beta_k \bar{x}_k$

Slide 6: Interpreting Multiple Regression

Slide 7: A “Partialling Out” Interpretation
Consider the case $k = 2$, that is $\hat{y} = \hat\beta_0 + \hat\beta_1 x_1 + \hat\beta_2 x_2$. Then
$\hat\beta_1 = \sum_{i=1}^{n} \hat{r}_{i1} y_i \big/ \sum_{i=1}^{n} \hat{r}_{i1}^2$
where $\hat{r}_{i1}$ are the residuals from a regression of $x_1$ on $x_2$.

Slide 8: “Partialling Out” continued
- The previous equation implies that regressing $y$ on $x_1$ and $x_2$ gives the same effect of $x_1$ as regressing $y$ on the residuals from a regression of $x_1$ on $x_2$.
- This means only the part of $x_{i1}$ that is uncorrelated with $x_{i2}$ is being related to $y_i$, so we're estimating the effect of $x_1$ on $y$ after $x_2$ has been “partialled out”.

Slide 9: Simple vs Multiple Reg Estimate

Slide 10: Goodness-of-Fit

Slide 11: Goodness-of-Fit (continued)
- How do we think about how well our sample regression line fits our sample data?
- We can compute the fraction of the total sum of squares (SST) that is explained by the model, and call this the R-squared of the regression:
$R^2 = SSE/SST = 1 - SSR/SST$

Slide 12: Goodness-of-Fit (continued)

Slide 13: More about R-squared
- $R^2$ can never decrease when another independent variable is added to a regression, and usually will increase.
- Because $R^2$ will usually increase with the number of independent variables, it is not a good way to compare models.

Slide 14: Assumptions for Unbiasedness
- The population model is linear in parameters: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$.
- We can use a random sample of size $n$, $\{(x_{i1}, x_{i2}, \dots, x_{ik}, y_i): i = 1, 2, \dots, n\}$, from the population model, so that the sample model is $y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik} + u_i$.
- $E(u \mid x_1, x_2, \dots, x_k) = 0$, implying that all of the explanatory variables are exogenous.
- None of the x's is constant, and there are no exact linear relationships among them.

Slide 15: Too Many or Too Few Variables
- What happens if we include variables in our specification that don't belong? There is no effect on the unbiasedness of our parameter estimates: OLS remains unbiased.
- What if we exclude a variable from our specification that does belong? OLS will usually be biased.

Slide 16: Omitted Variable Bias

Slide 17: Omitted Variable Bias (cont)

Slide 18: Omitted Variable Bias (cont)

Slide 19: Omitted Variable Bias (cont)

Slide 20: Summary of Direction of Bias

Slide 21: Omitted Variable Bias Summary
- There are two cases where the bias is equal to zero:
  - $\beta_2 = 0$, that is, $x_2$ doesn't really belong in the model;
  - $x_1$ and $x_2$ are uncorrelated in the sample.
- If the correlation between $x_2$ and $x_1$ and the correlation between $x_2$ and $y$ have the same direction, the bias will be positive.
- If the correlation between $x_2$ and $x_1$ and the correlation between $x_2$ and $y$ have opposite directions, the bias will be negative.

Slide 22: The More General Case
- Technically, we can only sign the bias for the more general case if all of the included x's are uncorrelated.
- Typically, then, we work through the bias assuming the x's are uncorrelated, as a useful guide even if this assumption is not strictly true.

Slide 23: Variance of the OLS Estimators
- Now we know that the sampling distribution of our estimate is centered around the true parameter.
- We want to think about how spread out this distribution is.
- It is much easier to think about this variance under an additional assumption, so assume $Var(u \mid x_1, x_2, \dots, x_k) = \sigma^2$ (homoskedasticity).

Slide 24: Variance of OLS (cont)
- Let $x$ stand for $(x_1, x_2, \dots, x_k)$.
- Assuming that $Var(u \mid x) = \sigma^2$ also implies that $Var(y \mid x) = \sigma^2$.
- The 4 assumptions for unbiasedness, plus this homoskedasticity assumption, are known as the Gauss-Markov assumptions.

Slide 25: Variance of OLS (cont)
Given the Gauss-Markov assumptions,
$Var(\hat\beta_j) = \dfrac{\sigma^2}{SST_j (1 - R_j^2)}$
where $SST_j = \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2$ and $R_j^2$ is the $R^2$ from regressing $x_j$ on all the other x's.

Slide 26: Components of OLS Variances
- The error variance: a larger $\sigma^2$ implies a larger variance for the OLS estimators.
- The total sample variation: a larger $SST_j$ implies a smaller variance for the estimators.
- Linear relationships among the independent variables: a larger $R_j^2$ implies a larger variance for the estimators.

Slide 27: Misspecified Models

Slide 28: Misspecified Models (cont)
- While the variance of the estimator is smaller for the misspecified model, unless $\beta_2 = 0$ the misspecified model is biased.
- As the sample size grows, the variance of each estimator shrinks to zero, making the variance difference less important.

Slide 29: Estimating the Error Variance
- We don't know what the error variance, $\sigma^2$, is, because we don't observe the errors, $u_i$.
- What we observe are the residuals, $\hat{u}_i$.
- We can use the residuals to form an estimate of the error variance.

Slide 30: Error Variance Estimate (cont)
$df = n - (k + 1)$, or $df = n - k - 1$
df (i.e. degrees of freedom) is the (numb