Linear Regression: Mean square error (MSE)
Simple linear regression model:
Y_i = β0 + β1*X_i + ε_i , i=1,...,n
where n is the number of data points, ε_i is random error
Let σ^2 = V(ε_i) = V(Y_i)
Then an unbiased estimator of σ^2 is
s^2 = [1/(n-2)] ∑(e_i)^2
where the e_i's are the residuals, e_i = Y_i - Y_i hat, and the sum runs over i=1,...,n
s^2 is called the "mean square error" (MSE).
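
To make the definition concrete, here is a minimal Python sketch of how this s^2 could be computed from data. It assumes NumPy and an ordinary least-squares fit; the function name residual_mse and the sample numbers are made up for illustration and are not from my textbook.

```python
import numpy as np

def residual_mse(x, y):
    """Fit Y = b0 + b1*X by least squares and return s^2 = SSE / (n - 2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    x_bar, y_bar = x.mean(), y.mean()
    # Least-squares estimates of the slope and intercept
    b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    b0 = y_bar - b1 * x_bar
    # Residuals e_i = y_i - fitted value
    e = y - (b0 + b1 * x)
    # Divide by n - 2, not n or n - 1
    return np.sum(e ** 2) / (n - 2)

# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
s2 = residual_mse(x, y)
print(s2)
```

Note that the sum of squares here is taken around the fitted line, not around y bar, which is one visible difference from the usual sample variance formula.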
My concerns:
1) The GENERAL formula for sample variance is s^2 = [1/(n-1)] ∑(y_i - y bar)^2. It is defined on the first pages of my statistics textbook, and I have used it again and again. I don't see how this general formula (which I thought always holds) can reduce to the formula for s^2 given above. Why do we have (n-2) and e_i in the regression formula for s^2?
2) From what I've learnt in previous stat courses, the "mean square error" of a point estimator is by definition
MSE(θ hat) = E[(θ hat - θ)^2]
Is this the same MSE as the s^2 defined above for the regression model? Are they related at all?
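
To show how I read the two definitions side by side, here is a rough simulation sketch in Python. It assumes NumPy, normally distributed errors, and parameter values I picked arbitrarily (n=20, β0=1, β1=2, σ=1.5); the function name simulate_s2 is hypothetical. It treats s^2 as a point estimator of σ^2 and estimates both its bias and E[(s^2 - σ^2)^2] by Monte Carlo.

```python
import numpy as np

def simulate_s2(n=20, beta0=1.0, beta1=2.0, sigma=1.5, reps=20000, seed=0):
    """Return `reps` simulated values of s^2 = SSE/(n-2) from the model
    Y_i = beta0 + beta1*X_i + eps_i, with eps_i ~ Normal(0, sigma^2) (assumed)."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 10.0, n)           # fixed design points (arbitrary choice)
    eps = rng.normal(0.0, sigma, size=(reps, n))
    y = beta0 + beta1 * x + eps              # one simulated data set per row
    x_c = x - x.mean()
    # Least-squares slope and intercept for every row at once
    b1 = (y - y.mean(axis=1, keepdims=True)) @ x_c / np.sum(x_c ** 2)
    b0 = y.mean(axis=1) - b1 * x.mean()
    resid = y - (b0[:, None] + b1[:, None] * x)
    return np.sum(resid ** 2, axis=1) / (n - 2)

sigma = 1.5
s2_draws = simulate_s2(sigma=sigma)

# Bias of s^2 as an estimator of sigma^2 (should be near zero if unbiased)
bias = s2_draws.mean() - sigma ** 2
# Monte Carlo estimate of MSE(s^2) = E[(s^2 - sigma^2)^2]
mse_of_s2 = np.mean((s2_draws - sigma ** 2) ** 2)
print(bias, mse_of_s2)
```

In this sketch s^2 (the regression "MSE") is a statistic computed from one data set, while MSE(s^2) is an expectation over repeated data sets, so numerically they are different quantities.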
Any help is greatly appreciated!
note: also under discussion in Talk Stats forum