# Thread: Mean of the residuals (e-bar)

1. ## Mean of the residuals (e-bar)

So I can prove that the sum of the residuals equals zero, but how do I prove that the mean of e equals zero? (e is the residuals in this case, not the irrational e.)

I've tried but all I end up doing is going around in a circle.

2. Let $R_i$ be the random variable denoting the $i$-th residual error, and suppose every $X_i$ has the same mean $\mu$. Then
$R_i = X_i - \frac{1}{n}\sum_{j=1}^n X_j, \quad i=1, \ldots, n$
Then it follows, by linearity of expectation, that
$E[R_i] = E[X_i] - \frac{1}{n} \sum_{j=1}^n E[X_j]$
$E[R_i] = \mu - \frac{1}{n} \sum_{j=1}^n \mu = \mu - \mu = 0$
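
For anyone who wants a numerical sanity check of this argument, here is a minimal sketch in plain Python. The normal distribution, the sample size, the values of the mean and standard deviation, and the function name are all assumptions made for illustration. It averages $R_1 = X_1 - \bar X$ over many simulated samples; a simulation like this only suggests the result and is not a proof.

```python
import random

def mean_deviation_of_first(n, mu, sigma, trials, seed=0):
    """Average of R_1 = X_1 - X-bar over many simulated samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        total += xs[0] - sum(xs) / n  # residual of the first observation
    return total / trials

# Hypothetical settings: 5 observations per sample, mean 10, sd 2.
avg_r1 = mean_deviation_of_first(n=5, mu=10.0, sigma=2.0, trials=20000)
print(avg_r1)  # should be close to zero, up to simulation noise
```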

3. I proved that the sum of the residuals is zero in another post.
BUT that's not true if you don't have an intercept term in your model.
HERE it is:
http://www.mathhelpforum.com/math-he...-question.html
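
To see the intercept point with numbers, here is a small sketch in plain Python. It is not the proof from the linked post; the data values and function names are made up for illustration. It fits the same four points once with an intercept and once through the origin, then sums the residuals in each case.

```python
def fit_with_intercept(xs, ys):
    """Least squares line y = b0 + b1*x using the SSxy / SSxx formulas."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def fit_through_origin(xs, ys):
    """Least squares slope for the no-intercept model y = b*x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Made-up data, purely for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 3.9, 5.2, 5.8]

b0, b1 = fit_with_intercept(xs, ys)
sum_with = sum(y - (b0 + b1 * x) for x, y in zip(xs, ys))

slope = fit_through_origin(xs, ys)
sum_without = sum(y - slope * x for x, y in zip(xs, ys))

print(sum_with)     # zero up to floating-point rounding
print(sum_without)  # clearly nonzero for this data
```

With the intercept, the residuals sum to zero (up to rounding). Without it, nothing forces the sum to vanish, and for this data it does not.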

4. I have been told that the definition of the residual is:

$e_i = y_i - \hat y_i$

(If that makes sense)

I understand the proof given by cl85 but I'm not sure how to fit it to the equation for e that I have been given.

5. I did indeed look at it, but that's the proof for the sum of the residuals - something I already understand how to do. I need help with the expected value of the residual.

6. I do not even know what your model is, nor if you understand how to do least squares with matrices.
But here's a simple proof for ANY model that the expected value of the residuals is zero.

The general model is $Y=X\beta +\epsilon$ where $\epsilon$ is a column vector of errors, usually assumed normal, with ZERO means, and $X$ is treated as fixed (not random).

HENCE $E(Y)=X\beta +E(\epsilon)=X\beta$

NOW the least squares fit is $\hat\beta=(X^tX)^{-1}X^tY$

and it's real easy to prove that $E(\hat\beta)=\beta$: since $X$ is not random, $E(\hat\beta)=(X^tX)^{-1}X^tE(Y)=(X^tX)^{-1}X^tX\beta=\beta$.

Our least squares fit is $\hat Y=X\hat\beta$ and $E(\hat Y)=XE(\hat\beta)=X\beta$.

SO the expected value of our residuals is $E(Y-\hat Y)=X\beta -X\beta=0$.
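
The matrix argument can be checked numerically with a short sketch in plain Python. The design matrix, the value of $\beta$, and the helper names are assumptions chosen for illustration, and the inverse is hard-coded for the 2x2 case only. Because $\hat\beta$ is linear in $Y$, feeding $E(Y)=X\beta$ into the least squares formula gives $E(\hat\beta)$, which should come back as $\beta$.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix (enough for intercept plus one slope)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def least_squares(X, Y):
    """beta-hat = (X^t X)^{-1} X^t Y for a two-column X."""
    Xt = transpose(X)
    return matmul(matmul(inverse_2x2(matmul(Xt, X)), Xt), Y)

# Hypothetical design: a column of ones (intercept) and x = 1, 2, 3, 4.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]
beta = [[2.0], [0.5]]  # made-up true parameters

expected_Y = matmul(X, beta)                   # E(Y) = X beta
beta_from_mean = least_squares(X, expected_Y)  # equals E(beta-hat) by linearity
expected_fit = matmul(X, beta_from_mean)       # E(Y-hat) = X E(beta-hat)
expected_residuals = [ey[0] - fy[0] for ey, fy in zip(expected_Y, expected_fit)]

print(beta_from_mean)      # matches beta up to rounding
print(expected_residuals)  # all zero up to rounding
```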

7. Ok, now I'm just red in the face. I feel like such an idiot.

matheagle - You are brilliant. You really are. And.... yeah, I feel like an idiot. I get you now.

Just know that this girl is very, very thankful to you.

8. I'm flattered, but I don't know what I did.
Don't let Mr Fantasy see your comments, he thinks I'm a prima donna.
(And MOO may be jealous too, so keep your comments to a minimum.)
I thought that you wanted to see that the sum was zero.
That's the usual question here.
Without knowing your model I wasn't sure what you wanted me to do.
I can easily prove this for any model w/o matrices.
BUT matrices are the way to go.
I'm typing the exam I'm giving on this topic right now.

9. Is it, uhh... at all possible for you to prove it without matrices?
Just that this question was given to us before we started using matrices, so I can't use matrices.

And as for the model, our model for the regression line is:
$y=\hat\beta_0+\hat\beta_1 x$
with residual:
$e_i=y_i-\hat y_i$

Sorry I didn't give that before.

10. Sure, it works either way.
BUT it doesn't matter what your model is.
Have you proved that $E(\hat\beta_0) =\beta_0$ and $E(\hat\beta_1) =\beta_1$, i.e. that the estimators are unbiased for these parameters?

Your original model is $Y=\beta_0+\beta_1 x+\epsilon$

where the only random variable is the $\epsilon$ and it has mean zero.

THUS $E(Y)=\beta_0+\beta_1 x$.

NOW if you have proved $E(\hat\beta_0) =\beta_0$ and $E(\hat\beta_1) =\beta_1$ it follows from $\hat Y=\hat\beta_0+\hat\beta_1 x$ that $E(\hat Y)=\beta_0+\beta_1 x$, and so $E(e)=E(Y)-E(\hat Y)=0$.

I just did the scratch work to prove $E(\hat\beta_0) =\beta_0$ and $E(\hat\beta_1) =\beta_1$ via the SSxy and SSxx formulas.
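
If it helps to see this happen, here is a rough simulation sketch in plain Python. The parameter values, the normal errors, the x values, and the function names are all assumptions for illustration. It generates many data sets from $Y=\beta_0+\beta_1 x+\epsilon$, fits the line each time, and averages the residual at each $x_i$ across data sets. The averages should be near zero, up to simulation noise.

```python
import random

def fit_line(xs, ys):
    """Least squares estimates (b0, b1) via the SSxy / SSxx formulas."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = ss_xy / ss_xx
    return y_bar - b1 * x_bar, b1

def average_residuals(xs, beta0, beta1, sigma, trials, seed=1):
    """Average residual at each x over many simulated data sets."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    totals = [0.0] * len(xs)
    for _ in range(trials):
        ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
        b0, b1 = fit_line(xs, ys)
        for i, (x, y) in enumerate(zip(xs, ys)):
            totals[i] += y - (b0 + b1 * x)
    return [t / trials for t in totals]

# Hypothetical setup: beta0 = 2, beta1 = 0.5, error sd = 1.
avg_resid = average_residuals([1.0, 2.0, 3.0, 4.0, 5.0], 2.0, 0.5, 1.0, 20000)
print(avg_resid)  # each entry should be close to zero
```

Note the difference between the two facts in this thread: within every single data set the residuals sum to exactly zero, while the expected value statement is about each individual residual averaged over repeated data sets.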

11. Thank you, thank you, thank you, thank you, thank you, thank you, THANK YOU.

Mr Fantasy and MOO can be jealous for all I care. This helps so much.

Sidenote: I hate that whenever I see an answer to a proof, it always looks so easy.

12. Do you need this with $\hat\beta_0 =\bar Y-\hat\beta_1\bar x$ and $\hat\beta_1 ={SS_{xy}\over SS_{xx}}$?

Start with $Y_i=\beta_0 +\beta_1 x_i+\epsilon_i$ and $\bar Y=\beta_0 +\beta_1 \bar x+\bar\epsilon$

$E(Y_i)=\beta_0 +\beta_1 x_i+0$

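Since $\sum_{i=1}^n(x_i-\bar x)=0$, we have $SS_{xy}=\sum_{i=1}^n(x_i-\bar x)(Y_i-\bar Y)=\sum_{i=1}^n(x_i-\bar x)Y_i$, and $SS_{xx}=\sum_{i=1}^n(x_i-\bar x)^2$,

so $E(\hat\beta_1) ={\sum_{i=1}^n(x_i-\bar x)E(Y_i)\over \sum_{i=1}^n(x_i-\bar x)^2}$

$={\sum_{i=1}^n(x_i-\bar x)(\beta_0 +\beta_1 x_i)\over \sum_{i=1}^n(x_i-\bar x)^2} =\beta_0 {\sum_{i=1}^n(x_i-\bar x)\over \sum_{i=1}^n(x_i-\bar x)^2}+\beta_1{\sum_{i=1}^n(x_i-\bar x)x_i\over \sum_{i=1}^n(x_i-\bar x)^2}=\beta_1$

because $\sum_{i=1}^n(x_i-\bar x)=0$ and $\sum_{i=1}^n(x_i-\bar x)x_i=\sum_{i=1}^n(x_i-\bar x)^2$.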

Next, $E(\bar Y)=\beta_0 +\beta_1 \bar x+0$

SO, using the fact that $\hat\beta_1$ is unbiased for $\beta_1$

we have $E(\hat\beta_0)=E(\bar Y) -E(\hat\beta_1)\bar x= \beta_0 +\beta_1 \bar x-\beta_1 \bar x= \beta_0$.
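
The two sum identities that drive this derivation, and the final answers, can be checked with a few lines of plain Python. The x values and the parameter values are made up for illustration, and the variable names are my own. Since $\hat\beta_1$ is linear in the $Y_i$, plugging $E(Y_i)=\beta_0+\beta_1 x_i$ into the formula gives $E(\hat\beta_1)$ directly.

```python
# Hypothetical x values and true parameters, for illustration only.
xs = [1.0, 2.0, 4.0, 7.0]
beta0, beta1 = 2.0, 0.5

n = len(xs)
x_bar = sum(xs) / n
devs = [x - x_bar for x in xs]

sum_devs = sum(devs)                               # sum of (x_i - x_bar)
ss_xx = sum(d * d for d in devs)                   # sum of (x_i - x_bar)^2
sum_dev_x = sum(d * x for d, x in zip(devs, xs))   # sum of (x_i - x_bar) x_i

# E(Y_i) = beta0 + beta1 * x_i, and E(Y-bar) is the average of those.
expected_ys = [beta0 + beta1 * x for x in xs]
expected_y_bar = sum(expected_ys) / n

# Linearity of expectation: E(beta1-hat) uses E(Y_i) in place of Y_i.
expected_b1 = sum(d * ey for d, ey in zip(devs, expected_ys)) / ss_xx
expected_b0 = expected_y_bar - expected_b1 * x_bar

print(sum_devs, ss_xx, sum_dev_x)
print(expected_b0, expected_b1)
```

The first identity kills the $\beta_0$ term and the second turns the $\beta_1$ term into $\beta_1 SS_{xx}/SS_{xx}$, which is why the printed expectations match the chosen parameters.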

13. Corrections here...

the model is $y=\beta_0+\beta_1 x+\epsilon$

and the least squares fit (line through the data) is $\hat y=\hat\beta_0+\hat\beta_1 x$

The $\hat\beta$'s are the estimators (random variables) of the unknown parameters (constants) $\beta$'s.

> Originally posted by blueirony:
> Is it, uhh... at all possible for you to prove it without matrices?
> Just that this question was given to us before we started using matrices, so I can't use matrices.
>
> And as for the model, our model for the regression line is:
> $y=\hat\beta_0+\hat\beta_1 x$
> with residual:
> $e_i=y_i-\hat y_i$
>
> Sorry I didn't give that before.