# Math Help - Variance of sample mean (not independent)

1. ## Variance of sample mean (not independent)

Suppose that the random variables $X_1,X_2,...,X_n$ are identically distributed, with mean $\mu$ and variance $\sigma ^2$, but not independent. Assume that the correlation between any pair is equal to $\rho$. That is, $corr(X_i , X_j)= \rho$ for $i \neq j$.

i) What is $Var(\bar{X})$ when $\rho=1$? Explain.

ii) Use the result you have derived to comment on how small $\rho$ can be in this situation. Explain.

So I got $Var(\bar{X})= \frac{\sigma ^2}{n}+\frac{(n-1) \rho}{n}$

i) If $\rho=1$, then I get $Var(\bar{X})= \frac{\sigma ^2+(n-1)}{n}$
What would the reason behind that be? When $\rho=1$, is there an explanation I can give for this?

For ii), I'm not quite sure how to approach this one. Can someone help me? All I know is that if $\rho=0$, then I get the well-known result $\frac{\sigma ^2}{n}$.

Thank-you!

2. ## Re: Variance of sample mean (not independent)

Hey lpd.

The first mathematical point is that the estimator would not be consistent, which means there's no real point in trying to use it for inference purposes.

For i), it means that as $n$ grows and dominates, the variance tends to 1, which raises issues of consistency.

For ii), we need to consider what a fixed variance implies about this particular estimator under full correlation. No matter what we do, there is a fixed lower bound for the variance; once we reach it, the uncertainty doesn't improve, and this is the limit of certainty in estimating the mean when the variables are correlated.

This result says that at this limit the distribution of the mean stays the same: when things are correlated, there is more uncertainty in figuring out this parameter and, regardless of what we do, we can't do better than this.

If our distribution is normal, then the best we can do for, say, a 95% interval (even with enough observations to get close to this limit) is an interval for the difference between the estimate and the parameter of -1.96 to +1.96 (or close enough to it).

So correlated variables actually interfere with getting a consistent estimator of the population mean, and this helps demonstrate how the assumption of i.i.d. observations (or at least something close enough) is important for consistent estimation of the population mean.

3. ## Re: Variance of sample mean (not independent)

I actually derived the variance incorrectly.

It should be $Var(\bar{X})= \frac{\sigma ^2}{n}+\frac{(n-1) \rho}{n}\sigma ^2$

So if $\rho=1$, then I get $Var(\bar{X})= \sigma ^2$
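
For reference, the corrected formula can be recovered by expanding the variance of a sum. This is a sketch that assumes only the setup stated in the question (common variance $\sigma^2$ and common pairwise correlation $\rho$, so that $Cov(X_i, X_j)=\rho\sigma^2$ for $i \neq j$):

$$
\begin{aligned}
Var(\bar{X}) &= \frac{1}{n^2}\left[\sum_{i=1}^{n} Var(X_i) + \sum_{i \neq j} Cov(X_i, X_j)\right] \\
&= \frac{1}{n^2}\left[n\sigma^2 + n(n-1)\rho\sigma^2\right] \\
&= \frac{\sigma^2}{n} + \frac{(n-1)\rho}{n}\sigma^2
\end{aligned}
$$

Setting $\rho=1$ gives $\frac{\sigma^2}{n}(1 + n - 1) = \sigma^2$, which matches the value obtained for $\rho=1$.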

4. ## Re: Variance of sample mean (not independent)

Thanks for the correction: it's a good reminder to keep sharp and switched on (for everyone that is).

In that case, it means your estimation is completely useless regardless of how many observations you have, because they are all going to be the same observation.

So if you have this scenario (provided your results are correct; I haven't checked and am only going off your update), then your estimator is not even a consistent one (variance-wise), so trying to estimate the population mean is only as good as having an estimate from 1 independent observation drawn from your distribution, which is kind of pointless.

So the question you need to settle, based on this, is: if all observations are essentially perfectly correlated, is this the same as having only one independent observation in terms of estimation and inference?

5. ## Re: Variance of sample mean (not independent)

That makes sense, a fair point you made. But I guess, what is the reason behind that? It's just so weird how it turns out like that, and how one independent variable will give the same thing (as you said).

And for ii), how small can $\rho$ be?

6. ## Re: Variance of sample mean (not independent)

Well, you might want to ask: if every variable has a perfect positive correlation of 1 with every other, is each one purely a function of the others? And if all the distributions are exactly the same, does it make sense that they all refer to exactly the same variable?
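
A small simulation can make this concrete. The sketch below is not from the thread: it assumes normal variables and a hypothetical construction in which every $X_i$ shares one common factor, which reproduces a common correlation $\rho$ only for $0 \le \rho \le 1$. The function names `theoretical_var` and `simulated_var` are illustrative. With $\rho=1$ every $X_i$ in a sample is the same number, so averaging changes nothing.

```python
import random
import statistics


def theoretical_var(n, rho, sigma):
    """Variance of the sample mean from the corrected formula in the thread."""
    return sigma**2 / n * (1 + (n - 1) * rho)


def simulated_var(n, rho, sigma=1.0, mu=0.0, trials=20000, seed=0):
    """Empirical variance of the sample mean for equicorrelated normals.

    Assumption (not from the thread): each X_i is built from a shared
    factor Z and a private noise term E_i,
        X_i = mu + sigma * (sqrt(rho) * Z + sqrt(1 - rho) * E_i),
    which gives Var(X_i) = sigma^2 and corr(X_i, X_j) = rho for 0 <= rho <= 1.
    """
    rng = random.Random(seed)
    a, b = rho**0.5, (1 - rho) ** 0.5
    means = []
    for _ in range(trials):
        z = rng.gauss(0, 1)  # shared by every X_i in this sample
        xs = [mu + sigma * (a * z + b * rng.gauss(0, 1)) for _ in range(n)]
        means.append(sum(xs) / n)
    return statistics.variance(means)


if __name__ == "__main__":
    for rho in (0.0, 0.5, 1.0):
        print(rho, theoretical_var(10, rho, 2.0), simulated_var(10, rho, sigma=2.0))
```

For $\rho=1$ the private noise term is multiplied by zero, so each simulated sample consists of $n$ copies of one draw and the empirical variance of the mean should sit close to $\sigma^2$.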

7. ## Re: Variance of sample mean (not independent)

That makes a lot of sense now! Thank-you!

But what I am not quite sure of is
ii) Use the result you have derived to comment on how small $\rho$ can be in this situation. Explain.

I tried simplifying it to
$Var(\bar{X})= \frac{\sigma ^2}{n}(1+(n-1) \rho)$

and wrote it as $\frac{\sigma ^2}{n}(1- \rho + n \rho)$
and then noted that $1- \rho + n \rho =n$ when $\rho=1$.

Then I am not sure what to do. Can someone help me here? I'm not sure how small $\rho$ can be in this situation, or the reason behind it. Thanks!

8. ## Re: Variance of sample mean (not independent)

Well from an estimator point of view, you want to evaluate consistency of the estimator.

For an unbiased estimator such as $\bar{X}$, the usual check is that the variance approaches zero as the sample size grows, which is sufficient for consistency.
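
To see what happens to the variance as $n$ grows, it can help to tabulate the formula $\frac{\sigma^2}{n}(1+(n-1)\rho)$ directly. The following is a minimal sketch; the helper names `var_of_mean` and `variance_table` and the chosen sample sizes are illustrative assumptions, not part of the thread.

```python
def var_of_mean(n, rho, sigma=1.0):
    """Variance of the sample mean under a common pairwise correlation rho."""
    return sigma**2 / n * (1 + (n - 1) * rho)


def variance_table(rho, sigma=1.0, sizes=(1, 10, 100, 1000, 10000)):
    """Return (n, variance) pairs showing how the variance behaves as n grows."""
    return [(n, var_of_mean(n, rho, sigma)) for n in sizes]


if __name__ == "__main__":
    for rho in (0.0, 0.2, 1.0):
        print(f"rho = {rho}")
        for n, v in variance_table(rho):
            print(f"  n = {n:>5}  Var = {v:.4f}")
```

Since the formula can be rewritten as $\rho\sigma^2 + \frac{(1-\rho)\sigma^2}{n}$, the variance levels off near $\rho\sigma^2$ for a fixed positive $\rho$ instead of shrinking to zero.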

The thing you probably want to look at is what conditions the correlation must satisfy for this to hold, or to come close enough to holding.

So for the $(1 - \rho + n\rho)$ term, you want to see what kind of value $\rho$ should have so that it is low enough not to dominate the variance and prevent consistency.
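
One way to read part ii), offered as a sketch and not necessarily the intended textbook answer: a variance can never be negative, so the factored formula itself restricts $\rho$. Assuming $\sigma^2 > 0$ and $n \ge 2$:

$$
Var(\bar{X}) = \frac{\sigma^2}{n}\left(1 + (n-1)\rho\right) \ge 0
\quad\Longrightarrow\quad
\rho \ge -\frac{1}{n-1}
$$

So with a common pairwise correlation, $\rho$ cannot fall below $-\frac{1}{n-1}$. At that bound $Var(\bar{X})=0$, and as $n$ grows the bound approaches 0 from below, so a large collection of variables cannot all be strongly negatively correlated with one another.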