# Thread: [SOLVED] Large samples confidence interval for difference in means

1. ## [SOLVED] Large samples confidence interval for difference in means

The following distinguishes TWO cases for the large-sample confidence interval for a difference in means:

Case 1 (population variances unknown but assumed equal):
(Xbar1 - Xbar2) +/- z_alpha/2 * Sp * sqrt(1/n1 + 1/n2)

Case 2 (population variances unknown, not assumed equal):
(Xbar1 - Xbar2) +/- z_alpha/2 * sqrt(S1^2/n1 + S2^2/n2)

where Sp^2 is the pooled estimate of the common variance, n1 and n2 are the sample sizes from the first and second populations, and z_alpha/2 is the 100(1 - alpha/2)th percentile of the standard normal distribution.
==========================

It seems to me that case 1 is a special case of case 2 with the population variances being equal. If so, the formula for case 2 should reduce to the formula for case 1 when the population variances are equal, but I cannot see how that happens.
[Aside: I am trying to cut down on the number of formulas I have to memorize. If case 2 contains case 1, then I only need to memorize the general case 2 formula, which would be nice.]

Could somebody please show me how I can reduce case 2 to case 1?
Any help would be appreciated!
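For concreteness, the two intervals can be sketched in Python (function and variable names are mine, and any numbers used are illustrative, not from a real data set):

```python
import math
from statistics import NormalDist

def ci_pooled(x1bar, x2bar, s1_sq, s2_sq, n1, n2, alpha=0.05):
    """Case 1: variances unknown but assumed equal, so pool them."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Sp^2 = [(n1-1)S1^2 + (n2-1)S2^2] / (n1 + n2 - 2)
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    half = z * math.sqrt(sp_sq * (1 / n1 + 1 / n2))
    d = x1bar - x2bar
    return (d - half, d + half)

def ci_unpooled(x1bar, x2bar, s1_sq, s2_sq, n1, n2, alpha=0.05):
    """Case 2: variances unknown, each estimated separately."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z * math.sqrt(s1_sq / n1 + s2_sq / n2)
    d = x1bar - x2bar
    return (d - half, d + half)
```

Both return a (lower, upper) pair; in general the two intervals differ, which is exactly the question being asked.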

2. Originally Posted by kingwinner
The following distinguishes TWO cases for the large-sample confidence interval for a difference in means:

Case 1 (population variances unknown but assumed equal):
(Xbar1 - Xbar2) +/- z_alpha/2 * Sp * sqrt(1/n1 + 1/n2)

Case 2 (population variances unknown, not assumed equal):
(Xbar1 - Xbar2) +/- z_alpha/2 * sqrt(S1^2/n1 + S2^2/n2)

where Sp^2 is the pooled estimate of the common variance, n1 and n2 are the sample sizes from the first and second populations, and z_alpha/2 is the 100(1 - alpha/2)th percentile of the standard normal distribution.
==========================

It seems to me that case 1 is a special case of case 2 with the population variances being equal. If so, the formula for case 2 should reduce to the formula for case 1 when the population variances are equal, but I cannot see how that happens.
[Aside: I am trying to cut down on the number of formulas I have to memorize. If case 2 contains case 1, then I only need to memorize the general case 2 formula, which would be nice.]

Could somebody please show me how I can reduce case 2 to case 1?
Any help would be appreciated!
The two cases are not equivalent. In one case you have additional information: you know the variances are equal, so the pooled variance is the best estimate of the common variance. In the other case the variances may just happen to be equal, but you don't know that, so you have to estimate each one separately.

CB

3. Originally Posted by CaptainBlack
The two cases are not equivalent. In one case you have additional information: you know the variances are equal, so the pooled variance is the best estimate of the common variance. In the other case the variances may just happen to be equal, but you don't know that, so you have to estimate each one separately.

CB
I don't get your point... So is the second case only for population variances that are unknown and unequal? (So that case 1 and case 2 are mutually exclusive?)

Honestly, as both sample sizes go to infinity, the Central Limit Theorem makes both approximations normal.
The point of the pooled estimator is the Chi-Square distribution in the denominator of the t statistic.
(But there you must assume normality of the two independent samples, in addition to equality of the population variances.)
That is where you really see the point of using S_p^2: when the n's are small.
The pooled variance is a weighted average of the two sample variances:

S_p^2 = [(n_1-1)/(n_1+n_2-2)] S_1^2 + [(n_2-1)/(n_1+n_2-2)] S_2^2.

So if the two sample sizes are equal,

S_p^2 = (S_1^2 + S_2^2)/2.
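A quick numerical check of that weighting (the function name is mine and the numbers are illustrative):

```python
def pooled_variance(s1_sq, s2_sq, n1, n2):
    # S_p^2 = [(n1-1)S1^2 + (n2-1)S2^2] / (n1 + n2 - 2)
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

# Equal sample sizes: both weights are 1/2, so this is a plain average.
print(pooled_variance(4.0, 6.0, 10, 10))   # 5.0
# Unequal sizes: the larger sample's variance gets more weight,
# so the result is pulled toward 4 rather than 6.
print(pooled_variance(4.0, 6.0, 30, 10))
```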

5. Originally Posted by kingwinner
I don't get your point... So is the second case only for population variances that are unknown and unequal? (So that case 1 and case 2 are mutually exclusive?)
Case 1:
The variances are unknown but known (assumed) to be equal.

Case 2:
The variances are unknown and not known to be equal.

There is no case of unknown variances known to be unequal (unless you want to develop that theory yourself).

Also see what matheagle posted: the two intervals are asymptotically equivalent (when the variances are, or just happen to be, equal), which is all you need for large-sample statistics.
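That agreement is easy to check numerically. A small Python sketch (function names mine, numbers made up): if the two sample variances happen to take the same value, the two estimated standard errors coincide exactly; and under a common population variance, S_1^2 and S_2^2 both converge to sigma^2 as the sample sizes grow, so the two intervals merge asymptotically.

```python
import math

def se_pooled(s1_sq, s2_sq, n1, n2):
    # standard error using the pooled variance (case with assumed-equal variances)
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    return math.sqrt(sp_sq * (1 / n1 + 1 / n2))

def se_separate(s1_sq, s2_sq, n1, n2):
    # standard error estimating each variance separately
    return math.sqrt(s1_sq / n1 + s2_sq / n2)

# Equal sample variances: the two standard errors agree exactly,
# whatever n1 and n2 are, since pooling two equal values returns that value.
assert math.isclose(se_pooled(2.5, 2.5, 40, 90), se_separate(2.5, 2.5, 40, 90))
```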

CB

Satterthwaite has an approximation for unequal sample sizes and unequal variances, which can be found in:
Student's t-test - Wikipedia, the free encyclopedia
It is really meant for small-sample situations. When the n's are large you can do almost anything; the CLT saves you, together with Slutsky's theorem.
What's interesting is that if Sigma_1 = c * Sigma_2 for a known constant c, then we can obtain a result similar to the pooling situation. That can be found in many textbooks, usually as a homework problem. It follows from the fact that the sum of two independent Chi-Square random variables is again Chi-Square; that sum then goes in the denominator of the t statistic.
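As a sketch, the Welch-Satterthwaite degrees of freedom from the cited Wikipedia article can be computed like this (function name mine, numbers illustrative):

```python
def welch_satterthwaite_df(s1_sq, s2_sq, n1, n2):
    # nu = (S1^2/n1 + S2^2/n2)^2 /
    #      [ (S1^2/n1)^2/(n1-1) + (S2^2/n2)^2/(n2-1) ]
    num = (s1_sq / n1 + s2_sq / n2) ** 2
    den = (s1_sq / n1) ** 2 / (n1 - 1) + (s2_sq / n2) ** 2 / (n2 - 1)
    return num / den

# With equal sample variances and equal n's this recovers the
# pooled degrees of freedom n1 + n2 - 2:
print(welch_satterthwaite_df(4.0, 4.0, 10, 10))  # ~18, i.e. 10 + 10 - 2
```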

7. Originally Posted by matheagle
Satterthwaite has an approximation for unequal sample sizes and unequal variances, which can be found in:
Student's t-test - Wikipedia, the free encyclopedia
It is really meant for small-sample situations. When the n's are large you can do almost anything; the CLT saves you, together with Slutsky's theorem.
What's interesting is that if Sigma_1 = c * Sigma_2 for a known constant c, then we can obtain a result similar to the pooling situation. That can be found in many textbooks, usually as a homework problem. It follows from the fact that the sum of two independent Chi-Square random variables is again Chi-Square; that sum then goes in the denominator of the t statistic.
Since what we are calculating is an estimate of the variance of the difference of the two sample means, almost any known relationship between the variances of the two processes will allow us (if we wish to make the effort) to develop a slightly better estimate of this variance than the one used in the unknown-variances case.

CB

8. Problem resolved! Thank you!