# Proof of equivalence of two different ways to calculate variance

• Aug 16th 2010, 10:39 AM
anoia
Proof of equivalence of two different ways to calculate variance
There is an easy and obvious way to calculate an unbiased estimate of the population variance from a sample:

$\displaystyle s^2 = \frac{\sum_{i=1}^n (x_i - \bar x)^2}{n-1}.$

Then there is the way my textbook (and others) use:

$\displaystyle s^2 = \frac{\sum_{i=1}^n x_i^2 - \left(\sum_{i=1}^n x_i\right)^2/n}{n-1}.$

Whilst I can sort of intuitively see why these might be equivalent (one takes the sum of squared differences from the mean, the other takes the sum of the squared data and subtracts the squared sum of the data divided by n), I still don't understand why they compute the same result, and would be interested in a proof of it.
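As a quick sanity check before any proof, the two formulas can be evaluated side by side. The following is only a minimal Python sketch: the function names `variance_definitional` and `variance_shortcut` and the sample values are made up for illustration and do not come from any textbook.

```python
def variance_definitional(xs):
    """First formula: sum of squared deviations from the mean, over n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def variance_shortcut(xs):
    """Second formula: (sum of squares - (sum)^2 / n), over n - 1."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)


# Hypothetical sample chosen so the mean is a whole number (5).
sample = [2, 4, 4, 4, 5, 5, 7, 9]

# Both calls should give 32/7 for this sample.
print(variance_definitional(sample))
print(variance_shortcut(sample))
```

Agreement on one sample is of course not a proof; it only shows what the proof has to establish.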

I've checked my statistics textbook, several others, Wikipedia, Wolfram MathWorld, and Googled extensively, but haven't been able to find a proof. I suspect that this is one of those infuriating questions where the answer is everywhere, but it's impossible to conjure the right search terms. If anybody could provide a proof, it would be much appreciated :)

Thanks,
• Aug 16th 2010, 11:49 AM
Soroban
Hello, anoia!

Quote:

There are two formulas for the variance:

$\displaystyle [1]\;s^2 \;=\; \frac{1}{n-1}\,\sum (x_i - \overline x)^2$

$\displaystyle[2]\; s^2 \;=\; \frac{1}{n-1}\bigg[\sum x_i^2 - \frac{(\sum x_i)^2}{n}\bigg]$

All summations are from $i = 1$ to $i = n.$

Recall that $\overline x \:=\:\dfrac{\sum x_i}{n}$

$\dfrac{1}{n-1}\sum(x_i - \overline x)^2 \;=\;\dfrac{1}{n-1}\sum\bigg[x_i^2 - 2\overline x x_i + \overline x^2\bigg]$

$=\; \dfrac{1}{n-1}\bigg[\sum x_i^2 - 2\overline x \sum x_i + \overline x^2 \sum 1 \bigg]$

Since $\sum 1 = n$ and $\overline x = \dfrac{\sum x_i}{n}$:

$=\;\dfrac{1}{n-1}\bigg[\sum x_i^2 - 2\left(\dfrac{\sum x_i}{n}\right)\left(\sum x_i\right) + n\overline x^2\bigg]$

Substituting for $\overline x$ in the last term as well:

$=\; \dfrac{1}{n-1}\bigg[\sum x_i^2 - \dfrac{2\left(\sum x_i\right)^2}{n} + n\left(\dfrac{\sum x_i}{n}\right)^2\bigg]$

$=\; \dfrac{1}{n-1}\bigg[\sum x_i^2 - \dfrac{2\left(\sum x_i\right)^2}{n} + \dfrac{\left(\sum x_i\right)^2}{n}\bigg]$

$=\; \dfrac{1}{n-1}\bigg[\sum x_i^2 - \dfrac{\left(\sum x_i\right)^2}{n} \bigg]$
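Each line of the derivation can also be checked numerically with exact arithmetic. The sketch below uses Python's `fractions.Fraction` so that no rounding is involved; the data values and the names `S1`, `S2` and `steps` are arbitrary choices for illustration, not part of the derivation itself.

```python
from fractions import Fraction

# Hypothetical data; any sample with n >= 2 would do.
xs = [Fraction(v) for v in (3, 1, 4, 1, 5, 9, 2, 6)]
n = len(xs)
S1 = sum(xs)                    # sum of x_i
S2 = sum(x * x for x in xs)     # sum of x_i^2
mean = S1 / n                   # x-bar

# The bracketed expression on each successive line of the derivation
# (the common factor 1/(n-1) is left out).
steps = [
    sum((x - mean) ** 2 for x in xs),
    sum(x * x - 2 * mean * x + mean ** 2 for x in xs),
    S2 - 2 * mean * S1 + n * mean ** 2,
    S2 - 2 * S1 ** 2 / n + n * (S1 / n) ** 2,
    S2 - 2 * S1 ** 2 / n + S1 ** 2 / n,
    S2 - S1 ** 2 / n,
]

# Every line should equal the first one exactly.
all_equal = all(step == steps[0] for step in steps)
s_squared = steps[-1] / (n - 1)
print(all_equal, s_squared)
```

Because the arithmetic is exact, any algebra slip in one of the lines would show up as `all_equal` being `False` rather than as a small rounding difference.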