There is an easy and obvious way to calculate an unbiased estimate of the population variance from a sample $x_1, \dots, x_n$:

$\displaystyle s^2 = \frac {\sum_{i=1}^n (x_i - \bar x)^2}{n-1}.$

Then there is the form my textbook (and others) use:

$\displaystyle s^2 = \frac {\sum_{i=1}^n x_i^2 - \left(\sum_{i=1}^n x_i\right)^2/n}{n-1}.$

Whilst I can sort of intuitively see why these might be equivalent (one takes the sum of squared deviations from the mean; the other takes the sum of the squared data points minus the square of their sum divided by $n$), I still don't understand why they compute the same result, and would be interested in a proof.
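To convince myself they really do agree, I ran a quick numerical check on an arbitrary sample (Python is just my choice here; the data values are made up):

```python
# Compare the two variance formulas on a small made-up sample.
xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(xs)
mean = sum(xs) / n

# Definitional form: sum of squared deviations from the sample mean.
s2_def = sum((x - mean) ** 2 for x in xs) / (n - 1)

# Textbook "computational" form: sum of squares minus (sum)^2 / n.
s2_comp = (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)

print(s2_def, s2_comp)  # both print 4.571428571428571
```

They match to machine precision on every sample I tried, so the identity certainly holds numerically; what I'm after is the algebraic proof.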

I've checked my statistics textbook, several others, Wikipedia, and Wolfram MathWorld, and Googled extensively, but haven't been able to find a proof. I suspect that this is one of those infuriating questions where the answer is everywhere, but it's impossible to conjure the right search terms. If anybody could provide a proof, it would be much appreciated.

Thanks,