# Difference between Standard Deviations?

• Sep 12th 2011, 11:59 AM
Niles_M
Hi

In some of my statistics books, they use two versions of the standard deviation for a data set {x} with N measurements. The first version is

$
\sigma^2 = \frac{1}{N}\sum_i{(x_i-\mu)^2}
$

where $\mu$ is the mean of {x}.

The second version is

$
\sigma^2 = \frac{1}{N}\sum_i{(x_i-f_i)^2}
$

where $f_i$ takes the place of the mean, so it is a function (this is what it says in my book). The book uses this version to estimate the standard deviation of {x} when it is not known beforehand. Unfortunately, my book is not very explicit about:

1) What f really is
2) What the difference is between the "normal" way of writing the standard deviation (top one) and the lower one. I thought that in the lower one the data points do not come from one distribution, but rather each have their own, whereas in the top formula all the data come from one single distribution. But I am not sure.

I hope someone will help me by shedding light on these two questions.

Best,
Niles.
• Sep 12th 2011, 01:57 PM
pickslides
Re: Difference between Standard Deviations?
The structure looks the same to me, but they are different as the subscript in $f_i$ implies the potential for different values to be subtracted. When expanding the sums we have:

$\displaystyle\sigma^2 = \frac{1}{N}\sum_i{(x_i-\mu)^2} =\frac{1}{N}((x_1-\mu)^2+(x_2-\mu)^2+(x_3-\mu)^2+\dots )$

$\displaystyle\sigma^2 = \frac{1}{N}\sum_i{(x_i-f_i)^2} = \frac{1}{N}((x_1-f_1)^2+(x_2-f_2)^2+(x_3-f_3)^2+\dots)$
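
As a quick numerical check of the two expansions, here is a minimal Python sketch. The data values and the helper name `mean_squared_deviation` are made up for illustration; nothing here comes from the book. It shows that the two formulas agree whenever every $f_i$ equals $\mu$, and generally differ otherwise.

```python
def mean_squared_deviation(x, f):
    """Return (1/N) * sum((x_i - f_i)^2) for equal-length sequences x and f."""
    if len(x) != len(f):
        raise ValueError("x and f must have the same length")
    n = len(x)
    return sum((xi - fi) ** 2 for xi, fi in zip(x, f)) / n


# Hypothetical data, chosen only for illustration.
x = [2.0, 4.0, 4.0, 6.0]
mu = sum(x) / len(x)

# Version 1: the same value mu is subtracted from every point.
var_1 = mean_squared_deviation(x, [mu] * len(x))

# Version 2: a different value f_i is subtracted from each point.
f = [1.0, 3.0, 5.0, 7.0]
var_2 = mean_squared_deviation(x, f)

print(var_1, var_2)
```

So version 1 is just version 2 with the constant choice $f_i = \mu$ for all $i$.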
• Sep 12th 2011, 09:28 PM
Niles_M
Re: Difference between Standard Deviations?
Thanks, do you know if I am correct about saying that it might be because the data in #1 comes from the same distribution, and each point in #2 has its own distribution?
• Sep 12th 2011, 09:34 PM
pickslides
Re: Difference between Standard Deviations?
I'm not sure what you are trying to say here.
• Sep 12th 2011, 09:43 PM
Niles_M
Re: Difference between Standard Deviations?
So in the first version, all the data points {x} originate from one single distribution with the same mean, whereas in #2 each data point originates from its own "unique" distribution with a unique mean?
• Sep 12th 2011, 09:50 PM
pickslides
Re: Difference between Standard Deviations?
O.K now I understand what you are saying.

For example, in the second equation $x_i$ comes from a distribution with mean $f_i$. So we would be finding the standard deviation of a set of points from different distributions (or even different variables).

I need to consider this more, but my initial thinking is - is equation 2 then really finding a standard deviation at all?
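
One possible reading, assuming each point can be written as $x_i = f_i + \epsilon_i$, where the errors $\epsilon_i$ all come from one distribution with mean 0 and variance $\sigma^2$ (this model is an assumption, not something stated in the book excerpt):

$\displaystyle x_i - f_i = \epsilon_i \quad\Rightarrow\quad \frac{1}{N}\sum_i{(x_i-f_i)^2} = \frac{1}{N}\sum_i{\epsilon_i^2} \approx \sigma^2$

Under that assumption the points have different means, but the deviations $x_i - f_i$ share a single distribution, so equation 2 is the ordinary variance formula applied to the deviations. Setting $f_i = \mu$ for all $i$ recovers equation 1.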
• Sep 12th 2011, 09:56 PM
Niles_M
Re: Difference between Standard Deviations?
Quote:

Originally Posted by pickslides
O.K now I understand what you are saying.

I need to consider this more, but my initial thinking is - is equation 2 then really finding a standard deviation at all?

My question exactly! The context in which it is presented in my book is this: we look at a problem Mp = d, where d is our data, p our parameters and M our "observation matrix" (as described in the Wikipedia article on inverse problems) -- basically an overdetermined problem.

The book talks about an example where we have a vector of data d whose standard deviation we don't know. Then we try to estimate it, and the estimate is

$
\displaystyle s^2 = \frac{1}{N}\sum_i{(d_i-(Mp)_i)^2}
$

which has the form of version #2, and this is where my question came from: How can they estimate a standard deviation like this?
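
To make the question concrete, here is a rough numerical sketch of that estimate in Python with NumPy. Everything specific in it is an assumption made for illustration: the straight-line model, the names `p_true` and `sigma_true`, and the choice of independent Gaussian noise with the same spread on every data point. The book's actual example may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical overdetermined problem: d_i = p0 + p1 * t_i + noise.
N = 200
t = np.linspace(0.0, 10.0, N)
M = np.column_stack([np.ones(N), t])  # observation matrix, shape (N, 2)
p_true = np.array([1.0, 0.5])         # assumed "true" parameters
sigma_true = 0.3                      # assumed noise standard deviation
d = M @ p_true + rng.normal(0.0, sigma_true, size=N)

# Least-squares estimate of the parameters p.
p_hat, *_ = np.linalg.lstsq(M, d, rcond=None)

# Residuals d_i - (M p)_i play the role of x_i - f_i in version #2.
residuals = d - M @ p_hat

# The 1/N form quoted from the book.
s2 = np.mean(residuals ** 2)

# Dividing by N minus the number of fitted parameters instead
# compensates for p_hat having been fitted to the same data.
s2_unbiased = np.sum(residuals ** 2) / (N - M.shape[1])

print(np.sqrt(s2), np.sqrt(s2_unbiased), sigma_true)
```

In this sketch the fitted values $(Mp)_i$ stand in for the unknown means $f_i$, and the spread of the residuals is what gets reported as the standard deviation of the data. With many more data points than parameters, the 1/N and 1/(N - 2) versions are nearly the same.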