The structure looks the same to me, but they are different: the subscript in f_i implies that a different value may be subtracted from each data point. When expanding the sums we have:
Hi
In some of my statistics books, they use two versions of the standard deviation for a data set {x} with N measurements. The first version is
where mu is the mean of {x}.
The second version is
where f_i takes the place of the mean, so it is a function (this is what it says in my book). They use this version to estimate the standard deviation of {x} when it is not known beforehand. Unfortunately, my book is not very explicit about:
1) What f really is
2) What the difference is between the "normal" way of writing the standard deviation (top one) and the lower one. I thought that in the lower one the data points do not come from one distribution but rather each have their own, whereas in the top formula all data come from one single distribution. But I am not sure.
I hope someone will help me by shedding light on these two questions.
Best,
Niles.
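For readers following along without the formulas, here is a sketch of the two versions being compared. It assumes the common 1/N normalisation; the book may well use 1/(N-1) or another factor, which the post does not say.

```latex
% Version 1: every data point is compared with the same mean \mu
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2}

% Version 2: each data point x_i is compared with its own value f_i
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - f_i)^2}
```

If f_i = mu for every i, version 2 reduces to version 1, so the only difference is that the subtracted value is allowed to change from point to point.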
OK, now I understand what you are saying.
For example, in the second equation x_i came from a distribution with mean f_i. So we would be finding the standard deviation of a set of points from different distributions (or even different variables).
I need to consider this more, but my initial thought is: is equation 2 then really finding a standard deviation at all?
My question exactly! The context in which it is presented in my book is: We look at a problem Mp = d, where d is our data, p our parameters and M our "observation matrix" (as mentioned here: Inverse problem - Wikipedia, the free encyclopedia) -- basically an overdetermined problem.
The book talks about an example where we have a vector of data d whose standard deviation we don't know. We then try to estimate it, and the estimate is
which has the form of version #2, and this is where my question came from: How can they estimate a standard deviation like this?
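One way to see how an estimate of this form can work is to try it numerically. The sketch below is only an illustration under assumptions that are not from the book: a straight-line model for M, Gaussian noise with the same standard deviation on every data point, a hypothetical helper name `estimate_sigma`, and a normalisation of N minus the number of parameters. Here f_i is taken to be the i-th entry of M p_hat, the data predicted by the least-squares fit.

```python
import numpy as np


def estimate_sigma(M, d):
    """Estimate the noise standard deviation of d from least-squares residuals.

    Hypothetical helper: f_i = (M @ p_hat)_i plays the role of a
    per-point mean, as in version #2.
    """
    # Least-squares solution of the overdetermined problem M p = d
    p_hat, *_ = np.linalg.lstsq(M, d, rcond=None)
    f = M @ p_hat                  # predicted data, one value per point
    residuals = d - f
    n, k = M.shape                 # n data points, k parameters
    # Assumed normalisation: n - k degrees of freedom
    sigma_hat = np.sqrt(np.sum(residuals**2) / (n - k))
    return sigma_hat, p_hat


# Synthetic data: a straight line plus noise of known standard deviation
rng = np.random.default_rng(0)
n = 2000
t = np.linspace(0.0, 1.0, n)
M = np.column_stack([np.ones(n), t])   # observation matrix
p_true = np.array([1.0, 3.0])
sigma_true = 0.5
d = M @ p_true + rng.normal(0.0, sigma_true, size=n)

sigma_hat, p_hat = estimate_sigma(M, d)
naive = np.std(d)                      # version #1 applied directly to d

print("estimated sigma (version 2):", sigma_hat)
print("naive std of d (version 1): ", naive)
```

In this setup the naive standard deviation of d mixes the noise with the trend of the line, while subtracting f_i removes the trend first, so what remains is the scatter of the noise alone. That is the sense in which version #2 is still a standard deviation: each point has its own mean f_i, but all points are assumed to share the same spread.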