# std dev

• April 27th 2011, 10:17 AM
clarodina
std dev
Wanted to calculate the standard deviation manually quickly on quite a few data points. The data points is similar to below

Column1 Column 2
1.2
1.3 1.3
0.8 0.8
0.5 0.5
1.3 1.3
1.6 1.6
1.1

Column 2 is the same with column 1 except there is a new data point 1.1 and the first data point 1.2 is drop off. Does not want to recalculate the data one on one. Knowing the std dev of column 1, is there a quick formula to calculate the std dev of column 2 without manually recalculate each data point one on one?
• April 27th 2011, 02:07 PM
topsquark
Quote:

Originally Posted by clarodina
Wanted to calculate the standard deviation manually quickly on quite a few data points. The data points is similar to below

Column1 Column 2
1.2
1.3 1.3
0.8 0.8
0.5 0.5
1.3 1.3
1.6 1.6
1.1

Column 2 is the same with column 1 except there is a new data point 1.1 and the first data point 1.2 is drop off. Does not want to recalculate the data one on one. Knowing the std dev of column 1, is there a quick formula to calculate the std dev of column 2 without manually recalculate each data point one on one?

To be honest I don't understand what you are doing with the numbers in the example, but two points about your question in general.

If you change or add a data point, the average changes. You need the new average. Also the value of x - x(ave) changes and that will affect the entire summation. This vastly alters the formula in the sense of doing this generally.

It is much simpler just to recalculate it.

-Dan
• April 28th 2011, 12:26 AM
clarodina
What about using algrebra to simplify the formula and substitute the new value and previous stdev value? Plus or minus and differences of the averages with current data point
• April 28th 2011, 03:27 AM
topsquark
Quote:

Originally Posted by clarodina
What about using algrebra to simplify the formula and substitute the new value and previous stdev value? Plus or minus and differences of the averages with current data point

You can, of course do that. After working it out I see that the formula is actually not that bad, but the derivation is a pain in the...lower extremities.

To be specific, let's use a data set {x_n}. We wish to take one of those out, say x_1 and add a new one, say, x_{N + 1}. I'm not going to provide the details here for the sake of how long it will take to type, but if you want them I'll do it.

The final result is
$\sigma ' = \sqrt{\sigma ^2 + \frac{1}{N}(x_{N + 1} - x_1 )^2}$

where $\sigma$ is the original standard deviation.

Mind you, your problem was to add one point and delete another. If you added a point to the data set then answer might be much more complicated.

-Dan
• April 28th 2011, 06:05 AM
clarodina
The formula does not yield the correct stdev.
N represents the number of data points
x(n+1) represents new data point
x-1 represent the data point discarded
sigma is the value previously calculated

Below has the stdev is 0.51 and adding new data point 0.2 and discard the 1.1 gives stdev 0.6 but using the formula you provided gives stdev 0.649 not 0.6

1.1
0.8
0.2
1.3
1.5

stdev 0.51

0.8
0.2
1.3
1.5
0.2

stdev 0.60
• April 28th 2011, 06:29 AM
topsquark
Quote:

Originally Posted by clarodina
The formula does not yield the correct stdev.
N represents the number of data points
x(n+1) represents new data point
x-1 represent the data point discarded
sigma is the value previously calculated

Below has the stdev is 0.51 and adding new data point 0.2 and discard the 1.1 gives stdev 0.6 but using the formula you provided gives stdev 0.649 not 0.6

1.1
0.8
0.2
1.3
1.5

stdev 0.51

0.8
0.2
1.3
1.5
0.2

stdev 0.60

You appear to be correct, my formula is not accurate. I will go over it again, but my overall opinion is still to find the standard deviation from scratch and not rely on specific formulas. (Apart from the definitions, of course!)

-Dan
• April 28th 2011, 07:37 AM
topsquark
Well at least one mistake I had made in my previous derivation was to not be careful enough with the summation limits. I have now corrected that problem. The formula is now fairly more complicated than it was. Unfortunately the new one doesn't work either.

I have spent over an hour on this problem, wasted six sheets of paper, and gave myself a considerable amount of eye strain trying to check and recheck all of the gory little details. Were this a problem assigned to a college senior for a final project I might persist. However I get the impression that you are just curious. I am not going to spend any more of my own time on this. If you would like to work on it and show your work then I (or someone else) will go over it with you.

My now more experienced opinion is the same as the one I originally gave you. There is no simple formula to do this.

-Dan
• April 28th 2011, 09:05 AM
clarodina
This is not curious but required for the work. The previous figures are given to simplify the problem. The real work have 100 data points. Currently excel is not useful on the situation.

Try to simplify the formula with alegbra but did not derive the proper one to calculate the stdev. Thus decide to submit to this forum asking expert people like you.
• April 28th 2011, 01:00 PM
awkward
clarodina,

How are you calculating the standard deviation at present? Are you using a calculator with statistical functions? Pencil and paper? Or what?
• April 29th 2011, 12:33 AM
clarodina
Using a calculator with stat function would require keying of the indiviidual data point too.
• April 29th 2011, 04:35 AM
topsquark
Quote:

Originally Posted by clarodina
The real work have 100 data points. Currently excel is not useful on the situation.

Sorry if I'm being obtuse, but all you should have to do to get a standard deviation in Excel is to specify the range. Why couldn't you simply change the range of your data to the altered data set?

-Dan
• April 29th 2011, 01:35 PM
awkward
Quote:

Originally Posted by clarodina
Using a calculator with stat function would require keying of the indiviidual data point too.

I am not making a suggestion, I am asking a question: How are you computing the standard deviation at present?