1. ## Standard Deviation

How would I work out the standard deviation and the RMSD of the following data (values X with frequencies f)?

X 3 4 5 6 7 8 9
f 2 5 8 14 9 4 3

2. Have you checked this out as a start?

Standard deviation at Wikipedia

3. Yeah, I did, but I still don't understand. Can someone show me how to work out the standard deviation with those numbers, please?

4. Originally Posted by Yppolitia
Yeah, I did, but I still don't understand. Can someone show me how to work out the standard deviation with those numbers, please?
$\{ x_i \}$ is your data set and N is the number of data points.
First find the mean (average) value, $\bar{x}$.
Then calculate $x_i - \bar{x}$ for each data point.
Then calculate $(x_i - \bar{x})^2$ for each data point.
Now add all the $(x_i - \bar{x})^2$ together to get $\sum_{i=1}^N (x_i - \bar{x})^2$.
Now divide this by N to get $\sigma ^2 = \frac{1}{N} \sum_{i=1}^N (x_i - \bar{x})^2$. ( $\sigma ^2$ is called the "variance.")
Now take the square root. $\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^N (x_i - \bar{x})^2}$. This is your standard deviation.
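
A minimal sketch of the steps above, applied to the table from the first post. It assumes that `f` is the frequency of each value `X`, so N is the total of the frequencies (45) and every squared deviation is counted `f` times. The function name `frequency_std` is made up for illustration.

```python
import math

# Data from the first post: each value x is assumed to occur f times.
values = [3, 4, 5, 6, 7, 8, 9]
freqs = [2, 5, 8, 14, 9, 4, 3]


def frequency_std(values, freqs):
    """Population (divide-by-N) standard deviation of a frequency table."""
    n = sum(freqs)  # N = total number of data points
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    # Sum of squared deviations; each (x - mean)^2 is counted f times.
    ss = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs))
    variance = ss / n
    return mean, variance, math.sqrt(variance)


mean, variance, sigma = frequency_std(values, freqs)
print(f"N = {sum(freqs)}, mean = {mean:.4f}, "
      f"variance = {variance:.4f}, sigma = {sigma:.4f}")
```

Under that reading of the table, the mean is about 6.04 and the standard deviation about 1.475. If RMSD is taken to mean the root-mean-square deviation from the mean, it is the same number.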

-Dan

5. I think there is also a version where you divide by N-1 instead, probably because if you have only one value, the standard deviation would become 0 if you divided by 1, while it really should be undefined (as with division by 0).

There is also an alternative way of calculating the standard deviation where you calculate the average value "at the same time" (though it really does the same thing):

$\{ x_i \}$ is your data set and n is the number of data points.
Calculate the sum $s$ of the values: $s\ =\ \sum_{i\ =\ 1}^n x_i$
The average value $a$ is obtained by the fraction $s/n$
Calculate the sum $S$ of the squares of the values: $S\ =\ \sum_{i\ =\ 1}^n x_i^2$
Calculate what the sum of the squares would have been ( $S_1$) if every value $x_i$ had been the same as the average value: $S_1/n\ =\ a^2\ \Rightarrow\ S_1\ =\ (s/n)^2 \cdot n\ =\ s^2/n$
Now let's see how much is missing to obtain the actual sum of the squares: $S - S_1$
Now if we divide the difference by n we'll get the variance: $\sigma^2\ =\ \frac{S - S_1}{n}\ =\ \frac{S - s^2/n}{n}\ =\ \frac{S}{n} - \frac{s^2}{n^2}\ =\ \frac{\displaystyle{\sum_{i\ =\ 1}^n x_i^2}}{n} - \left(\frac{\displaystyle{\sum_{i\ =\ 1}^n x_i}}{n}\right)^2$
So, the standard deviation $\sigma\ =\ \sqrt{\frac{\displaystyle{\sum_{i\ =\ 1}^n x_i^2}}{n} - \left(\frac{\displaystyle{\sum_{i\ =\ 1}^n x_i}}{n}\right)^2}$
And if you instead want to divide by n-1: $\sigma\ =\ \sqrt{\frac{\displaystyle{\sum_{i\ =\ 1}^n x_i^2}}{n-1} - \frac{\left(\displaystyle{\sum_{i\ =\ 1}^n x_i}\right)^2}{n \cdot (n-1)}}$

This method is useful if you have already calculated the standard deviation of a set of numbers this way and want to add a new number to the set and quickly calculate the new standard deviation. Just be sure to store $s$ and $S$ (along with $n$) so you can quickly update them when a new number is added.
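
A rough sketch of that bookkeeping: keep $n$, $s$ and $S$, and update them whenever a value arrives. The class name `RunningStd` and the `count` parameter are invented for this example; the data is the table from the first post, read as value/frequency pairs.

```python
import math


class RunningStd:
    """Stores n, s (sum) and S (sum of squares) so values can be added cheaply."""

    def __init__(self):
        self.n = 0
        self.s = 0.0
        self.S = 0.0

    def add(self, x, count=1):
        # Adding a value that occurs `count` times only touches the three totals.
        self.n += count
        self.s += count * x
        self.S += count * x * x

    def std(self, sample=False):
        # sample=False divides by n, sample=True divides by n - 1.
        divisor = self.n - 1 if sample else self.n
        if divisor <= 0:
            raise ValueError("not enough data points")
        variance = (self.S - self.s ** 2 / self.n) / divisor
        # Rounding can make the difference slightly negative; clamp it at 0.
        return math.sqrt(max(variance, 0.0))


stats = RunningStd()
for x, f in zip([3, 4, 5, 6, 7, 8, 9], [2, 5, 8, 14, 9, 4, 3]):
    stats.add(x, f)

pop = stats.std()              # divide by n
samp = stats.std(sample=True)  # divide by n - 1

stats.add(10)                  # a new number arrives; no need to revisit old data
updated = stats.std()
print(pop, samp, updated)
```

One caveat: $S$ and $s^2/n$ can be large and nearly equal, so their difference may lose precision in floating point. For data with a large mean and a small spread, an incremental scheme such as Welford's algorithm is usually considered safer.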

/Kristofer

By the way, what does "-Dan" mean?

6. Originally Posted by TriKri
I think there is also a version where you divide by N-1 instead, probably because if you have only one value, the standard deviation would become 0 if you divided by 1, while it really should be undefined (as with division by 0).

There is also an alternative way of calculating the standard deviation where you calculate the average value "at the same time" (though it really does the same thing):

$\{ x_i \}$ is your data set and n is the number of data points.
Calculate the sum $s$ of the values: $s\ =\ \sum_{i\ =\ 1}^n x_i$
The average value $a$ is obtained by the fraction $s/n$
Calculate the sum $S$ of the squares of the values: $S\ =\ \sum_{i\ =\ 1}^n x_i^2$
Calculate what the sum of the squares would have been ( $S_1$) if every value $x_i$ had been the same as the average value: $S_1/n\ =\ a^2\ \Rightarrow\ S_1\ =\ (s/n)^2 \cdot n\ =\ s^2/n$
Now let's see how much is missing to obtain the actual sum of the squares: $S - S_1$
Now if we divide the difference by n we'll get the variance squared: $\sigma^2\ =\ \frac{S - S_1}{n}\ =\ \frac{S - s^2/n}{n}\ =\ \frac{S}{n} - \frac{s^2}{n^2}\ =\ \frac{\displaystyle{\sum_{i\ =\ 1}^n x_i^2}}{n} - \left(\frac{\displaystyle{\sum_{i\ =\ 1}^n x_i}}{n}\right)^2$
So, the variance $\sigma\ =\ \sqrt{\frac{\displaystyle{\sum_{i\ =\ 1}^n x_i^2}}{n} - \left(\frac{\displaystyle{\sum_{i\ =\ 1}^n x_i}}{n}\right)^2}$
$\sigma^2$ is the variance, and $\sigma$ is the standard deviation.

RonL

7. Originally Posted by TriKri
I think there is also a version where you divide by N-1 instead, probably because if you have only one value, the standard deviation would become 0 if you divided by 1, while it really should be undefined (as with division by 0).
The "N-1" method gives you an unbiased estimate of the population variance
from a sample. If you have the full distribution (the whole population), then
it's the "N" form that should be used.
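
To see how much the two forms differ for the numbers in the first post, here is a small sketch using Python's standard `statistics` module, where `pstdev` divides by N and `stdev` divides by N-1. It assumes the table is a frequency table and expands it into a plain list of 45 values.

```python
import statistics

# Expand the (assumed) frequency table into a plain list of 45 values.
values = [3, 4, 5, 6, 7, 8, 9]
freqs = [2, 5, 8, 14, 9, 4, 3]
data = [x for x, f in zip(values, freqs) for _ in range(f)]

population_sd = statistics.pstdev(data)  # divides by N
sample_sd = statistics.stdev(data)       # divides by N - 1

print(f"divide by N:     {population_sd:.4f}")
print(f"divide by N - 1: {sample_sd:.4f}")
```

The N-1 result is always the slightly larger of the two, and the gap shrinks as N grows.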

RonL

8. Originally Posted by TriKri
By the way, what does "-Dan" mean?
Either I'm subtracting the variable $D \cdot a \cdot n$ from something, or that would be my name.

-Dan

9. Originally Posted by topsquark
Either I'm subtracting the variable $D \cdot a \cdot n$ from something, or that would be my name.

-Dan
I know how I'm betting on this one

RonL

10. Originally Posted by topsquark
Either I'm subtracting the variable $D \cdot a \cdot n$ from something, or that would be my name.

-Dan
Okay! For a while I was a little unsure, but now I know.

-Dan

11. Originally Posted by CaptainBlack
$\sigma^2$ is the variance, and $\sigma$ is the standard deviation.

RonL
Thanks! Fixed.