I think I have an excellent answer to this question.
One thing that used to bother me was something called the "quadratic mean".
As you know, the "arithmetic mean" is computed as follows:
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}.$$
Notice that if the terms are all equal, say $x_1 = x_2 = \cdots = x_n = x$, then the arithmetic mean is $\frac{nx}{n} = x$: the same value. As we expect, because nothing changes.
Now there are many different ways to measure means; the arithmetic mean is a popular one. Another one is called the "quadratic mean":
$$\sqrt{\frac{x_1^2 + x_2^2 + \cdots + x_n^2}{n}}.$$
Look at what we are doing: we square all the numbers, add them together, divide by how many numbers we had, and finally take the square root to undo the squares.
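As a quick sanity check, those steps can be sketched in a few lines of Python (the helper name `quadratic_mean` is mine, just for illustration):

```python
import math

def quadratic_mean(xs):
    """Square each number, average the squares, then take the square root."""
    return math.sqrt(sum(x * x for x in xs) / len(xs))

print(quadratic_mean([3, 3, 3]))  # equal inputs -> 3.0, as a mean should behave
print(quadratic_mean([1, 7]))     # sqrt((1 + 49) / 2) = 5.0
```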
But you might ask, why not do this:
$$\frac{\sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}}{n}$$
It also makes sense! You first square, then add, then extract the square root, and finally divide by the number of terms.
There is one main problem with the latter. Remember I said that for the arithmetic mean, if all the numbers are the same, then the result is that same number? The same holds for the quadratic mean. Here is a proof below.
Proof: If all the terms are non-negative and equal, say $x_1 = \cdots = x_n = x \ge 0$, then the quadratic mean is that same value. Simple:
$$\sqrt{\frac{x^2 + x^2 + \cdots + x^2}{n}} = \sqrt{\frac{nx^2}{n}} = \sqrt{x^2} = x.$$
Exactly how we want it: if there is no change in the terms, then the mean should remain the same.
Now look what happens if we use your version of the quadratic mean on the same equal terms:
$$\frac{\sqrt{x^2 + x^2 + \cdots + x^2}}{n} = \frac{\sqrt{nx^2}}{n} = \frac{x\sqrt{n}}{n} = \frac{x}{\sqrt{n}}.$$
Is that equal to $x$? No! Not unless $n = 1$ or $x = 0$.
My point is that if we use your version of the quadratic mean on numbers that are all the same, we end up with a smaller number (whenever $n > 1$ and $x > 0$). That does not make much sense, because the mean should remain the same when all the numbers are unchanged.
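To see the shrinkage numerically, here is a small sketch comparing the two placements of $n$ (the function names are mine, chosen only to contrast the two versions):

```python
import math

def quadratic_mean(xs):
    """n inside the radical: the usual quadratic mean."""
    return math.sqrt(sum(x * x for x in xs) / len(xs))

def divide_outside(xs):
    """n outside the radical: the alternative version."""
    return math.sqrt(sum(x * x for x in xs)) / len(xs)

xs = [5, 5, 5, 5]            # four equal numbers, n = 4
print(quadratic_mean(xs))    # 5.0 -- unchanged, as a mean should be
print(divide_outside(xs))    # sqrt(100) / 4 = 2.5 -- shrunk by a factor of sqrt(n)
```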
This is how I remember the quadratic mean formula: the $n$ goes inside the radical, not outside.
Same thing with the standard deviation. It looks almost like the quadratic mean; in fact, the population standard deviation is just the quadratic mean of the deviations from the arithmetic mean:
$$\sigma = \sqrt{\frac{(x_1 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n}},$$
so once again the $n$ goes inside the radical.
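To connect the two ideas, here is one last sketch that computes the population standard deviation by feeding the deviations into the quadratic mean (again, the helper names are mine):

```python
import math

def quadratic_mean(xs):
    """Quadratic mean: n inside the radical."""
    return math.sqrt(sum(x * x for x in xs) / len(xs))

def population_std(xs):
    """Standard deviation = quadratic mean of deviations from the mean."""
    mean = sum(xs) / len(xs)
    return quadratic_mean([x - mean for x in xs])

data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean is 5
print(population_std(data))       # 2.0 for this data set
```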