Hey Falkaine.

Basically, the mean is calculated by multiplying each value by its probability and adding up the results. Notation-wise, we call this the expectation E[X].

When you take the frequency of one event and divide it by the total number of events, you get a probability, and the weighted calculation you are doing is exactly the same thing: when you divide everything by the total, each frequency turns into a probability (the division by the total normalizes the values).
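To make that concrete, here's a small sketch (the counts are made up for illustration) showing how dividing each frequency by the total turns counts into probabilities:

```python
# Hypothetical frequencies: value -> count (made-up numbers)
freqs = {1: 10, 2: 30, 3: 60}
total = sum(freqs.values())  # 100

# Normalize: each count divided by the total becomes a probability
probs = {x: count / total for x, count in freqs.items()}

print(probs)                  # each frequency is now a probability
print(sum(probs.values()))    # should be 1.0 (up to floating-point rounding)
```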

Now the variance is calculated using Var[X] = E[(X - E[X])^2] = E[X^2] - E[X]^2, where E[X^2] is calculated by multiplying each probability by the square of the value instead of the value itself.

So as an example, if we have a distribution with three values 1, 2, 3, then E[X] = 1*P(X=1) + 2*P(X=2) + 3*P(X=3). But E[X^2] is 1^2*P(X=1) + 2^2*P(X=2) + 3^2*P(X=3).
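Plugging in some made-up probabilities for those three values, the two sums look like this:

```python
# Hypothetical probabilities for the values 1, 2, 3
probs = {1: 0.1, 2: 0.3, 3: 0.6}

# E[X]: each value times its probability
EX = sum(x * p for x, p in probs.items())      # 1*0.1 + 2*0.3 + 3*0.6 = 2.5

# E[X^2]: each SQUARED value times its probability
EX2 = sum(x**2 * p for x, p in probs.items())  # 1*0.1 + 4*0.3 + 9*0.6 = 6.7
```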

Once you have the frequencies, you can calculate E[X] and E[X^2] very easily, which gives you the variance Var[X] = E[X^2] - (E[X])^2; the standard deviation is just its square root.
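Putting the whole pipeline together (again with made-up counts): frequencies become probabilities, probabilities give E[X] and E[X^2], and those give the variance and standard deviation:

```python
import math

# Hypothetical frequencies: value -> count (made-up numbers)
freqs = {1: 10, 2: 30, 3: 60}
total = sum(freqs.values())
probs = {x: count / total for x, count in freqs.items()}

EX = sum(x * p for x, p in probs.items())      # 2.5
EX2 = sum(x**2 * p for x, p in probs.items())  # 6.7

var = EX2 - EX**2      # 6.7 - 6.25 = 0.45
std = math.sqrt(var)
```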

Intuitively, think of the expectation as weighting values by their probability: a value with a higher probability gets more weight than one with a lower probability. The mean is then the one point (like the fulcrum of a see-saw) where half the weight sits to the left and half to the right, so the whole thing balances out, much like those old balance scales where you put weights on the right and the object on the left and adjust until it levels out.
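You can check the fulcrum picture numerically: at the mean, the probability-weighted deviations (the "torques" around the balance point) cancel out exactly. Using the same made-up probabilities as before:

```python
# Hypothetical probabilities for the values 1, 2, 3
probs = {1: 0.1, 2: 0.3, 3: 0.6}
EX = sum(x * p for x, p in probs.items())  # the balance point, 2.5

# Weighted deviations around the mean: these should sum to zero,
# i.e. the "see-saw" balances exactly at E[X]
torque = sum(p * (x - EX) for x, p in probs.items())
print(torque)  # 0 up to floating-point rounding
```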