Results 1 to 8 of 8

Math Help - Calculating a Weighted Average and its Variance

  1. #1
    Newbie
    Joined
    Aug 2007
    Posts
    8

    Calculating a Weighted Average and its Variance

    Hi I am not sure this constitutes an advanced query or a basic one. It has me beaten! I am trying to find a formula and so construct a computer algorithm for calculating the weighted average and variance (taking into account the weights) for a series of data. The application is a series of stock data (weighted by volume) and for computational efficiency I'd like to use an algorithm that does not need to re-iterate the whole series when a new piece of data is added to the sample.

    For a non weighted series its straight forward for each point (n) in the series the variance and mean are

    for each piece of data in the series
    n=n+1
    Mean = Sum/n
    Variance = (SumSqrs - Square(Sum)/n)/(n-1)
    end

    When new data arrives the loop can be run once more for the new data. For a weighted series I need to scale each point by

    Weight(n)/Weight(total)

    The problem I have is that each time a new piece of data is added the Total Weight increases. As far as I can see this will require each point in the series to be re-weighed using the new total weighting?

    My maths isn't good enough to do it by raw brain power (I tried). I also used a trial and error approach ...adding weighting in different places...keeping a sum of the weights etc. But while close it dose not appear to be correct.

    There are a number of clever algorithms Algorithms for calculating variance - Wikipedia, the free encyclopedia for calculating variance in no weighted series but I have not found anything for weighted?

    Any suggestions would be welcome
    Thanks in advance,
    Nick.
    Follow Math Help Forum on Facebook and Google+

  2. #2
    MHF Contributor
    Joined
    Aug 2007
    From
    USA
    Posts
    3,110
    Thanks
    2
    One thing that may help you is a small generalization. The standard or familiar algorithm for average (and its related algorithm for variance) can be seen as a Weighted Average that just happens to have all weights equivalent and equal to 1/n.

    It is a VERY interesting exercise. See if you can get it with that one hint.
    Follow Math Help Forum on Facebook and Google+

  3. #3
    Senior Member
    Joined
    Apr 2006
    Posts
    399
    Awards
    1
    Quote Originally Posted by TKHunny View Post
    One thing that may help you is a small generalization. The standard or familiar algorithm for average (and its related algorithm for variance) can be seen as a Weighted Average that just happens to have all weights equivalent and equal to 1/n.

    It is a VERY interesting exercise. See if you can get it with that one hint.
    Unfortunately this formula from the algorithms page does not work just by replacing the n and n+1 with weights. It's not clear to me how it is generalized. But see my next post for an algorithm that does not use this formula.

    Last edited by JakeD; August 27th 2007 at 08:33 PM.
    Follow Math Help Forum on Facebook and Google+

  4. #4
    Senior Member
    Joined
    Apr 2006
    Posts
    399
    Awards
    1
    Quote Originally Posted by BlowFish View Post
    Hi I am not sure this constitutes an advanced query or a basic one. It has me beaten! I am trying to find a formula and so construct a computer algorithm for calculating the weighted average and variance (taking into account the weights) for a series of data. The application is a series of stock data (weighted by volume) and for computational efficiency I'd like to use an algorithm that does not need to re-iterate the whole series when a new piece of data is added to the sample.

    For a non weighted series its straight forward for each point (n) in the series the variance and mean are

    for each piece of data in the series
    n=n+1
    Mean = Sum/n
    Variance = (SumSqrs - Square(Sum)/n)/(n-1)
    end

    When new data arrives the loop can be run once more for the new data. For a weighted series I need to scale each point by

    Weight(n)/Weight(total)

    The problem I have is that each time a new piece of data is added the Total Weight increases. As far as I can see this will require each point in the series to be re-weighed using the new total weighting?

    My maths isn't good enough to do it by raw brain power (I tried). I also used a trial and error approach ...adding weighting in different places...keeping a sum of the weights etc. But while close it dose not appear to be correct.

    There are a number of clever algorithms Algorithms for calculating variance - Wikipedia, the free encyclopedia for calculating variance in no weighted series but I have not found anything for weighted?

    Any suggestions would be welcome
    Thanks in advance,
    Nick.
    Here is an algorithm adapted from Algorithm III from the Wikipedia page. You can use it to add data points one at a time. But it is not optimized and does not minimize rounding error like Algorithm III.

    Code:
    mean_old = 0;
    mean_new = 0;
    sum = 0;
    wgt_old = 0;
    wgt_new = 0;
    foreach x in data and w in weights:
     wgt_new = wgt_old + w;
     mean_new = (wgt_old*mean_old + w*x)/wgt_new;
     sum = sum + wgt_old*mean_old^2 + w*x^2 - wgt_new*mean_new^2;
     wgt_old = wgt_new;
     mean_old = mean_new;
    end for;
    variance = sum/wgt_new;
    Follow Math Help Forum on Facebook and Google+

  5. #5
    Newbie
    Joined
    Aug 2007
    Posts
    8
    Thanks for the replies. Interesting way to think of it TKHunny, when I couldn't 'get it' from straight brain power (my maths was never particularly advanced and is very rusty). I did try to come up with a solution empirically (OK trial and error). I think even in light of your way to view it, the empirical approach would have got me there if I was going to get there.

    JakeD thanks I'll give that a try. It looks pretty similar to one of the things I came up with (trial and error). I'll give it a try, see how it looks, and report back.

    Cheers,
    Nick.
    Last edited by BlowFish; August 28th 2007 at 12:56 AM.
    Follow Math Help Forum on Facebook and Google+

  6. #6
    Newbie
    Joined
    Aug 2007
    Posts
    8
    Well coded the new algorithm and it looks great on the variance (well actually the standard deviation but it looks great).

    One last thing is confusing me. The mean looks identical to an older algorithm I was using and for the life of me I can't understand why. Essentially it looks the same as simply doing :-

    n = 1 to sample size

    totalweight = totalweight + w(n)
    mean = mean + value(n) * w(n) / totalweight

    I thought that when a new piece of data arrives all the preceding weights would need to be changed as the universe of totalweight has changed. This does not seem to be the case. The algorithm above uses the total weight up to the data point of calculation not the total weight of the universe of data.

    Anyway I'm very please with how the results are looking thanks ever so much. Please forgive my 'algorithmic' approach rather than pure maths. For me it is easier to think in this sort of fashion.

    Cheers,
    Nick.
    Follow Math Help Forum on Facebook and Google+

  7. #7
    Senior Member
    Joined
    Apr 2006
    Posts
    399
    Awards
    1
    Quote Originally Posted by JakeD View Post
    Here is an algorithm adapted from Algorithm III from the Wikipedia page. You can use it to add data points one at a time. But it is not optimized and does not minimize rounding error like Algorithm III.

    Code:
    mean_old = 0;
    mean_new = 0;
    sum = 0;
    wgt_old = 0;
    wgt_new = 0;
    foreach x in data and w in weights:
     wgt_new = wgt_old + w;
     mean_new = (wgt_old*mean_old + w*x)/wgt_new;
     sum = sum + wgt_old*mean_old^2 + w*x^2 - wgt_new*mean_new^2;
     wgt_old = wgt_new;
     mean_old = mean_new;
    end for;
    variance = sum/wgt_new;
    Quote Originally Posted by BlowFish View Post
    Well coded the new algorithm and it looks great on the variance (well actually the standard deviation but it looks great).

    One last thing is confusing me. The mean looks identical to an older algorithm I was using and for the life of me I can't understand why. Essentially it looks the same as simply doing :-

    n = 1 to sample size

    totalweight = totalweight + w(n)
    mean = mean + value(n) * w(n) / totalweight

    I thought that when a new piece of data arrives all the preceding weights would need to be changed as the universe of totalweight has changed. This does not seem to be the case. The algorithm above uses the total weight up to the data point of calculation not the total weight of the universe of data.

    Anyway I'm very please with how the results are looking thanks ever so much. Please forgive my 'algorithmic' approach rather than pure maths. For me it is easier to think in this sort of fashion.

    Cheers,
    Nick.
    The code I gave for the mean has several differences with your code: both old and new weights are used; the old mean is multiplied by the old weights; parentheses are used so that the entire sum is divided by the new weights.

    Code:
    wgt_new = wgt_old + w;
    mean_new = (wgt_old*mean_old + w*x)/wgt_new;
    I wonder how your testing could have produced a good variance using an incorrect mean when the variance calculation uses the mean.

    I glad this approach works for you.
    Follow Math Help Forum on Facebook and Google+

  8. #8
    Newbie
    Joined
    Aug 2007
    Posts
    8
    Actually its the means that look very similar. I couldn't get the variance to match using any of my methods. Yours looks spot on without normalising or any other pre-processing. I will perhaps mess around a little further but I now have what I need!

    Thanks again,
    cheers,
    Nick.
    Follow Math Help Forum on Facebook and Google+

Similar Math Help Forum Discussions

  1. Weighted average?
    Posted in the Business Math Forum
    Replies: 0
    Last Post: October 24th 2011, 03:05 AM
  2. Weighted Average
    Posted in the Statistics Forum
    Replies: 2
    Last Post: April 19th 2011, 06:25 PM
  3. Calculate Weighted Average
    Posted in the Statistics Forum
    Replies: 2
    Last Post: August 3rd 2010, 10:44 PM
  4. Weighted Average / Pop Mean
    Posted in the Advanced Statistics Forum
    Replies: 1
    Last Post: February 16th 2010, 06:03 PM
  5. weighted average
    Posted in the Business Math Forum
    Replies: 1
    Last Post: January 20th 2010, 12:02 PM

Search Tags


/mathhelpforum @mathhelpforum