
Math Help - How to measure the accuracy of a probability prediction?

  1. #1
    Newbie
    Joined
    Nov 2010
    Posts
    2

    How to measure the accuracy of a probability prediction?

    Hello all,

    This is my first post on this web site and I would really appreciate some help as I do not know where to begin with my query.

    I have a large list of customers, each of whom was assigned a 'probability' score 12 months ago, based on the likelihood of them renewing their contract with their supplier. I now know which of those customers renewed and which didn't.

    For example, my data reads:
    ID, Customer, Probability of renewal, Contract renewed (1 = yes, 0 = no)
    1, customer 1, 97.4%, 1
    2, customer 2, 94.1%, 0
    etc.
    10000, customer 10000, 5.4%, 1


    Some customers with a high probability of renewal did renew, but others didn't. Conversely, some customers with a low probability of renewal did renew, whilst others didn't.

    Overall, historical data shows that 20% of customers renew their contracts.
    My question is: how can I calculate whether the overall predictive scores are better or worse than if the predictive rates had been chosen at random?

    Many thanks

    Rob
    Follow Math Help Forum on Facebook and Google+

  2. #2
    Grand Panjandrum
    Joined
    Nov 2005
    From
    someplace
    Posts
    14,972
    Thanks
    4
    I would use the bootstrap to look at the sampling distribution of:

    \displaystyle U=\sum_{customer=1}^n |p_{customer}-r_{customer}|

    where p_{customer} is the probability rating for the customer (divided by 100, as you have these as percentages), n is the number of customers, and r_{customer} is a 0-1 rv sampled from the observed renewal distribution.

    Then you can compare the actual U observed with this distribution.
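
    As a rough illustration of the procedure above, here is a minimal Python sketch. The synthetic scores and outcomes, the function names u_statistic and bootstrap_null_u, and the choice of 1000 resamples are all assumptions made for illustration; none of them come from the original data. The sketch reads "sampled from the observed renewal distribution" as drawing outcomes with replacement from the observed 0/1 renewals, ignoring which customer each one belonged to.

```python
import random


def u_statistic(p, r):
    """U: sum over customers of |predicted probability - 0/1 outcome|."""
    return sum(abs(p_i - r_i) for p_i, r_i in zip(p, r))


def bootstrap_null_u(p, renewed, n_boot=1000, seed=0):
    """Draw U repeatedly with outcomes resampled independently of the scores."""
    rng = random.Random(seed)
    n = len(renewed)
    draws = []
    for _ in range(n_boot):
        # Sample n outcomes with replacement from the observed renewals,
        # which breaks any link between a customer's score and outcome.
        r_star = rng.choices(renewed, k=n)
        draws.append(u_statistic(p, r_star))
    return draws


# Synthetic stand-in data (an assumption): 500 customers whose outcomes
# really do follow their scores.
data_rng = random.Random(1)
p = [data_rng.random() for _ in range(500)]
renewed = [1 if data_rng.random() < p_i else 0 for p_i in p]

u_obs = u_statistic(p, renewed)
null_draws = bootstrap_null_u(p, renewed)
print("observed U:", round(u_obs, 1))
print("bootstrap U range:", round(min(null_draws), 1), "to", round(max(null_draws), 1))
```

    If the scores carry real information, the observed U should sit well below the bulk of the bootstrap values; if it sits in the middle of them, the scores look no better than chance.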

    (I expect there are as many ways of doing this as there are statisticians)

    CB

  3. #3
    Newbie
    Joined
    Nov 2010
    Posts
    2
    Thank you for your reply but I'm afraid I do not understand this answer.
    My understanding of mathematics is not to your standard.

    I do not know how to read this mathematical expression.
    For example :-
    - What is a bootstrap?
    - In the expression, what is U?
    - What is rv?
    - I understand the summation symbol, but what is actually being summed?
    (i.e. What is the 'n' and the 'customer=1', above and below the summation symbol?)

    - What answer does this actually give me?

    If you are able to provide a further 'simple English' explanation, it would be very much appreciated.

    Kind regards

    Rob

  4. #4
    Grand Panjandrum
    Joined
    Nov 2005
    From
    someplace
    Posts
    14,972
    Thanks
    4
    Quote: Originally posted by Waveylines
    Thank you for your reply but I'm afraid I do not understand this answer.
    My understanding of mathematics is not to your standard.

    I do not know how to read this mathematical expression.
    For example :-
    - What is a bootstrap?
    In future, use Google, but just this once the hit you need is >>here<<

    - In the expression, what is U?
    It is an rv (random variable) which has approximately the same distribution as the test statistic you should use.

    - What is rv?
    See above

    - I understand the summation symbol, but what is actually being summed?
    (i.e. What is the 'n' and the 'customer=1', above and below the summation symbol?)
    It is the sum over all of the customers of the absolute difference between the predicted probability and the outcome for each of the customers. This will be small if the customers with a high probability overwhelmingly renew and those with low probabilities do not renew.
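
    To make the sum concrete, here is a small worked example in Python using only the three rows shown in the original post. The second set of outcomes (renewed_if_matched) is hypothetical and is included purely to show how U shrinks when outcomes agree with the scores.

```python
# The three example rows from the first post: score in percent, outcome as 0/1.
percent_scores = [97.4, 94.1, 5.4]
renewed = [1, 0, 1]

# Convert percentages to probabilities between 0 and 1.
p = [s / 100 for s in percent_scores]

# One term per customer: |predicted probability - actual outcome|.
terms = [abs(p_i - r_i) for p_i, r_i in zip(p, renewed)]
u = sum(terms)
print([round(t, 3) for t in terms])  # [0.026, 0.941, 0.946]
print(round(u, 3))                   # 1.913

# Hypothetical outcomes that agree with the scores (not from the post):
# the two high-scored customers renew and the low-scored one does not.
renewed_if_matched = [1, 1, 0]
u_matched = sum(abs(p_i - r_i) for p_i, r_i in zip(p, renewed_if_matched))
print(round(u_matched, 3))           # 0.139
```

    Customers 2 and 3 in the real rows went against their scores, so each contributes a term close to 1; in the hypothetical matched case every term is small.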

    - What answer does this actually give me?
    It gives you the basis for constructing a test. The bootstrap will give an empirical distribution for the test statistic U under the null hypothesis that there is no correlation between probability and renewals, which can be used to test the observed value of U.

    What answer does it give? Who knows; that depends on your data.
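
    One possible way to turn this into a test is sketched below in Python. Everything named here is an assumption for illustration: the function bootstrap_p_value, the synthetic data built around the 20% renewal rate from the first post, the made-up score values 0.8 and 0.1, and the add-one correction in the p-value. The test is one-sided, since a small U is what good predictions would produce.

```python
import random


def u_statistic(p, r):
    """U: sum over customers of |predicted probability - 0/1 outcome|."""
    return sum(abs(p_i - r_i) for p_i, r_i in zip(p, r))


def bootstrap_p_value(p, renewed, n_boot=1000, seed=0):
    """One-sided p-value: how often a null U is as small as the observed U."""
    rng = random.Random(seed)
    u_obs = u_statistic(p, renewed)
    n = len(renewed)
    as_small = 0
    for _ in range(n_boot):
        # Null hypothesis: outcomes are unrelated to scores, so resample
        # the observed outcomes with replacement, ignoring the customer.
        r_star = rng.choices(renewed, k=n)
        if u_statistic(p, r_star) <= u_obs:
            as_small += 1
    # Add-one correction (an assumed convention) keeps the estimate above 0.
    return (as_small + 1) / (n_boot + 1)


# Synthetic outcomes (an assumption): about 20% of 500 customers renew.
data_rng = random.Random(2)
outcomes = [1 if data_rng.random() < 0.2 else 0 for _ in range(500)]

# Made-up informative scores: high for renewers, low for everyone else.
informative = [0.8 if r == 1 else 0.1 for r in outcomes]
# Made-up uninformative scores: drawn without looking at the outcomes.
uninformative = [data_rng.random() for _ in outcomes]

p_informative = bootstrap_p_value(informative, outcomes)
p_uninformative = bootstrap_p_value(uninformative, outcomes)
print("informative scores, p-value:", p_informative)
print("uninformative scores, p-value:", p_uninformative)
```

    A very small p-value says the observed U is smaller than chance alone would plausibly give, i.e. the scores are doing better than random. A p-value that is not small means the data do not show the scores beating random assignment.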

    If you are able to provide a further 'simple English' explanation, it would be very much appreciated.
    Alternatively, you could consult your notes and/or text where you may be given a potted method to test data of this sort.

    CB