How to measure the accuracy of a probability prediction?
This is my first post on this web site and I would really appreciate some help as I do not know where to begin with my query.
I have a large list of customers, each of whom was assigned a 'probability' score 12 months ago, based on the likelihood of them renewing their contract with their supplier. I now know which of those customers renewed and which didn't.
For example, my data reads:
ID, Customer, Probability of renewal (1), Contract renewed
1, customer 1, 97.4%, 1
2, customer 2, 94.1%, 0
...
10000, customer 10000, 5.4%, 1
Some customers with a high probability of renewal did renew but others didn't. Conversely, some customers with a low probability of renewal did renew, whilst others didn't.
Overall, historical data shows that 20% of customers would renew their contracts.
My question is: how can I calculate whether the overall predictive scores are better or worse than if the predicted probabilities had been chosen at random?
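To make the comparison concrete, here is a minimal sketch of one way such a check could look. It is not a definitive method. It assumes the Brier score (the mean squared difference between each probability and the 0/1 outcome) as the accuracy measure, a constant 20% guess as one "no skill" baseline, and a permutation test in which the same scores are randomly reassigned to customers as the "chosen at random" comparison. The function names and the ten-row toy data are hypothetical and only illustrate the layout of my data.

```python
import random

def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome.
    Lower is better; 0.0 would be a perfect set of predictions."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

def permutation_p_value(probs, outcomes, n_shuffles=2000, seed=0):
    """Share of random reassignments of the scores to customers that score
    at least as well as the real assignment (small value = unlikely by chance)."""
    rng = random.Random(seed)
    observed = brier_score(probs, outcomes)
    shuffled = list(probs)
    at_least_as_good = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if brier_score(shuffled, outcomes) <= observed:
            at_least_as_good += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_good + 1) / (n_shuffles + 1)

# Hypothetical toy data (not my real file): probabilities as fractions, 1 = renewed.
probs = [0.974, 0.941, 0.80, 0.60, 0.30, 0.20, 0.10, 0.054, 0.05, 0.02]
outcomes = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]

BASE_RATE = 0.20  # historical renewal rate quoted above

model_score = brier_score(probs, outcomes)
baseline_score = brier_score([BASE_RATE] * len(outcomes), outcomes)

print("Brier score of the predictions:", model_score)
print("Brier score of a constant 20% guess:", baseline_score)
print("Permutation p-value:", permutation_p_value(probs, outcomes))
```

Under these assumptions, predictions that beat the constant 20% guess and have a small permutation p-value would be doing better than scores handed out at random. Other measures, such as log loss or the area under the ROC curve, could be swapped in for the Brier score.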