# Math Help - Modelling a pong game

1. ## Modelling a pong game

OK, imagine a game called pong, which is played between two players as a sequence of points. The first player to win three points wins the match, so a match can have three, four or five points in total.

Suppose the same two players, A and B, play ten pong matches. We intend to model pong on the basis that every point is independent of every other point, with the probability of player A winning any particular point being p (so that the probability of player B winning any point is 1-p). We don’t know the value of p, but we would like to estimate it based on the results of the ten matches that these players have contested. Suppose we have the complete point-by-point results of these ten matches, showing which player won each point in sequence. For example, the data set might be as follows (where each row is a match and the winners of the points are read left to right):

ABBAA
BAAA
AABBA
BBB
BABB
AAA
AABA
BAAA
AABBB
AAA

Someone suggests that for each match we calculate the proportion of points won by player A in that match, and then we take the mean of these values across the ten matches as our estimate of p (let’s call our estimator p’). With the example data set, the proportion of points won by player A in the first match is 0.6, in the second is 0.75, in the third is 0.6 etc., and p’ would then be calculated as the mean of these ten values, in this case 0.61.

Someone else suggests defining p’ as the total number of points won by player A in the ten matches divided by the total number of points played in the ten matches. With the example data set, this would give a result of 24/40 = 0.6 as our estimate of p.
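To make the two suggestions concrete, here is a minimal Python sketch that computes both from the example data above. The function names `mean_of_proportions` and `pooled_proportion` are just illustrative labels, not anything from the original problem; exact fractions are used so that rounding does not blur the comparison.

```python
from fractions import Fraction

# Example data from the post: one string per match, one letter per point.
matches = ["ABBAA", "BAAA", "AABBA", "BBB", "BABB",
           "AAA", "AABA", "BAAA", "AABBB", "AAA"]

def mean_of_proportions(matches):
    """First suggestion: average the per-match proportion of points won by A."""
    props = [Fraction(m.count("A"), len(m)) for m in matches]
    return sum(props) / len(props)

def pooled_proportion(matches):
    """Second suggestion: total points won by A over total points played."""
    won = sum(m.count("A") for m in matches)
    played = sum(len(m) for m in matches)
    return Fraction(won, played)

print(float(mean_of_proportions(matches)))  # 0.61
print(float(pooled_proportion(matches)))    # 0.6
```

The two values differ because the first gives every match equal weight, while the second gives every point equal weight.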

I am wondering about these two suggestions for estimating p, and in particular I would like to determine whether either estimator is unbiased. (We say that p’ is an unbiased estimator of p if and only if, for every value of p with 0<=p<=1, the expected value of p’ is p.) Are there any alternative (and perhaps better) suggestions for estimating p based on the data? I would ideally prefer unbiased estimators to biased ones, but what would make one unbiased estimator better than another? Might a biased estimator actually be better in practical terms than an unbiased one?
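One way to approach the bias question for the first suggestion is to compute its expected value exactly. The mean of ten independent per-match proportions has the same expectation as a single match's proportion, so it is enough to enumerate the six possible final scores of one match. The sketch below does that under the stated model (independent points, A wins each with probability p); the helper names are hypothetical.

```python
from math import comb

def match_outcomes(p):
    """Yield (probability, A points, B points) for each final score
    of a first-to-three match."""
    q = 1.0 - p
    for loser in range(3):  # the loser ends with 0, 1 or 2 points
        # The winner takes the last point; the earlier points can come
        # in any order, giving comb(2 + loser, loser) sequences.
        ways = comb(2 + loser, loser)
        yield ways * p**3 * q**loser, 3, loser  # A wins 3-loser
        yield ways * q**3 * p**loser, loser, 3  # B wins loser-3

def expected_match_proportion(p):
    """Expected proportion of points won by A in a single match."""
    return sum(prob * a / (a + b) for prob, a, b in match_outcomes(p))

print(round(expected_match_proportion(0.5), 6))  # 0.5
print(round(expected_match_proportion(0.6), 6))  # 0.618912
```

At p = 0.6 the expectation is about 0.619 rather than 0.6, so under this model the mean of per-match proportions is not an unbiased estimator of p.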

Thanks!

2. ## Re: Modelling a pong game

The fact that your data is grouped into matches really has no effect on the value of p. p will have that value no matter how you arrange the rules of how a match is won via winning individual points. The minimum error estimate of p is thus going to use all of the data as a big pool and ignore the fact that it is grouped into matches.

The estimate is obviously

$\hat{p} = \frac{ \mbox{total points won by A} }{\mbox{total points played}}$

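and this is also the maximum-likelihood estimate. It is not exactly unbiased for a finite number of matches, though: each match stops as soon as one player reaches three points, so the total number of points played is itself random. The bias shrinks as more matches are pooled, so the estimate is consistent.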

The problem is much more interesting when you don't have access to individual point results but only have data on the results of matches. Then you will probably run into a biased estimator.
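As a rough illustration of that match-only setting, here is a sketch that estimates p from the number of matches won, by numerically inverting the match-win probability. It assumes the same first-to-three rules and independent points; the function names and the use of bisection are my own choices for illustration, not something prescribed by the problem.

```python
def match_win_prob(p):
    """Probability that A wins a first-to-three match when A wins
    each point independently with probability p."""
    q = 1.0 - p
    # A wins 3-0, 3-1 or 3-2, with 1, 3 and 6 possible point orders.
    return p**3 * (1 + 3 * q + 6 * q**2)

def estimate_p_from_matches(wins, played, tol=1e-12):
    """Solve match_win_prob(p) = wins / played by bisection.
    This works because match_win_prob is increasing in p on [0, 1]."""
    target = wins / played
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if match_win_prob(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# In the example data, A won 7 of the 10 matches.
print(estimate_p_from_matches(7, 10))
```

With 7 wins out of 10 this gives a value of roughly 0.61. Because the match-win probability is a nonlinear function of p, an estimate obtained this way will generally be biased.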

3. ## Re: Modelling a pong game

Brilliant! I too thought that the minimum error estimate of p includes all the data and that this consequently was the unbiased estimate.

I will therefore take 0.6 as the estimate of p, and any estimates that deviate from this will be biased.

I am now wondering whether:

1. I might be able to have some help defining p and p' mathematically, given the definitions in the above problem.
2. There are any alternative ways of estimating p based on the data. I am thinking about median-unbiased estimators and possibly a Bayesian view.
3. A biased estimator might actually be better in practical terms than an unbiased one.

4. ## Re: Modelling a pong game

I am trying to define p and p' mathematically. What confuses me is that the first half of the first definition seems to define p as the mean of the per-match means. My attempt at writing this mathematically would be p = ∑(∑x/n)/m. What, then, is p'? I can't see the relation between p and p'. It gets more confusing in the latter part of the definition, which says that p' is calculated as the mean of the means, giving p' = 0.61. It seems to me that the two definitions are somewhat jumbled up.

The same problem is repeated in the second definition, which defines p' as p' = ∑x/n but then goes on to say that p = 24/40 = 0.6. So is it p' or p that is 0.6?

I am either being quite stupid or there is some kind of trick here. Is there anyone out there with experience of using these things who knows how p and p' should be defined?
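For what it is worth, one possible way to write the definitions down is sketched here. The notation is my own assumption, not part of the original problem: there are $m$ matches, match $j$ has $n_j$ points, and $x_{ij} = 1$ if player A wins point $i$ of match $j$ and $0$ otherwise. In this reading p is the fixed but unknown parameter of the model,

$P(x_{ij} = 1) = p$

while p' is a number computed from the data. The first suggestion (mean of the per-match proportions) would be

$p'_1 = \frac{1}{m} \sum_{j=1}^{m} \left( \frac{1}{n_j} \sum_{i=1}^{n_j} x_{ij} \right)$

and the second suggestion (pooled proportion) would be

$p'_2 = \frac{\sum_{j=1}^{m} \sum_{i=1}^{n_j} x_{ij}}{\sum_{j=1}^{m} n_j}$

With the example data, $m = 10$, $p'_1 = 0.61$ and $p'_2 = 24/40 = 0.6$. Both 0.61 and 0.6 are values of p' (estimates), and neither is p itself.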