# Thread: Calculate p-value for function

1. ## Calculate p-value for function

I have some data which I theorise to be uniformly distributed on the set:
$\displaystyle \{-10,-9,\ldots,-2,-1,1,2,\ldots,9,10\}$

i.e. a 5% chance of choosing any of the non-zero integers in [-10,10]. If I have a given, known set of observations from this random variable, let's say:

$\displaystyle \{1, 5, -6, -3, 7, 2, 3, 3, -6, -10, -2, -5, -7, 1, 9, 10, 8\}$

Given that each observation is independent, how do I calculate a p-value for the hypothesis that this data was drawn from the given distribution?

2. In reply to bumcheekcity:
If you had a larger sample, a $\displaystyle \chi^2$ test on the frequency of each symbol in your sample would be appropriate. But you have a very small sample here: 17 observations spread over 20 possible values.

In this case you need to frame a specific question, something like: what is the probability that, in a sample of this size, the most frequent symbol occurs 2 or fewer times?

CB
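
For anyone wanting to put a number on the question framed above, here is a minimal Python sketch. It assumes 20 equally likely values and independent draws, and it computes the exact probability that no value appears more than a chosen number of times. The function name `prob_max_count_at_most` is made up for illustration and is not from the thread. The counting uses the coefficient of $\displaystyle x^n$ in $\displaystyle (1 + x + x^2/2!)^{k}$, multiplied by $\displaystyle n!/k^n$.

```python
from fractions import Fraction
from math import factorial


def prob_max_count_at_most(n_obs, n_values, max_count):
    """Exact P(no value occurs more than max_count times) when n_obs
    independent draws are made uniformly from n_values distinct values."""
    # Truncated exponential series for a single value: sum of x^j / j!
    single = [Fraction(1, factorial(j)) for j in range(max_count + 1)]

    # Multiply n_values copies of the series, dropping terms above x^n_obs.
    poly = [Fraction(1)]
    for _ in range(n_values):
        new = [Fraction(0)] * min(len(poly) + max_count, n_obs + 1)
        for i, a in enumerate(poly):
            for j, b in enumerate(single):
                if i + j <= n_obs:
                    new[i + j] += a * b
        poly = new

    # Coefficient of x^n_obs (zero if the product never reaches that degree).
    coeff = poly[n_obs] if n_obs < len(poly) else Fraction(0)
    return coeff * factorial(n_obs) / Fraction(n_values) ** n_obs


# 17 observations, 20 possible values, most frequent value seen at most twice
p = prob_max_count_at_most(17, 20, 2)
print(float(p))
```

Using `Fraction` keeps the arithmetic exact; converting to `float` only at the end avoids rounding inside the polynomial product.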

3. In reply to CaptainBlack:
This isn't a problem on a question sheet; it's a real-world problem, so I could continue to gather an arbitrarily large amount of data. How much data would I need to gather?

I read on Wikipedia about Pearson's test, which seems to be suitable. I recall doing this 6 months ago and can dig out my notes if this would be appropriate.

4. In reply to bumcheekcity:
Something like 500 would probably be OK (the idea is to get most, if not all, of the observed frequencies over 5; with 20 equally likely values, 500 observations give an expected count of 25 per value).

(Pearson's test is the $\displaystyle \chi^2$ test)

CB
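
As a rough illustration of the test discussed above, the sketch below computes Pearson's $\displaystyle \chi^2$ statistic for the 17 observations from the first post. Because the sample is too small for the usual $\displaystyle \chi^2$ approximation, it estimates the p-value by simulating samples from the hypothesised uniform distribution instead of using $\displaystyle \chi^2$ tables. That simulation step is a substitution, not something proposed in the thread. The function names, the number of simulations and the seed are all assumptions chosen for the example.

```python
import random
from collections import Counter

# The 20 non-zero integers in [-10, 10], each assumed to have probability 1/20.
VALUES = [v for v in range(-10, 11) if v != 0]


def sum_squared_counts(sample, values=VALUES):
    """Sum of squared frequencies; an integer, so comparisons are exact."""
    counts = Counter(sample)
    return sum(counts[v] ** 2 for v in values)


def pearson_statistic(sample, values=VALUES):
    """Pearson's chi-squared statistic against a uniform distribution.
    With equal expected counts n/k it simplifies to (k/n) * sum(c^2) - n."""
    n, k = len(sample), len(values)
    return k / n * sum_squared_counts(sample, values) - n


def simulated_p_value(sample, values=VALUES, n_sim=10_000, seed=0):
    """Estimate P(statistic >= observed) by drawing uniform samples."""
    rng = random.Random(seed)
    observed = sum_squared_counts(sample, values)
    hits = 0
    for _ in range(n_sim):
        fake = rng.choices(values, k=len(sample))
        # The statistic is increasing in the sum of squares, so compare that.
        if sum_squared_counts(fake, values) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_sim + 1)


data = [1, 5, -6, -3, 7, 2, 3, 3, -6, -10, -2, -5, -7, 1, 9, 10, 8]
print(pearson_statistic(data))   # about 10.06
print(simulated_p_value(data))
```

With a sample of around 500, the statistic could instead be compared with a $\displaystyle \chi^2$ distribution with 19 degrees of freedom, which is the standard form of Pearson's test.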