# Math Help - Hypothesis Testing: Neyman Pearson Lemma

1. ## Hypothesis Testing: Neyman Pearson Lemma

Suppose that Y1,Y2,...,Yn are iid Poisson(lambda)
We want to test Ho:lambda=lambda_o vs Ha:lambda=lambda_a (where lambda_a>lambda_o)
Using the Neyman Pearson Lemma, find the most powerful test for alpha=0.05
(Hint: (Y1+Y2+...+Yn) ~ Poisson(n*lambda) )
=============================================

So we need to solve P([L(lambda_o)/L(lambda_a)]<k | Ho is true) = alpha
But I am having some trouble finding the statistic.
L(lambda_o)/L(lambda_a)=(lambda_o)^(y1+...+yn)/(lambda_a)^(y1+...+yn)
How can I proceed from here? I am stuck...
Thanks for any help!

2. Originally Posted by kingwinner
How can I proceed from here? I am stuck...
Thanks for any help!
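Put $Z=Y_1+\dots+Y_n$, then:

$L(\lambda_0)/L(\lambda_a)= (\lambda_0)^Z/(\lambda_a)^Z$

(strictly, the ratio also carries the factor $e^{-n(\lambda_0-\lambda_a)}$; it is a positive constant that does not depend on the data, so it can be absorbed into $k$), and so:

$
P([L(\lambda_0)/L(\lambda_a)]<k \ |H_0) = P((\lambda_0)^Z/(\lambda_a)^Z<k \ |H_0)
$

The condition $(\lambda_0)^Z/(\lambda_a)^Z<k$ can be rewritten:

$Z \ln(\lambda_0/\lambda_a) < \ln(k)$

and as $\lambda_0<\lambda_a$ we have $\ln(\lambda_0/\lambda_a)<0$, so dividing by it reverses the inequality, which becomes:

$
Z > \ln(k)/[\ln(\lambda_0/\lambda_a)]
$

and so:

$
P([L(\lambda_0)/L(\lambda_a)]<k \ |H_0) = P(Z > \ln(k)/[\ln(\lambda_0/\lambda_a)]\ |H_0)
$

and you know the distribution of $Z$ given $H_0$.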

CB
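
The last step can be checked numerically. Below is a minimal sketch, assuming the hypothetical values n = 10 and lambda_o = 1 (neither is given in the thread) and using only the Python standard library. The function names are made up for illustration. It searches for the smallest integer c with P(Z >= c | Ho) <= alpha, where Z ~ Poisson(n*lambda_o).

```python
import math

def poisson_pmf(j, mu):
    """P(Z = j) for Z ~ Poisson(mu), evaluated in log space for stability."""
    return math.exp(-mu + j * math.log(mu) - math.lgamma(j + 1))

def poisson_upper_tail(c, mu):
    """P(Z >= c) = 1 - P(Z <= c - 1), which is a finite sum."""
    return 1.0 - sum(poisson_pmf(j, mu) for j in range(c))

def critical_value(n, lam0, alpha=0.05):
    """Smallest integer c with P(Z >= c | H0) <= alpha, Z ~ Poisson(n * lam0)."""
    mu = n * lam0
    c = 0
    while poisson_upper_tail(c, mu) > alpha:
        c += 1
    return c, poisson_upper_tail(c, mu)

# Hypothetical inputs, chosen only to make the example concrete.
c, size = critical_value(n=10, lam0=1.0, alpha=0.05)
print("reject H0 when Z >=", c, "; attained size:", round(size, 4))
```

Because Z is discrete, the attained size is generally a little below alpha rather than exactly equal to it.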

3. Originally Posted by CaptainBlack
and you know the distribution of $Z$ given $H_0$.

CB
The distribution of Z given H_o is Poisson(n*lambda_o).
But how can we express the final answer? I can't think of a nice way of calculating it...there are infinitely many terms.

Also, I checked a couple of sources, and the form of the lemma is slightly inconsistent:
(i) P([L(lambda_o)/L(lambda_a)] < k | Ho is true) = alpha
(ii) P([L(lambda_o)/L(lambda_a)] ≤ k | Ho is true) = alpha
Which one is correct? I believe the answer to this is going to affect our answer above, since the random variable is Poisson, which is DISCRETE.

Thanks a lot!

4. Originally Posted by kingwinner
Also, I checked a couple of sources, and the form of the lemma is slightly inconsistent:
(i) P([L(lambda_o)/L(lambda_a)] < k | Ho is true) = alpha
(ii) P([L(lambda_o)/L(lambda_a)] ≤ k | Ho is true) = alpha
Which one is correct? I believe the answer to this is going to affect our answer above, since the random variable is Poisson, which is DISCRETE.

Thanks a lot!
With continuous distributions it makes no difference, but here we have a discrete distribution, so there may be a slight difference, though probably not one worth worrying about. Also, most of the calculations will be identical, with at most a minor difference at the last step.

CB
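
To see how large that "slight difference" can be, here is a small sketch under assumed numbers (n*lambda_o = 10 and a threshold of 15, neither from the thread). Since L(lambda_o)/L(lambda_a) < k corresponds to Z > t and L(lambda_o)/L(lambda_a) ≤ k corresponds to Z >= t, the two forms differ by exactly P(Z = t | Ho) when t happens to be an integer, and not at all otherwise.

```python
import math

def poisson_pmf(j, mu):
    """P(Z = j) for Z ~ Poisson(mu)."""
    return math.exp(-mu + j * math.log(mu) - math.lgamma(j + 1))

def upper_tail(c, mu, strict):
    """P(Z > c) if strict is True, else P(Z >= c), for an integer c."""
    last_excluded = c if strict else c - 1
    return 1.0 - sum(poisson_pmf(j, mu) for j in range(last_excluded + 1))

mu = 10.0  # hypothetical value of n * lambda_o
c = 15     # hypothetical integer threshold

p_strict = upper_tail(c, mu, strict=True)   # P(Z > 15 | H0)
p_weak = upper_tail(c, mu, strict=False)    # P(Z >= 15 | H0)

print("P(Z >  c):", p_strict)
print("P(Z >= c):", p_weak)
print("difference, equal to P(Z = c):", p_weak - p_strict)
```

With these assumed numbers the two probabilities fall on opposite sides of 0.05, so the choice between < and ≤ decides whether Z = 15 belongs to the rejection region.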

5. Originally Posted by kingwinner
The distribution of Z given H_o is Poisson(n*lambda_o).
But how can we express the final answer? I can't think of a nice way of calculating it...there are infinitely many terms.
As n is an unknown constant it will appear as a parameter in the answer, which you will presumably have to express in terms of the cumulative Poisson distribution.

CB
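
The "infinitely many terms" disappear once the tail is written as a complement, which is what expressing it through the cumulative Poisson distribution amounts to. As a sketch, write $t=\ln(k)/[\ln(\lambda_0/\lambda_a)]$ (a shorthand not used in the thread) and assume $t \ge 0$; then:

$
P(Z > t \ |H_0) = 1 - P(Z \le \lfloor t \rfloor \ |H_0) = 1 - \sum_{j=0}^{\lfloor t \rfloor} \frac{e^{-n\lambda_0}(n\lambda_0)^j}{j!}
$

The sum on the right has only $\lfloor t \rfloor + 1$ terms, with $n$ and $\lambda_0$ left as parameters.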

6. Originally Posted by CaptainBlack
With continuous distributions it makes no difference, but here we have a discrete distribution, so there may be a slight difference, though probably not one worth worrying about. Also, most of the calculations will be identical, with at most a minor difference at the last step.

CB
But I suppose there would be a "more correct" form of the lemma that works for both continuous and discrete cases. My textbook doesn't give a proof about this lemma, so I can't possibly see from the actual proof which form is the "correct" one. But if we work out the proof, which result would we have arrived at, (i) or (ii)?
[I know this might be a minor detail, but mathematically < and ≤ just do not seem to be the same thing to me]

7. Originally Posted by CaptainBlack
As n is an unknown constant it will appear as a parameter in the answer, which you will presumably have to express in terms of the cumulative Poisson distribution.

CB
Is ln(k)/[ln(lambda_o/lambda_a)] an integer or not? I can think of 3 possible forms of the final answer, depending on the value of ln(k)/[ln(lambda_o/lambda_a)] :
(a) P(Z=ln(k)/[ln(lambda_o/lambda_a)])+P(Z=ln(k)/[ln(lambda_o/lambda_a)] +1)+P(Z=ln(k)/[ln(lambda_o/lambda_a)] +2)+...
(b) P(Z=ln(k)/[ln(lambda_o/lambda_a)] +1)+P(Z=ln(k)/[ln(lambda_o/lambda_a)] +2)+P(Z=ln(k)/[ln(lambda_o/lambda_a)] +3)+...
(c) P(Z=ln(k)/[ln(lambda_o/lambda_a)] +x)+P(Z=ln(k)/[ln(lambda_o/lambda_a)] +x+1)+P(Z=ln(k)/[ln(lambda_o/lambda_a)] +x+2)+... where 0<x<1 and it brings us to the smallest integer that is greater than ln(k)/[ln(lambda_o/lambda_a)]
Since we don't know the value of ln(k)/[ln(lambda_o/lambda_a)], I believe that we don't know whether it's an integer. In such a situation, should we choose (a), (b), or (c) as our final answer? Or is there a neater way to express it?

Thanks!

8. Originally Posted by kingwinner
Is ln(k)/[ln(lambda_o/lambda_a)] an integer or not? I can think of 3 possible forms of the final answer, depending on the value of ln(k)/[ln(lambda_o/lambda_a)] :
(a) P(Z=ln(k)/[ln(lambda_o/lambda_a)])+P(Z=ln(k)/[ln(lambda_o/lambda_a)] +1)+P(Z=ln(k)/[ln(lambda_o/lambda_a)] +2)+...
(b) P(Z=ln(k)/[ln(lambda_o/lambda_a)] +1)+P(Z=ln(k)/[ln(lambda_o/lambda_a)] +2)+P(Z=ln(k)/[ln(lambda_o/lambda_a)] +3)+...
(c) P(Z=ln(k)/[ln(lambda_o/lambda_a)] +x)+P(Z=ln(k)/[ln(lambda_o/lambda_a)] +x+1)+P(Z=ln(k)/[ln(lambda_o/lambda_a)] +x+2)+... where 0<x<1 and it brings us to the smallest integer that is greater than ln(k)/[ln(lambda_o/lambda_a)]
Since we don't know the value of ln(k)/[ln(lambda_o/lambda_a)], I believe that we don't know whether it's an integer. In such a situation, should we choose (a), (b), or (c) as our final answer? Or is there a neater way to express it?

Thanks!
It is very unlikely to be an integer, so treat it as though it is not.

CB
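
One possible neater way to write the three cases, sketched here with hypothetical helper names: let t = ln(k)/[ln(lambda_o/lambda_a)]. For the strict form Z > t, the first value in the rejection region is floor(t) + 1 whether or not t is an integer, which covers (b) and (c). For the non-strict form Z >= t it is ceil(t), which covers (a) and (c).

```python
import math

def threshold(k, lam0, lam_a):
    """t = ln(k) / ln(lam0 / lam_a); assumes k > 0 and 0 < lam0 < lam_a."""
    return math.log(k) / math.log(lam0 / lam_a)

def first_rejected_strict(t):
    """Smallest integer z with z > t (cases (b) and (c))."""
    return math.floor(t) + 1

def first_rejected_weak(t):
    """Smallest integer z with z >= t (cases (a) and (c))."""
    return math.ceil(t)

# Illustrative values of t, not taken from the thread.
for t in (15.0, 15.3):
    print(t, first_rejected_strict(t), first_rejected_weak(t))
```

The two rules disagree only when t is an integer. In floating-point arithmetic a t that is an integer in theory may come out slightly off, so a value computed from threshold() should be checked with some tolerance before relying on that case.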

9. This is a related question:

In Neyman-Pearson lemma, for a fixed alpha, it gives the test with the highest power (most powerful test).

How about the "Likelihood Ratio Test"? (which is seemingly a generalization of the N-P lemma) What does it give? Does it also give the test with the highest power? Given any alpha, there are an infinite number of decision rules. What's so special about the decision rule given by the Likelihood Ratio Test?

Thank you for explaining!