# Thread: Prediction interval for normal distribution

1. ## Prediction interval for normal distribution

My textbook only has formulas for prediction intervals of t-distributed variables. If I have a binomial distribution, with an estimated p, how can I create a prediction interval that with 95% probability will contain the number of successes of the next 10000 trials? Thanks for any help with this!

2. With large n you need the CLT.
You need to tell me the TWO sample sizes.
You will take the difference between the two p-hats.
The mean is zero, since p - p = 0,
and you just sum the two variances.
I've never seen this done before, but it's just like the usual prediction situation.

3. All right, so n = 10000. I've estimated p to be between 0.0015 and 0.0035. And what I want to do is find a 95% confidence interval for the number of successes in the next sample of 10000 trials. Thanks for helping out.

4. Ok, let's get this straight.
A confidence interval is for estimating a parameter.
A prediction interval is for predicting a future random variable.

5. That is correct and I meant a prediction interval!

6. Well, I know I'm right,
but you need to be clear on what you want.
The idea is the same as for predicting a new observation.
You pivot on the difference between the two p-hats and then apply the CLT.

$\displaystyle \hat P_n - \hat P_m \approx Normal$

where $\displaystyle E\bigl(\hat P_n - \hat P_m\bigr)=p-p=0$
since I assume we are sampling from the same underlying binomial distribution.
Otherwise this doesn't make sense.

And $\displaystyle V\bigl(\hat P_n- \hat P_m\bigr)=V\bigl(\hat P_n\bigr)+V\bigl(\hat P_m\bigr)$

$\displaystyle = {p(1-p)\over n}+{p(1-p)\over m}$

But here we have some options. Since we don't know p, we need to use the p-hats, but one
has been observed while the other hasn't.
Using Slutsky's theorem, I'm sure we can use the known one, say $\displaystyle \hat P_n$

So $\displaystyle {\hat P_n- \hat P_m\over\sqrt{\hat P_n(1- \hat P_n)\biggl({1\over n}+{1\over m}\biggr)}}\approx N(0,1)$

Now pivot and solve for $\displaystyle \hat P_m$, then multiply by m to obtain the interval for the number of successes in the second, future sample.
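
Written out, and assuming the 95% level (so the standard normal cutoffs are roughly ±1.96), the pivot step would look something like this:

$\displaystyle -1.96 < {\hat P_n- \hat P_m\over\sqrt{\hat P_n(1- \hat P_n)\biggl({1\over n}+{1\over m}\biggr)}} < 1.96$

which rearranges to

$\displaystyle \hat P_n - 1.96\sqrt{\hat P_n(1- \hat P_n)\biggl({1\over n}+{1\over m}\biggr)} < \hat P_m < \hat P_n + 1.96\sqrt{\hat P_n(1- \hat P_n)\biggl({1\over n}+{1\over m}\biggr)}$

Multiplying each term by m then gives approximate bounds on the count $\displaystyle m\hat P_m$.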

7. All right, so I use p = 0.0025 as my p. But I'm not sure which formula to put it in. I know this will be approximately normal, so a 95% interval would mean a value within about two standard deviations of the mean. If I knew the standard deviation I could just write

25 - 2 standard deviations < number of successes < 25 + 2 standard deviations

I might be way off, but if I'm on to something, how do I find 1 standard deviation here?

8. You pivot and solve for the second p-hat.
You want 95 percent, so you place that mess I created between -1.96 and 1.96 and solve for $\displaystyle \hat P_m$.
Look in your book at where they pivot and solve for the parameter or future observation, as I am doing here.
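
If it helps to see numbers, here is a minimal Python sketch of that pivot. It is only an illustration under assumptions: the function name `binomial_prediction_interval` is made up for this post, and I am assuming the estimate 0.0025 came from a first sample of n = 10000 trials (the thread never states the first sample size separately), with m = 10000 future trials.

```python
import math
from statistics import NormalDist


def binomial_prediction_interval(p_hat, n, m, level=0.95):
    """Approximate CLT-based prediction interval for the number of
    successes in m future trials, given p_hat observed from n past trials.

    Hypothetical helper for illustration; it plugs the observed p_hat
    into the variance p(1-p)(1/n + 1/m), as in the pivot above.
    """
    # Two-sided standard normal cutoff, about 1.96 for level=0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Estimated standard deviation of (P_hat_n - P_hat_m).
    se = math.sqrt(p_hat * (1 - p_hat) * (1 / n + 1 / m))
    # Solve the pivot for P_hat_m, then scale by m to get a count.
    lower = m * (p_hat - z * se)
    upper = m * (p_hat + z * se)
    # A count cannot fall outside [0, m], so clip the bounds.
    return max(0.0, lower), min(float(m), upper)


# Assumed inputs: p_hat = 0.0025, n = 10000 (assumption), m = 10000.
low, high = binomial_prediction_interval(p_hat=0.0025, n=10_000, m=10_000)
print(f"{low:.1f} to {high:.1f} successes")
```

With those assumed inputs the interval comes out at roughly 11 to 39 successes, centered on 25. Keep in mind the expected count is small, so the normal approximation is fairly rough here, and a different first sample size n would change the width.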