# Thread: confidence population error rate is within a range

1. ## confidence population error rate is within a range

Hello. I would really appreciate some help answering the following questions and making sure my thought process even makes sense.

Context:
- I have a population of 4219 items.
- Some number > 0 of the items are calculated incorrectly.
- 30 items were randomly selected, and all 30 were found to be correct.
- There is no reason to assume that errors would be clustered.

My questions:
- What is my current confidence that less than 1% of items contain an error, based on my sample of 30 that contained none?
- What sample size would I need to be very confident (95%) the error rate is less than 1%? 5%?
- What sample size would be considered to accurately represent the overall population error rate?

I am really looking to minimize manual checking of the data as much as possible since it takes exceedingly long to do (5 minutes per item or more).

Appreciate any help.

2. ## Re: confidence population error rate is within a range

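Assuming a uniform prior on the defect rate $p$:

$P[p<0.01 \mid \text{observe 0 defects in 30}] = \dfrac{P[\text{observe 0 defects in 30 and } p<0.01]}{P[\text{observe 0 defects in 30}]} =$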

$\displaystyle \dfrac{\int_0^{0.01}(1-p)^{30}dp}{\int_0^1(1-p)^{30}dp} =0.2677$
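
The ratio of integrals has a simple closed form, which makes the numbers easy to check by hand. As a sketch, writing $n$ for the sample size and $t$ for the threshold defect rate (both symbols are introduced here for illustration and are not in the original post):

$\displaystyle \int_0^{t}(1-p)^{n}dp = \dfrac{1-(1-t)^{n+1}}{n+1}, \qquad \int_0^{1}(1-p)^{n}dp = \dfrac{1}{n+1}$

$\displaystyle \dfrac{\int_0^{t}(1-p)^{n}dp}{\int_0^{1}(1-p)^{n}dp} = 1-(1-t)^{n+1}$

With $n=30$ and $t=0.01$ this gives $1-0.99^{31} \approx 0.2677$.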

So, given the 30 defect-free observations you made, the probability that the population defect rate is under 1% is about 27%,
and the probability that it is above 1% is about 73%.

In other words, the sample size is too small to support such a low specified defect rate.
A sample size of, say, 100 (again with no defects found) brings the probability of $p<0.01$ up to about 64%.

A sample size of 500 brings the probability of $p<0.01$ above 99%.

A sample size of 298 gets you to just above 95%.
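
For anyone who wants to try other thresholds, here is a minimal Python sketch of the calculation above. It assumes a uniform prior on the defect rate, zero defects found in the sample, and a population large enough that the finite size of 4219 can be ignored. The function names `prob_rate_below` and `min_sample_size` are made up for this example.

```python
def prob_rate_below(n: int, threshold: float) -> float:
    """P(defect rate < threshold | 0 defects in n samples).

    Assumes a uniform prior on the defect rate, so the ratio of
    integrals reduces to 1 - (1 - threshold) ** (n + 1).
    """
    return 1.0 - (1.0 - threshold) ** (n + 1)


def min_sample_size(threshold: float, confidence: float) -> int:
    """Smallest defect-free sample size n that reaches the given confidence."""
    n = 0
    # The probability increases with n, so a linear search is enough here.
    while prob_rate_below(n, threshold) < confidence:
        n += 1
    return n


if __name__ == "__main__":
    for n in (30, 100, 298, 500):
        print(n, round(prob_rate_below(n, 0.01), 4))
    print("n for <1% at 95%:", min_sample_size(0.01, 0.95))
    print("n for <5% at 95%:", min_sample_size(0.05, 0.95))
```

Under these assumptions it reproduces the 0.2677 and 298 figures above. For the 5% threshold asked about in the first post, the same formula gives a defect-free sample of 58.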

As for your third question, you would need to define "accurately".