I don't have access to the literature I used when taking a math stat course a couple of years ago so I thought I'd ask here. It's a fairly simple problem I reckon, but I obviously suck.
Let's say I have a random sample of 2000 individuals (meant to represent a large population, like the population of a country) who have either shoplifted or not shoplifted at some point in their lives. By asking each of these 2000 honest individuals I discover that 100 of them have shoplifted. From the 100 shoplifters I then take a random sample of 30 and check their hair colour. 25 of those 30 are red-haired.
Obviously I can make a confidence interval for the proportion of red-haired people among the 100 who have shoplifted, but can I say something about the proportion of red-haired people who have shoplifted with regard to the sample of 2000, or the population of the country in general? What would a confidence interval look like for that problem?
Any help appreciated.
I would take 2 samples, one of people who have shoplifted and one of people who have not. Make it n=30 for each. Then look at the number of redheads in each group.
After this you can set up a confidence interval for the difference in proportions and test the hypothesis that the proportion of redheads is different between the two groups.
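To make that concrete, here is a rough sketch of the interval for the difference in proportions, using the usual normal approximation (Wald interval). The 25 out of 30 comes from the question; the count for the non-shoplifter group (3 out of 30) is made up purely for illustration, and the function name `diff_proportion_ci` is just something I picked. With n=30 per group the approximation is fairly crude, so treat it as a starting point rather than the final word.

```python
import math

def diff_proportion_ci(x1, n1, x2, n2, z=1.96):
    """Wald (normal-approximation) confidence interval for p1 - p2.

    x1, x2: number of redheads in each sample
    n1, n2: sample sizes
    z:      normal quantile (1.96 gives roughly 95% coverage)
    """
    p1 = x1 / n1
    p2 = x2 / n2
    # Standard error of the difference of two independent sample proportions
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# 25 of 30 shoplifters are red-haired (from the question).
# 3 of 30 non-shoplifters is a HYPOTHETICAL count, for illustration only.
low, high = diff_proportion_ci(25, 30, 3, 30)
print(f"Approx. 95% CI for the difference: ({low:.3f}, {high:.3f})")
```

If the interval does not contain 0, the difference between the two groups is significant at roughly the 5% level.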