# Thread: Probability of developing a lung cancer disease

1. ## Probability of developing a lung cancer disease

I am currently taking a Statistics course at university and here's the problem I need to solve.

A cohort study is done to assess the impact of smoking on lung cancer. In this study, a large representative sample of first-year university students is followed for 40 years. For each student, the smoking status is recorded as well as the lung cancer status at the end of the follow-up period. In the cohort, 20% of students smoked and among those, 50% developed lung cancer. Among the remaining students, 6.25% developed lung cancer.

a) Compute the probability of developing a cancer.
b) Is developing lung cancer independent of smoking? Why?
c) Using Bayes' rule, compute the conditional probability of smoking given the development of a lung cancer.

I don't know how to solve it... does anybody have an idea?
Thank you

2. ## Re: Probability of developing a lung cancer disease

Originally Posted by aptremblay
A cohort study is done to assess the impact of smoking on lung cancer. In this study, a large representative sample of first-year university students is followed for 40 years. For each student, the smoking status is recorded as well as the lung cancer status at the end of the follow-up period. In the cohort, 20% of students smoked and among those, 50% developed lung cancer. Among the remaining students, 6.25% developed lung cancer.
a) Compute the probability of developing a cancer.
b) Is developing lung cancer independent of smoking? Why?
c) Using Bayes' rule, compute the conditional probability of smoking given the development of a lung cancer.

Notation: $C$ is "has cancer" and $C^c$ is its complement, "no cancer"; $S$ is "smokes" and $S^c$ is "does not smoke".

a) $\mathscr{P}(C)=\mathscr{P}(C\cap S)+\mathscr{P}(C\cap S^c)=\mathscr{P}(C|S)\mathscr{P}(S)+\mathscr{P}(C| S^c)\mathscr{P}(S^c)$

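b) $\mathscr{P}(C)~=?$, $\mathscr{P}(S)~=?$, and $\mathscr{P}(C\cap S)~=?$ So which is true: $\mathscr{P}(C\cap S)=\mathscr{P}(C)\mathscr{P}(S)$ or $\mathscr{P}(C\cap S)\ne\mathscr{P}(C)\mathscr{P}(S)$?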

c) $\mathscr{P}(S|C)=\dfrac{\mathscr{P}(S\cap C)}{\mathscr{P}(C)}=\dfrac{\mathscr{P}(C|S)\mathscr{P}(S)}{\mathscr{P}(C)}$
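The three formulas above can be checked numerically. The following is only a minimal sketch: the variable names are illustrative and not from the thread, and the inputs are the percentages given in the problem statement. Exact fractions are used so that the independence test is not affected by floating-point rounding.

```python
from fractions import Fraction

# Figures taken from the problem statement (variable names are illustrative).
p_s = Fraction(20, 100)                  # P(S): a student smokes
p_c_given_s = Fraction(50, 100)          # P(C | S)
p_c_given_not_s = Fraction(625, 10000)   # P(C | S^c) = 6.25%

# a) Law of total probability: P(C) = P(C|S)P(S) + P(C|S^c)P(S^c).
p_c = p_c_given_s * p_s + p_c_given_not_s * (1 - p_s)

# b) Independence would require P(C ∩ S) == P(C) * P(S).
p_c_and_s = p_c_given_s * p_s
independent = p_c_and_s == p_c * p_s

# c) Bayes' rule: P(S|C) = P(C|S)P(S) / P(C).
p_s_given_c = p_c_and_s / p_c

print(p_c, independent, p_s_given_c)  # prints: 3/20 False 2/3
```

So $\mathscr{P}(C)=0.15$, the events are not independent because $\mathscr{P}(C\cap S)=0.10$ while $\mathscr{P}(C)\mathscr{P}(S)=0.03$, and $\mathscr{P}(S|C)=2/3$.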

3. ## Re: Probability of developing a lung cancer disease

Do you not know even the 'basics' of probability? If not, where did you get this question? Suppose the total number of people involved was N. "20% of students smoked and among those, 50% developed lung cancer." So .2N smoked and .5 of them, .5(.2N)= .10N developed lung cancer. "Among the remaining students, 6.25% developed lung cancer." The "remaining" students is N- .8N= 0.2N and (.0625)(0.2N)= 0.0125N developed lung cancer. So out of N students, a total of .2N+ 0.0125N= 0.2125N developed cancer. The probability of developing lung cancer, whether one smokes or not is $\frac{0.2125N}{N}= 0.2125$ or 21.25%.

This study shows that if you smoke, the probability of developing lung cancer is 0.5 and if you do not it is 0.0625. What do you think it shows about the probability of developing lung cancer being independent of smoking or not?

Do you know what "Bayes rule" is? If not, look it up!
Without directly using "Bayes rule", you know, from above, that, out of 0.2125N people who developed lung cancer, 0.10N of them smoked. The probability that a student smoked, given that they got lung cancer, is $\frac{0.10N}{0.2125N}= 0.47$ or 47%.

4. ## Re: Probability of developing a lung cancer disease

Originally Posted by HallsofIvy
Suppose the total number of people involved was N. "20% of students smoked and among those, 50% developed lung cancer." So .2N smoked and .5 of them, .5(.2N)= .10N developed lung cancer. "Among the remaining students, 6.25% developed lung cancer." The "remaining" students is N- .8N= 0.2N and (.0625)(0.2N)= 0.0125N developed lung cancer. So out of N students, a total of .2N+ 0.0125N= 0.2125N developed cancer. The probability of developing lung cancer, whether one smokes or not is $\frac{0.2125N}{N}= 0.2125$ or 21.25%
I get something different?
a) \begin{align*}\mathscr{P}(C)&=\mathscr{P}(C\cap S)+\mathscr{P}(C\cap S^c) \\&=\mathscr{P}(C|S)\mathscr{P}(S)+\mathscr{P}(C|S^c)\mathscr{P}(S^c) \\&=(0.5)(0.2)+(0.0625)(0.8)\\&=0.15 \end{align*}
$\Large ~?~?~?$

5. ## Re: Probability of developing a lung cancer disease

You are right. I got myself confused (happens unfortunately often). Where I wrote "The "remaining" students is N- .8N= 0.2N" it should have been "The 'remaining students' is N- .2N= 0.8N."
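
With that correction, the counting argument from post 3 can be redone. This is a sketch using the same $N$ as above; the figures below are recomputed here rather than quoted from the thread:

\begin{align*}\text{smokers with cancer}&=(0.5)(0.2N)=0.10N \\ \text{non-smokers with cancer}&=(0.0625)(0.8N)=0.05N \\ \text{total with cancer}&=0.10N+0.05N=0.15N \\ \mathscr{P}(C)&=\frac{0.15N}{N}=0.15 \\ \mathscr{P}(S|C)&=\frac{0.10N}{0.15N}=\frac{2}{3}\approx 0.667 \end{align*}

This agrees with the 0.15 obtained in post 4, and it puts the answer to part c) at about 66.7% rather than the 47% that followed from the earlier 0.2125N total.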