
Math Help - Central Limit Theorem.

  1. #1
    Junior Member
    Joined
    Nov 2008
    Posts
    45

    Central Limit Theorem.

    Another one of those review exercise questions!

    A box contains an unknown number of white and black balls. We wish to estimate the proportion p of white balls in the box. To do so, we draw n successive balls with replacement. Let Z_n be the proportion of white balls obtained after n drawings.

    (i) Show that for all \epsilon > 0,
    \mathbb P (|Z_{n} - p| \geq \epsilon) \leq \frac{1}{4n \epsilon^2}
    (ii) Using the result in part (i), find the smallest value of n such that with probability greater than 0.95, the proportion Z_{n} in the sample will estimate p within 0.1.

    (iii) Same question as in (ii) using the central limit theorem.

    Thank you again guys! Been great help, really!

  2. #2
    Moo
    A Cute Angle
    Joined
    Mar 2008
    From
    P(I'm here)=1/3, P(I'm there)=t+1/3
    Posts
    5,618
    Thanks
    6
    Hello,
    Quote Originally Posted by panda* View Post
    Another one of those review exercise questions!

    A box contains an unknown number of white and black balls. We wish to estimate the proportion p of white balls in the box. To do so, we draw n successive balls with replacement. Let Z_n be the proportion of white balls obtained after n drawings.

    (i) Show that for all \epsilon > 0,
    \mathbb P (|Z_{n} - p| \geq \epsilon) \leq \frac{1}{4n \epsilon^2}
    Use Chebyshev's inequality.
    To compute the variance of Z_n, consider the rv X_i, which equals 1 if you get a white ball at the i-th drawing, 0 otherwise.
    You can see that the X_i are independent and that Z_n=\frac 1n \sum_{i=1}^n X_i
    So now it's easy to find the variance.
    If you can't, just ask.
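
    In case a numerical sanity check helps, here is a small Python simulation of this setup (the values of p, n and the seed are made up for illustration); once you have derived the variance of Z_n, you can compare it against the empirical variance here — you should find it matches p(1-p)/n:

    ```python
    import random

    random.seed(1)  # arbitrary seed, for reproducibility only

    def simulate_Z_n(p, n):
        """One experiment: n drawings with replacement, return the proportion Z_n."""
        # X_i = 1 if the i-th ball is white (probability p), 0 otherwise,
        # so Z_n = (1/n) * sum of the X_i.
        return sum(1 if random.random() < p else 0 for _ in range(n)) / n

    # Empirical variance of Z_n over many repetitions, versus p(1-p)/n.
    p, n, reps = 0.3, 50, 20000
    samples = [simulate_Z_n(p, n) for _ in range(reps)]
    mean = sum(samples) / reps
    emp_var = sum((z - mean) ** 2 for z in samples) / reps
    print(emp_var, p * (1 - p) / n)  # the two numbers should be close
    ```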

    (ii) Using the result in part (i), find the smallest value of n such that with probability greater than 0.95, the proportion Z_{n} in the sample will estimate p within 0.1.
    Take the complement of the above inequality. It may help you see where you're going :
    \mathbb{P}(|Z_n-p|<\epsilon)\geq 1-\frac{1}{4n\epsilon^2}

    "will estimate p within 0.1" means that we let \epsilon=0.1
    Then, find the smallest n such that 1-\frac{1}{4n\epsilon^2}\geq 0.95, that is \frac{1}{4n\epsilon^2}\leq 0.05
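
    To spell out the arithmetic (just a sanity check in Python, nothing deep — \epsilon=0.1 as above):

    ```python
    import math

    eps = 0.1
    # We need 1/(4*n*eps^2) <= 0.05, i.e. n >= 1/(4 * 0.05 * eps^2).
    n_min = math.ceil(1 / (4 * 0.05 * eps**2))
    print(n_min)  # 500
    ```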

    (iii) Same question as in (ii) using the central limit theorem.
    What does the central limit theorem say ?
    It should be easy with the way I defined Z_n, shouldn't it ?

    Thank you again guys! Been great help, really!
    Guys, guys... What about cows ???

    Glad you appreciate it though.

  3. #3
    Junior Member
    Joined
    Nov 2008
    Posts
    45
    Hello Moo! I tried the question with your advice and this is what I got so far! I am not sure if I am on the right track, but correct me if I am wrong! (:

    For part (i)

    According to Chebyshev's inequality, if X has mean \mu and variance \sigma^2, then,
    \mathbb P (|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}
    From the question,
    \mathbb P(|\mathbb Z_n - p| \geq \epsilon) \leq \frac{1}{4n \epsilon^2}

    \Rightarrow \mu = p
    k = 2 \sqrt n \epsilon
    \sigma = \frac{1}{2\sqrt n}

    \sigma^2 = E[(Z_n - p)^2]
    = \int^\infty_{-\infty} (z - p)^2 f(z) dz
    \geq \int_{|z - p| \geq \epsilon} (z - p)^2 f(z) dz
    \geq (2 \sqrt n \epsilon)^2 \sigma^2 \int_{|z - p| \geq \epsilon} f(z) dz, since (2\sqrt n \epsilon)^2 \sigma^2 = \epsilon^2
    = 4n\epsilon^2 \sigma^2 \int_{|z - p| \geq \epsilon} f(z) dz
    = 4n \epsilon^2 \sigma^2 \mathbb P(|Z_n - p| \geq \epsilon)

    By dividing both sides by 4n\epsilon^2\sigma^2,
    \Rightarrow \mathbb P(|Z_n - p| \geq \epsilon) \leq \frac{1}{4n\epsilon^2}
    For part (ii),
    Taking the complement,
    \mathbb P(|Z_n - p| \leq \epsilon) \geq 1 - \frac{1}{4n\epsilon^2}
    \Rightarrow 1 - \frac{1}{4n\epsilon^2} \geq 0.95
    \Rightarrow \frac{1}{4n\epsilon^2} \leq 0.05
    \Rightarrow 4n\epsilon^2 \geq 20
    By estimating p within 0.1 means we let \epsilon = 0.01,
    \Rightarrow 4n(0.01)^2 \geq 20
    \Rightarrow 4n \geq 200000
    \Rightarrow n \geq 50000
    \Rightarrow n = 50001.
    Am I doing alright so far?

    Hm, Central Limit Theorem says ...

    Let X_1, X_2, ... be independent, identically distributed random variables with E(X_i) = \mu and V(X_i) = \sigma^2, and let S_n = X_1 + X_2 + ... + X_n. Then,
    Z_n = \frac {S_n - n\mu}{\sigma \sqrt n} \rightarrow N(0,1), as n \rightarrow \infty.
    Not really sure how to proceed from here!

  4. #4
    Moo
    A Cute Angle
    Joined
    Mar 2008
    From
    P(I'm here)=1/3, P(I'm there)=t+1/3
    Posts
    5,618
    Thanks
    6
    Hi !
    Quote Originally Posted by panda* View Post
    Hello Moo! I tried the question with your advice and this is what I got so far! I am not sure if I am on the right track, but correct me if I am wrong! (:

    For part (i)

    According to the Chebyshev's inequality, if X has mean \mu and variance, \sigma^2, then,
    \mathbb P (|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}
    From the question,
    \mathbb P(|\mathbb Z_n - p| \geq \epsilon) \leq \frac{1}{4n \epsilon^2}

    \Rightarrow \mu = p
    k = 2 \sqrt n \epsilon
    \sigma = \frac{2}{2\sqrt n}
    Okay, I had some problems with this part, but while quoting, I saw what you meant. You can't put spaces in the LaTeX.

    \sigma^2 = E[(z_n - p)^2]
    = \int^\infty_{-\infty} (z_n - p)^2 f(x) dx
    \geq \int_{|z_n - p| \geq \epsilon} (z_n - p)^2 f(x) dx
    \geq (2 \sqrt n \epsilon)^2(\sigma^2) \int_{|z_n - p| \geq \epsilon} f(x) dx
    \geq 4n\epsilon^2(\sigma^2) \int_{|z_n - p| \geq \epsilon} f(x)
    \geq 2n \epsilon^2 (\sigma^2) \mathbb P(Z_n - p) \geq \epsilon)

    By dividing both sides with \sigma^2,
    \Rightarrow \mathbb P(|Z_n - p| \geq \epsilon) \leq \frac{1}{4n\epsilon^2}
    But in these two previous quotes, I have to say that I don't agree... Or maybe I just didn't understand what you did (especially for the integrals)

    You want to prove that it's \leq \frac{1}{4n\epsilon^2}, you're not asked to identify every element of it o.O

    Anyway, use this version of Chebyshev's inequality :

    \mathbb{P}(|Z_n-\mathbb{E}(Z_n)|>\epsilon)\leq \frac{\text{Var}(Z_n)}{\epsilon^2}

    \mathbb{E}(Z_n)=\mathbb{E}\left(\frac 1n\sum_{i=1}^n X_i\right)=\frac 1n \sum_{i=1}^n \mathbb{E}(X_i)

    But as I wrote before (or at least what I meant), the X_i follow a Bernoulli distribution, with parameter p.

    This means that \sum_{i=1}^n\mathbb{E}(X_i)=n\mathbb{E}(X_1)=np

    So indeed, \mathbb{E}(Z_n)=p

    From Chebyshev's inequality, we can say that \mathbb{P}(|Z_n-p|>\epsilon)\leq \frac{\text{Var}(Z_n)}{\epsilon^2}

    Now, what is \text{Var}(Z_n) ?

    \text{Var}(Z_n)=\text{Var}\left(\frac 1n\sum_{i=1}^n X_i\right)=\frac{1}{n^2}\text{Var}\left(\sum_{i=1}^n X_i\right)

    Since the X_i are independent and identically distributed, we have :
    \text{Var}(Z_n)=\frac{1}{n^2}\cdot\left(n \text{Var}(X_1)\right)=\frac{pq}{n}

    where q=1-p. Because the variance of a Bernoulli distribution is pq.

    Now, note that \forall x\in[0,1] ~,~ x(1-x)\leq \frac 14

    From here, the inequality you're looking for just appears !
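
    A tiny numerical check of these last two steps, if it helps (the values of n, \epsilon and the test values of p are arbitrary):

    ```python
    # x(1-x) <= 1/4 on [0,1], so Var(Z_n) = p(1-p)/n <= 1/(4n).
    xs = [i / 1000 for i in range(1001)]
    assert max(x * (1 - x) for x in xs) <= 0.25

    n, eps = 100, 0.1
    target_bound = 1 / (4 * n * eps**2)
    for p in (0.1, 0.5, 0.9):
        chebyshev_bound = (p * (1 - p) / n) / eps**2  # Var(Z_n)/eps^2
        assert chebyshev_bound <= target_bound + 1e-12
    print("ok")
    ```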


    For part (ii),
    Taking the complement,
    \mathbb P(|Z_n - p| \leq \epsilon) \geq 1 - \frac{1}{4n\epsilon^2}
    \Rightarrow 1 - \frac{1}{4n\epsilon^2} \geq 0.95
    \Rightarrow \frac{1}{4n\epsilon^2} \leq 0.05
    \Rightarrow 4n\epsilon^2 \geq 20
    By estimating p within 0.1 means we let \epsilon = 0.01,
    \Rightarrow 4n(0.01)^2 \geq 20
    \Rightarrow 4n \geq 200000
    \Rightarrow n \geq 50000
    \Rightarrow n = 50001.
    Am I doing alright so far?
    Yes, except that it should be \epsilon=0.1, isn't it ? With \epsilon=0.1, the condition 4n\epsilon^2\geq 20 becomes 4n(0.1)^2\geq 20, that is n\geq 500.
    And when you get n\geq 500, you can take n=500 itself, there is no need to add 1.


    Hm, Central Limit Theorem says ...

    Let X_1, X_2, ... be independent, identically distributed random variables with E(X_i) = \mu and V(X_i) = \sigma^2, and let S_n = X_1 + X_2 + ... + X_n. Then,
    Z_n = \frac {S_n - n\mu}{\sigma \sqrt n} \rightarrow N(0,1), as n \rightarrow \infty.
    Not really sure how to proceed from here!
    You can see that our Z_n is the average \frac{S_n}{n} of the X_i, and that the X_i satisfy all the required conditions.

    Thus \frac{X_1+\dots+X_n-np}{\sigma\sqrt{n}}=\sqrt{n}\cdot\frac{Z_n-p}{\sigma} converges to the std normal distribution. (in distribution)

    But this means that the cumulative distribution function of \sqrt{n}\cdot\frac{Z_n-p}{\sigma} converges to the cumulative distribution function of the std normal distribution.

    So \mathbb{P}\left(\sqrt{n}\cdot\frac{Z_n-p}{\sigma} \in[a,b]\right)\xrightarrow[]{n\to\infty} \int_a^b \frac{1}{\sqrt{2\pi}}\cdot e^{-t^2/2} ~dt

    Does that help you ? Have you ever heard of confidence intervals ?


    Sorry it's a bit long...

  5. #5
    Junior Member
    Joined
    Nov 2008
    Posts
    45
    It's okay Moo! Long is good, it just means it's more detailed. So part (iii) of this question is just asking me to show whatever you have shown me above? Is that the solution they are asking for? Thank you again though!

  6. #6
    Moo
    A Cute Angle
    Joined
    Mar 2008
    From
    P(I'm here)=1/3, P(I'm there)=t+1/3
    Posts
    5,618
    Thanks
    6
    Quote Originally Posted by panda* View Post
    It's okay Moo! Long is good, it just means its more detailed. So part (iii) of this question is just asking me to show whatever you have showed me above? Is that the solution that they are asking more? Thank you again though!
    Hmm I don't get your question.. ?
    Sorry, I'm a bit slow sometimes

    For part iii), use the fact that for a large n, \mathbb{P}\left(\left|\sqrt{n}\cdot\frac{Z_n-p}{\sigma}\right|\geq\epsilon\right) \approx 1-\int_{-\epsilon}^\epsilon \frac{1}{\sqrt{2\pi}}\cdot e^{-t^2/2} ~dt

    (use a z-table)
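
    In case it helps, here is how that computation might look in Python — a sketch only, using math.erf for the standard normal cdf and the worst-case bound \sigma=\sqrt{p(1-p)}\leq \frac 12 (since p is unknown); the 0.95 and \epsilon=0.1 come from the question:

    ```python
    import math

    def Phi(x):
        """Standard normal cdf, written with the error function."""
        return 0.5 * (1 + math.erf(x / math.sqrt(2)))

    eps, confidence = 0.1, 0.95
    sigma = 0.5  # worst case: sqrt(p*(1-p)) <= 1/2 for p in [0,1]

    # CLT: P(|Z_n - p| < eps) is approximately 2*Phi(eps*sqrt(n)/sigma) - 1.
    # Find the smallest n for which this reaches the required confidence.
    n = 1
    while 2 * Phi(eps * math.sqrt(n) / sigma) - 1 < confidence:
        n += 1
    print(n)  # 97 -- the CLT needs far fewer drawings than Chebyshev's bound
    ```

    Equivalently, from a z-table you need \epsilon\sqrt{n}/\sigma\geq 1.96, i.e. n\geq (1.96\cdot 0.5/0.1)^2\approx 96.04, so n=97.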

  7. #7
    Junior Member
    Joined
    Nov 2008
    Posts
    45
    Oh! So we just have to evaluate the integral, substituting \epsilon = 0.1 for the bounds?

    Anyway, the integral proof I used earlier to prove the inequality was from my lecture notes, for a continuous random variable where \sigma^2 = E[(X-\mu)^2]. Is it better to use the way you suggested, or would the way I used work too?

  8. #8
    Junior Member
    Joined
    Nov 2008
    Posts
    45
    Regarding your proof, I tried it out, and there are a couple of places where I don't understand how we got there.

    For example, how did you get

    \mathbb E(Z_n) = \mathbb E\left(\frac{1}{n} \sum^n_{i=1} X_i\right) and \text{Var}(Z_n) = \frac{1}{n^2}\left(n \cdot \text{Var}(X_1)\right)?

    Also, how does \forall x \in [0,1] ~,~ x(1-x) \leq \frac{1}{4} relate to the final step?

    And I still don't know how to do part (iii)!

    Thank you for your time again!
