
Math Help - [SOLVED] Geometric distribution

  1. #1
    Moo
    Joined: Mar 2008
    From: P(I'm here)=1/3, P(I'm there)=t+1/3
    Posts: 5,618
    Thanks: 6

    [SOLVED] Geometric distribution

    Hi !

    I'm doing an exercise where I'm pretty stuck.

    So here it is.

    Recall that if Y follows a geometric distribution (parameter p), then \mathbb{P}(Y=k)=(1-p)^{k-1}p, for k \geqslant 1.

    Let (X_k)_{k\geqslant 1} be a sequence of independent, identically distributed random variables, each Bernoulli(p).
    For any integer n \geqslant 1, define S_n=X_1+X_2+\dots+X_n.
    For any integer r \geqslant 1, define T_r=\inf \{n ~:~ S_n=r\}.
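(Not part of the exercise: for anyone who wants to experiment with these definitions, here is a minimal Python sketch; the function name sample_T is mine.)

```python
import random

def sample_T(r, p, rng=random):
    """Sample T_r = inf{n : S_n = r}, where S_n = X_1 + ... + X_n
    and the X_i are i.i.d. Bernoulli(p)."""
    successes = 0  # running value of S_n
    n = 0          # number of trials so far
    while successes < r:
        n += 1
        successes += rng.random() < p  # X_n ~ Bernoulli(p)
    return n

random.seed(0)
print(sample_T(3, 0.5))  # one realisation of T_3
```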


    1. Prove that T_r is a random variable. I'm lost on this one... But I still have to think a little more ><

    2. This question asked to prove that T_1 follows a geometric distribution, which was easy (I explained it with words)

    3. Show that for k \geqslant r, \mathbb{P}(T_r=k)=\binom{k-1}{r-1} p^r (1-p)^{k-r}
    So this is mainly where I'm stuck...
    We are asked to use a reasoning close to the previous question's, but I don't see what that would change...

    My thought so far: the event \{T_r=k\} implies that X_k=1. But this is intuition (a claim I haven't proved yet).
    I've tried an induction on k, but I don't think it's the correct way.
    That's all I've got


    Thanks for any help (not necessarily full solutions) provided!

    EDIT: For question 3. :
    So by intuition and "reasoning", it is equivalent to proving that \mathbb{P}(T_r=k) equals the probability that X_k=1 and the sum of the first k-1 X_i equals r-1 (which is a binomial distribution)
    But how to prove it clearly ?
    Last edited by ThePerfectHacker; February 28th 2009 at 11:58 AM.

  2. #2
    Global Moderator
    Joined: Nov 2005
    From: New York City
    Posts: 10,616
    Thanks: 10
    When it comes to probability theory I am an embarrassment. I want to try an answer and help you out if I can, but I really have no idea what I am doing. The only reason I am responding is that my mathematical intuition tells me that what I wrote below is what you want. However, I really have no idea what "independent and identically distributed" means, or any of that sort of stuff. So if I end up embarrassing myself, I promise never to post in the advanced probability section again.

    Quote Originally Posted by Moo View Post
    3. Show that for k \geqslant r, \mathbb{P}(T_r=k)=\binom{k-1}{r-1} p^r (1-p)^{k-r}
    So this is mainly where I'm stuck...
    It is asked to show a close reasoning. I don't know what it would change...
    I think this is referred to as the negative binomial.

    Among independent trials you want to compute the probability that after k trials (where k\geq r, and r is a fixed number) you have got exactly r successes. Each trial succeeds with probability p. Therefore, you want to show P(T_r = k) = {{k-1}\choose {r-1}}p^r (1-p)^{k-r}.

    In order for the r-th success to happen at the k-th trial, there must have been r-1 successes in the first k-1 trials. Ah, but that is the standard binomial distribution, and it contributes {{k-1}\choose {r-1}}p^{r-1}(1-p)^{k-r}. But you also want a success on the k-th trial; by independence, you multiply this expression by p. And this gives you exactly what you want to prove.
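A quick Monte Carlo sanity check of this formula (my own Python sketch, not part of the thread):

```python
import random
from math import comb

def sample_T(r, p):
    """First trial index at which the r-th Bernoulli(p) success occurs."""
    n = successes = 0
    while successes < r:
        n += 1
        successes += random.random() < p
    return n

r, p, k, trials = 3, 0.4, 7, 200_000
random.seed(1)
empirical = sum(sample_T(r, p) == k for _ in range(trials)) / trials
exact = comb(k - 1, r - 1) * p**r * (1 - p)**(k - r)
print(empirical, exact)  # the two numbers should agree to about 2 decimals
```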

  3. #3
    MHF Contributor
    Joined: Aug 2008
    From: Paris, France
    Posts: 1,174
    Quote Originally Posted by Moo View Post
    Let (X_k)_{k\geqslant 1} a sequence of rv independent and identically distributed : Bernoulli (p)
    For any integer n \geqslant 1, we define S_n=X_1+X_2+\dots+X_n
    And for any integer r \geqslant 1, we define T_r=\inf \{n ~:~ S_n=r\}


    1. Prove that T_r is a random variable. I'm lost on this one... But I still have to think a little more ><
    Feel free to think a little more before you read the following:


    " T_r is a random variable" means that for any measurable subset A of \mathbb{N}^*, the subset \{T_r\in A\} of \{0,1\}^{\mathbb{N}^*} is measurable.
    The countable set \mathbb{N}^* is endowed (as always) with the discrete \sigma-algebra, hence it suffices to consider the subsets \{T_r=n\} for n\in\mathbb{N}^* (for any A\subset\mathbb{N}^*, the set \{T_r\in A\} is a countable union of sets of the form \{T_r=n\} and therefore is measurable).
    The \sigma-algebra on \{0,1\}^{\mathbb{N}^*} is the product \sigma-algebra. It is generated by cylinder subsets, i.e. subsets defined by conditions on a finite number of coordinates. Since \{T_r=n\} depends only on the first n variables, it is a cylinder subset, and therefore it is measurable.


    3. Show that for k \geqslant r, \mathbb{P}(T_r=k)=\binom{k-1}{r-1} p^r (1-p)^{k-r}
    So this is mainly where I'm stuck...
    (...)
    EDIT: For question 3. :
    So by intuition and "reasoning", it is equivalent to proving that \mathbb{P}(T_r=k) equals the probability that X_k=1 and the sum of the first k-1 X_i equals r-1 (which is a binomial distribution)
    But how to prove it clearly ?
    ThePerfectHacker gave the right explanation. As an addendum: Sometimes, formulas look more convincing (and are more quickly read), so I'd suggest you write the explanation in words, and then you write a translation of it into symbols: \{T_r=k\}=\{S_{k-1}=r-1\}\cap\{X_k=1\} (because of the explanation in words) and the events on the right-hand side are independent (by an "independence by packets" argument), so that the probabilities multiply, and there you are.
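To make this identity concrete, here is a short Python check (my own sketch, not from the thread) that \mathbb{P}(T_r=k)=\mathbb{P}(S_{k-1}=r-1)\cdot p, computing both sides exactly from the binomial pmf:

```python
from math import comb

def binom_pmf(j, n, p):
    """P(S_n = j) for S_n ~ Binomial(n, p)."""
    return comb(n, j) * p**j * (1 - p)**(n - j)

def negbin_pmf(k, r, p):
    """P(T_r = k) = C(k-1, r-1) p^r (1-p)^(k-r), for k >= r."""
    return comb(k - 1, r - 1) * p**r * (1 - p)**(k - r)

r, p = 4, 0.3
for k in range(r, 20):
    # {T_r = k} = {S_{k-1} = r-1} ∩ {X_k = 1}, and these events are independent,
    # so the probabilities multiply
    assert abs(negbin_pmf(k, r, p) - binom_pmf(r - 1, k - 1, p) * p) < 1e-12
```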


    And indeed, this is a negative binomial distribution. It has always intrigued me that the negative binomial distribution relates to a problem similar to the geometric distribution, while the hypergeometric distribution is close to the binomial distribution... I guess the name comes from the binomial coefficient, but I've no clue about the "negativity".

  4. #4
    Super Member
    Joined: Mar 2008
    Posts: 934
    Thanks: 33
    Awards: 1
    Quote Originally Posted by Laurent View Post
    [snip]
    It has always intrigued me that the negative binomial distribution relates to a problem similar to the geometric distribution, while the hypergeometric distribution is close to the binomial distribution... I guess the name comes from the binomial coefficient, but I've no clue about the "negativity".
    The name derives from the binomial theorem with a negative exponent:

    (1 + x)^{-r} = \sum_{k=0}^{\infty}\binom{-r}{k} x^k = \sum_{k=0}^{\infty} (-1)^k \binom{k+r-1}{k} x^k .

    If we use the definition of the negative binomial distribution given here:
    Negative binomial distribution - Wikipedia, the free encyclopedia

    f(r) = \binom{k+r-1}{k} p^r (1-p)^k,

    then the coefficient \binom{k+r-1}{k} is equal to (-1)^k \binom{-r}{k}.
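This coefficient identity is easy to verify numerically; here is a small Python check (my own, with a hand-rolled generalised binomial coefficient, since math.comb only accepts non-negative arguments):

```python
from math import comb, factorial, prod

def gen_binom(a, k):
    """Generalised binomial coefficient C(a, k) = a(a-1)...(a-k+1) / k!,
    defined for any real a and integer k >= 0."""
    return prod(a - i for i in range(k)) / factorial(k)

r = 3
for k in range(10):
    # (-1)^k * C(-r, k) should equal C(k+r-1, k)
    assert (-1) ** k * gen_binom(-r, k) == comb(k + r - 1, k)
```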

  5. #5
    Moo
    Joined: Mar 2008
    From: P(I'm here)=1/3, P(I'm there)=t+1/3
    Posts: 5,618
    Thanks: 6
    It looks weird to me that it's the negative binomial, since it is stated that k\geqslant r.

    Among independent trials you want to compute the probability that after k trials (where k\geq r, and r is a fixed number) you have got exactly r successes. Each trial succeeds with probability p. Therefore, you want to show P(T_r = k) = {{k-1}\choose {r-1}}p^r (1-p)^{k-r}.
    The problem is... it's not "after k trials, you have got exactly r successes", because it may be that after k-1 trials I already have r successes, and the k-th one is a failure. I think that's why I said it didn't look like a negative binomial distribution.
    But then I read in a book what it is: "the number of failures preceding the r-th success". Again, something seems wrong here: the mean of the geometric distribution is, in my problem, 1/p, whereas where I read the definition, the mean of the negative binomial is \frac{r(1-p)}{p}, and it looks like it should be \frac rp here...

    I'm so confused, lol! (though I had no difficulty finishing the problem :s) And I don't see why there is "negative" in it either...
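(A quick simulation, mine and not from the thread, shows the two means are both right and differ by exactly r, because the failure count is T_r - r:)

```python
import random

def sample_T(r, p):
    """First trial index at which the r-th Bernoulli(p) success occurs."""
    n = successes = 0
    while successes < r:
        n += 1
        successes += random.random() < p
    return n

r, p, trials = 3, 0.4, 100_000
random.seed(2)
samples = [sample_T(r, p) for _ in range(trials)]
mean_T = sum(samples) / trials
print(mean_T)      # ≈ r/p = 7.5 ("total trials" convention)
print(mean_T - r)  # ≈ r(1-p)/p = 4.5 ("failures" convention: T_r - r failures)
```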


    As for question 1., since S_n is the sum of random variables, it's a random variable (as the sum of measurable functions).
    And \inf \{n ~:~ S_n=r\} is just a particular S_n, so it's a random variable...
    Is it correct ?
    Actually, I guess it's the inf that threw me off

    Laurent : Now I've read your explanation... what is a cylinder subset ??
    And lol... it's easier to consider the subsets of \mathbb{N}, rather than the ones of \mathbb{R}.
    Thanks for that !


    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Oh, another question...

    After that, I'm asked to show that P(T_{r+1}-T_r=i ~,~ T_r=j)=P(S_i=1 ~,~ S_{i-1}=0)P(S_j=r ~,~ S_{j-1}=r-1)
    So for this question, I "translated" the RHS into P(T_1=i)P(T_r=j)=(1-p)^{i-1} p \cdot P(T_r=j)
    And then I said that the LHS = P(T_{r+1}-T_r=i \mid T_r=j)P(T_r=j) =P(X_{j+1}=\dots=X_{j+i-1}=0~,~X_{i+j}=1)P(T_r=j)
    Then it was clear that LHS=RHS.
    However, is this a correct method ?


    Thereafter, I have to deduce that T_{r+1}-T_r and T_r are independent.... If they are, it is obvious that we get the previous formula. But how to prove they're independent ??



    Thank you for your patience

  6. #6
    Super Member
    Joined: Mar 2008
    Posts: 934
    Thanks: 33
    Awards: 1
    Quote Originally Posted by awkward View Post
    The name derives from the binomial theorem with a negative exponent:

    (1 + x)^{-r} = \sum_{k=0}^{\infty}\binom{-r}{k} x^k = \sum_{k=0}^{\infty} (-1)^k \binom{k+r-1}{k} x^k .

    If we use the definition of the negative binomial distribution given here:
    Negative binomial distribution - Wikipedia, the free encyclopedia

    f(r) = \binom{k+r-1}{k} p^r (1-p)^k,

    then the coefficient \binom{k+r-1}{k} is equal to (-1)^k \binom{-r}{k}.
    Correction: I should have written

    f(k) = \binom{k+r-1}{k} p^r (1-p)^k

    above (f(k) instead of f(r)).

    Moo -- By some definitions of the negative binomial distribution, k is the total number of trials up to and including the r-th success; by other definitions, k is the number of failures before the r-th success (so the total number of trials is k+r). Wikipedia uses the second of these.

    The various definitions are a constant source of confusion.
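The two conventions differ only by a shift of r (total trials = failures + r), which a few lines of Python make explicit (my sketch; the function names are mine):

```python
from math import comb

def pmf_total_trials(k, r, p):
    """P(T_r = k): k = total number of trials up to the r-th success (k >= r)."""
    return comb(k - 1, r - 1) * p**r * (1 - p)**(k - r)

def pmf_failures(j, r, p):
    """Wikipedia's convention: j = number of failures before the r-th success."""
    return comb(j + r - 1, j) * p**r * (1 - p)**j

r, p = 3, 0.4
for k in range(r, 15):
    # same distribution, shifted by r: j failures <=> k = j + r total trials
    assert abs(pmf_total_trials(k, r, p) - pmf_failures(k - r, r, p)) < 1e-12
```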

  7. #7
    MHF Contributor
    Joined: Aug 2008
    From: Paris, France
    Posts: 1,174
    Quote Originally Posted by Moo View Post
    Laurent : Now I've read your explanation... what is a cylinder subset ??
    It is the kind of subset used to define the product \sigma-algebra \mathcal{A}^{\otimes \mathbb{N}} on X^{\mathbb{N}}, where (X,\mathcal{A}) is a measurable space. The cylinder subsets of X^{\mathbb{N}} are the subsets of the form \prod_{i=0}^n A_i \times X^{\mathbb{N}}=\{(x_i)_{i\geq 0}~|~x_0\in A_0,\, x_1\in A_1,\ldots, x_n\in A_n\} or, in other words, the subsets defined by conditions on a finite number of coordinates. In the present case, X=\{0,1\}.


    After that, I'm asked to show that P(T_{r+1}-T_r=i ~,~ T_r=j)=P(S_i=1 ~,~ S_{i-1}=0)P(S_j=r ~,~ S_{j-1}=r-1)
    So for this question, I "translated" the RHS into P(T_1=i)P(T_r=j)=(1-p)^{i-1} p \cdot P(T_r=j)
    And then I said that the LHS = P(T_{r+1}-T_r=i \mid T_r=j)P(T_r=j) =P(X_{j+1}=\dots=X_{j+i-1}=0~,~X_{i+j}=1)P(T_r=j)
    Then it was clear that LHS=RHS.
    However, is this a correct method ?
    The equality with the conditional probability is true but not clear; it almost assumes the next question as given. Instead, you can just describe the event and use independence (since \{T_r=j\} depends only on X_1,\ldots,X_j):

    P(T_r=j, T_{r+1}-T_r=i)=P(T_r=j, X_{j+1}=0, \ldots, X_{j+i-1}=0, X_{j+i}=1)= P(T_r=j)P(X_{j+1}=0, \ldots, X_{j+i-1}=0, X_{j+i}=1) =P(T_r=j)P(T_1=i)

    and the last equality is because (X_{j+k})_{k\geq 1} is distributed like (X_k)_{k\geq 1}.
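(A Monte Carlo check of this factorisation, mine and not part of the thread, for one choice of i, j, r, p:)

```python
import random
from math import comb

def sample_T_pair(r, p):
    """Return (T_r, T_{r+1}) along one i.i.d. Bernoulli(p) sequence."""
    n = successes = 0
    T_r = None
    while True:
        n += 1
        if random.random() < p:
            successes += 1
            if successes == r:
                T_r = n
            if successes == r + 1:
                return T_r, n

r, p, i, j, trials = 2, 0.5, 2, 4, 200_000
random.seed(3)
count = 0
for _ in range(trials):
    t_r, t_r1 = sample_T_pair(r, p)
    if t_r == j and t_r1 - t_r == i:
        count += 1
lhs = count / trials
# P(T_r=j) * P(T_1=i), using the formulas from the thread
rhs = comb(j - 1, r - 1) * p**r * (1 - p)**(j - r) * (1 - p)**(i - 1) * p
print(lhs, rhs)  # the two values should agree to about 2 decimal places
```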

    Thereafter, I have to deduce that T_{r+1}-T_r and T_r are independent.... If they are, it is obvious that we get the previous formula. But how to prove they're independent ??
    This is almost obvious from the previous formula. Indeed, the equality P(T_r=j,T_{r+1}-T_r=i)=P(T_r=j)P(T_1=i) shows that the distribution of the random variable (T_r,T_{r+1}-T_r) is a product distribution.

    You don't even have to say that T_{r+1}-T_r and T_1 have the same distribution; this comes as a by-product (anyway, it was almost proved in the previous question).

