1. [SOLVED] Geometric distribution

Hi !

I'm doing an exercise where I'm pretty stuck.

So here it is.

Recall that if Y follows a geometric distribution (parameter p), then $\mathbb{P}(Y=k)=(1-p)^{k-1}p$, for $k \geqslant 1$.

Let $(X_k)_{k\geqslant 1}$ be a sequence of independent, identically distributed random variables with Bernoulli($p$) distribution.
For any integer $n \geqslant 1$, we define $S_n=X_1+X_2+\dots+X_n$
And for any integer $r \geqslant 1$, we define $T_r=\inf \{n ~:~ S_n=r\}$

1. Prove that $T_r$ is a random variable. I'm lost on this one... But I still have to think a little more ><

2. This question asked to prove that $T_1$ follows a geometric distribution, which was easy (I explained it with words)

3. Show that for $k \geqslant r$, $\mathbb{P}(T_r=k)=\binom{k-1}{r-1} p^r (1-p)^{k-r}$
So this is mainly where I'm stuck...
It is asked to use a similar reasoning. I don't know what it would change...

My thought on this so far: $T_r=k$ implies that $X_k=1$. But this is intuition (a claim that is not proved yet).
I've tried an induction on k, but I don't think it's the correct way.
That's all I've got

Thanks for any help (not necessarily solutions ) provided !

EDIT: For question 3. :
So by intuition and "reasoning", it is equivalent to proving that $\mathbb{P}(T_r=k)$ equals the probability that $X_k=1$ and that the sum of the first $k-1$ $X_i$ equals $r-1$ (that sum follows a binomial distribution).
But how to prove it clearly ?

2. When it comes to probability theory I am an embarrassment. I want to try an answer and help you out if I can. But I really have no idea what I am doing. The only reason I am responding is that my mathematical intuition tells me that what I wrote below is what you want. However, I really have absolutely no idea what independent and identically distributed means, or any of that sort of stuff. So if I end up embarrassing myself then I promise never to post in the advanced probability section again .

Originally Posted by Moo
3. Show that for $k \geqslant r$, $\mathbb{P}(T_r=k)=\binom{k-1}{r-1} p^r (1-p)^{k-r}$
So this is mainly where I'm stuck...
It is asked to use a similar reasoning. I don't know what it would change...
I think this is referred to as the negative binomial.

Among independent trials you want to compute the probability that after $k$ (where $k\geq r$, $r$ is a fixed number) trials you have got exactly $r$ successes. Each success in these trials has probability $p$. Therefore, you want to show $P(T_r = k) = {{k-1}\choose {r-1}}p^r (1-p)^{k-r}$.

In order for the $r$-th success to happen at the $k$-th trial, there must have been $r-1$ successes in the first $k-1$ trials. Ah, but that is the standard binomial distribution, and so that gives us ${{k-1}\choose {r-1}}p^{r-1}(1-p)^{k-r}$. But you also want to have a success on the $k$-th trial; by independence, you need to multiply this expression by $p$ again. And this gives you exactly what you want to prove.
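For anyone who wants a numerical sanity check of this formula, here is a small Monte Carlo sketch in Python (the function names are mine, not from the exercise); it compares the empirical frequency of $\{T_r=k\}$ against the claimed pmf.

```python
import random
from math import comb

def sample_T_r(p, r, rng):
    """Simulate Bernoulli(p) trials; return the index of the r-th success."""
    n, successes = 0, 0
    while successes < r:
        n += 1
        if rng.random() < p:
            successes += 1
    return n

rng = random.Random(0)
p, r, k = 0.4, 3, 7
N = 200_000
freq = sum(sample_T_r(p, r, rng) == k for _ in range(N)) / N
exact = comb(k - 1, r - 1) * p**r * (1 - p)**(k - r)
# The two values should agree to roughly two decimal places.
print(freq, exact)
```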

3. Originally Posted by Moo
Let $(X_k)_{k\geqslant 1}$ be a sequence of independent, identically distributed random variables with Bernoulli($p$) distribution.
For any integer $n \geqslant 1$, we define $S_n=X_1+X_2+\dots+X_n$
And for any integer $r \geqslant 1$, we define $T_r=\inf \{n ~:~ S_n=r\}$

1. Prove that $T_r$ is a random variable. I'm lost on this one... But I still have to think a little more ><
Feel free to think a little more before you read the following:

" $T_r$ is a random variable" means that for any measurable subset $A$ of $\mathbb{N}^*$, the subset $\{T_r\in A\}$ of $\{0,1\}^{\mathbb{N}^*}$ is measurable.
The countable set $\mathbb{N}^*$ is endowed (as always) with the discrete $\sigma$-algebra, hence it suffices to consider the subsets $\{T_r=n\}$ for $n\in\mathbb{N}^*$ (then $\{T_r\in A\}$, for any subset $A$ of $\mathbb{N}^*$, is a countable union of subsets of the form $\{T_r=n\}$ and therefore is measurable).
The $\sigma$-algebra on $\{0,1\}^{\mathbb{N}^*}$ is the product $\sigma$-algebra. It is generated by cylinder subsets, i.e. subsets defined by conditions on a finite number of coordinates. Since $\{T_r=n\}$ depends only on the first $n$ variables, it is a cylinder subset, and therefore it is measurable.
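The key fact here, that $\{T_r=n\}$ is decided by the first $n$ coordinates alone, can be illustrated concretely. This little Python check (my own sketch, not part of the proof) verifies that the truth value of $T_r=n$ is the same for every continuation of a given length-$n$ prefix.

```python
import itertools

def T_r(xs, r):
    """First index n (1-based) with x_1 + ... + x_n = r, or None if never."""
    s = 0
    for n, x in enumerate(xs, start=1):
        s += x
        if s == r:
            return n
    return None

r, n = 2, 4
# For every length-n prefix, whether {T_r = n} holds must not depend
# on how the sequence continues afterwards.
for prefix in itertools.product([0, 1], repeat=n):
    answers = {T_r(list(prefix) + list(tail), r) == n
               for tail in itertools.product([0, 1], repeat=3)}
    assert len(answers) == 1
print("{T_r = n} is determined by the first n coordinates")
```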

3. Show that for $k \geqslant r$, $\mathbb{P}(T_r=k)=\binom{k-1}{r-1} p^r (1-p)^{k-r}$
So this is mainly where I'm stuck...
(...)
EDIT: For question 3. :
So by intuition and "reasoning", it is equivalent to proving that $\mathbb{P}(T_r=k)$ equals the probability that $X_k=1$ and that the sum of the first $k-1$ $X_i$ equals $r-1$ (that sum follows a binomial distribution).
But how to prove it clearly ?
ThePerfectHacker gave the right explanation. As an addendum: Sometimes, formulas look more convincing (and are more quickly read), so I'd suggest you write the explanation in words, and then you write a translation of it into symbols: $\{T_r=k\}=\{S_{k-1}=r-1\}\cap\{X_k=1\}$ (because of the explanation in words) and the events on the right-hand side are independent (by an "independence by packets" argument), so that the probabilities multiply, and there you are.
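The event identity $\{T_r=k\}=\{S_{k-1}=r-1\}\cap\{X_k=1\}$ can also be checked on random sample paths; here is a hedged Python sketch (the helper name is mine) that asserts the two events coincide path by path.

```python
import random

def check_event_identity(p, r, k, trials, seed=0):
    """On random sample paths, check {T_r = k} equals
    {S_{k-1} = r-1 and X_k = 1}."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [1 if rng.random() < p else 0 for _ in range(k + 5)]
        # T_r = first (1-based) index where the running sum hits r.
        count, t_r = 0, None
        for n, x in enumerate(xs, start=1):
            count += x
            if count == r:
                t_r = n
                break
        left = (t_r == k)
        right = (sum(xs[:k - 1]) == r - 1) and (xs[k - 1] == 1)
        assert left == right
    return True

print(check_event_identity(p=0.3, r=2, k=5, trials=10_000))
```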

And indeed, this is a negative binomial distribution. It has always intrigued me that the negative binomial distribution relates to a problem similar to the geometric distribution, while the hypergeometric distribution is close to the binomial distribution... I guess the name comes from the binomial coefficient, but I've no clue about the "negativity".

4. Originally Posted by Laurent
[snip]
It has always intrigued me that the negative binomial distribution relates to a problem similar to the geometric distribution, while the hypergeometric distribution is close to the binomial distribution... I guess the name comes from the binomial coefficient, but I've no clue about the "negativity".
The name derives from the binomial theorem with a negative exponent:

$(1 + x)^{-r} = \sum_{k=0}^{\infty}\binom{-r}{k} x^k = \sum_{k=0}^{\infty} (-1)^k \binom{k+r-1}{k} x^k$.

If we use the definition of the negative binomial distribution given here:
Negative binomial distribution - Wikipedia, the free encyclopedia

$f(r) = \binom{k+r-1}{k} p^r (1-p)^k$,

then the coefficient $\binom{k+r-1}{k}$ is equal to $(-1)^k \binom{-r}{k}$.
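That coefficient identity is easy to verify mechanically; a small Python check (using exact rational arithmetic, helper name mine) over a range of $r$ and $k$:

```python
from fractions import Fraction
from math import comb

def gen_binom(a, k):
    """Generalized binomial coefficient a(a-1)...(a-k+1)/k!,
    which makes sense for a negative upper argument a."""
    result = Fraction(1)
    for i in range(k):
        result *= Fraction(a - i, i + 1)
    return result

# Check (-1)^k * C(-r, k) == C(k+r-1, k) for small r and k.
for r in range(1, 6):
    for k in range(8):
        assert (-1)**k * gen_binom(-r, k) == comb(k + r - 1, k)
print("coefficient identity verified")
```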

5. Looks weird to me that it's the negative binomial, since it is stated that $k\geqslant r$

Among independent trials you want to compute the probability that after $k$ (where $k\geq r$, $r$ is a fixed number) trials you have got exactly $r$ successes. Each success in these trials has probability $p$. Therefore, you want to show $P(T_r = k) = {{k-1}\choose {r-1}}p^r (1-p)^{k-r}.$
The problem is... it's not "after $k$ trials, you have got exactly $r$ successes", because it may be: after $k-1$ trials, I have $r$ successes, and the $k$-th one is a failure? I think this is why I said it didn't look like a negative binomial distribution
But then I read in a book what it was: "the number of failures preceding the $r$-th success". Again, there's something going wrong here. The mean of the geometric distribution is defined, in my problem, as $1/p$. Whereas in the place where I read the definition, the mean of the negative binomial is $\frac{r(1-p)}{p}$. And it looks like it's $\frac{r}{p}$ here...

I'm so confused lol! (though I had no difficulties finishing the problem :s) And I don't see either why there is "negative" in it...
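The two means are actually consistent once you keep track of what is being counted: $T_r$ counts trials (mean $r/p$), while $F=T_r-r$ counts failures (the book's convention, mean $r(1-p)/p$). A quick numeric check with the pmf from question 3 (function name mine, purely illustrative):

```python
from math import comb

def mean_trials(p, r, cutoff=5_000):
    """E[T_r] computed by summing k * P(T_r = k) with the pmf from question 3."""
    return sum(k * comb(k - 1, r - 1) * p**r * (1 - p)**(k - r)
               for k in range(r, cutoff))

p, r = 0.5, 3
m = mean_trials(p, r)
# T_r counts trials: mean r/p.  F = T_r - r counts failures before the
# r-th success: mean r/p - r = r(1-p)/p, matching the book's formula.
print(m, r / p, m - r, r * (1 - p) / p)
```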

As for question 1., since $S_n$ is the sum of random variables, it's a random variable (as the sum of measurable functions).
And $\inf \{n ~:~ S_n=r\}$ is just a particular $S_n$, so it's a random variable...
Is it correct ?
Actually, I guess it's the inf that threw me off

Laurent : Now I've read your explanation... what is a cylinder subset ??
And lol... it's easier to consider the subsets of $\mathbb{N}$, rather than the ones of $\mathbb{R}$.
Thanks for that !

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Oh, another question...

After that, I'm asked to show that $P(T_{r+1}-T_r=i ~,~ T_r=j)=P(S_i=1 ~,~ S_{i-1}=0)P(S_j=r ~,~ S_{j-1}=r-1)$
So for this question, I "translated" the RHS into $P(T_1=i)P(T_r=j)=(1-p)^{i-1} p \cdot P(T_r=j)$
And then I said that the LHS = $P(T_{r+1}-T_r=i \mid T_r=j)P(T_r=j)$ $=P(X_{j+1}=\dots=X_{j+i-1}=0~,~X_{j+i}=1)P(T_r=j)$
Then it was clear that LHS=RHS.
However, is this a correct method ?

Thereafter, I have to deduce that $T_{r+1}-T_r$ and $T_r$ are independent.... If they are, it is obvious that we get the previous formula. But how to prove they're independent ??

6. Originally Posted by awkward
The name derives from the binomial theorem with a negative exponent:

$(1 + x)^{-r} = \sum_{k=0}^{\infty}\binom{-r}{k} x^k = \sum_{k=0}^{\infty} (-1)^k \binom{k+r-1}{k} x^k$.

If we use the definition of the negative binomial distribution given here:
Negative binomial distribution - Wikipedia, the free encyclopedia

$f(r) = \binom{k+r-1}{k} p^r (1-p)^k$,

then the coefficient $\binom{k+r-1}{k}$ is equal to $(-1)^k \binom{-r}{k}$.
Correction: I should have written

$f(k) = \binom{k+r-1}{k} p^r (1-p)^k$

Moo-- By some definitions of the negative binomial distribution, $k$ is the number of trials needed to get $r$ successes; by other definitions $k$ is the number of failures before the $r$-th success. Wikipedia uses the second of these.

The various definitions are a constant source of confusion.

7. Originally Posted by Moo
Laurent : Now I've read your explanation... what is a cylinder subset ??
It is the kind of subset used to define the product $\sigma$-algebra $\mathcal{A}^{\otimes \mathbb{N}}$ on $X^{\mathbb{N}}$, where $(X,\mathcal{A})$ is a measurable space. The cylinder subsets of $X^{\mathbb{N}}$ are the subsets of the form $\prod_{i=0}^n A_i \times X^{\mathbb{N}}=\{(x_i)_{i\geq 0}~|~x_0\in A_0,\, x_1\in A_1,\ldots, x_n\in A_n\}$ or, in other words, the subsets defined by conditions on a finite number of coordinates. In the present case, $X=\{0,1\}$.

After that, I'm asked to show that $P(T_{r+1}-T_r=i ~,~ T_r=j)=P(S_i=1 ~,~ S_{i-1}=0)P(S_j=r ~,~ S_{j-1}=r-1)$
So for this question, I "translated" the RHS into $P(T_1=i)P(T_r=j)=(1-p)^{i-1} p \cdot P(T_r=j)$
And then I said that the LHS = $P(T_{r+1}-T_r=i \mid T_r=j)P(T_r=j)$ $=P(X_{j+1}=\dots=X_{j+i-1}=0~,~X_{j+i}=1)P(T_r=j)$
Then it was clear that LHS=RHS.
However, is this a correct method ?
The equality with the conditional probability is true but not clear; it almost assumes the next question as given. Instead, you can just describe the event and use independence (since $\{T_r=j\}$ depends on $X_1,\ldots,X_j$ only):

$P(T_r=j,\ T_{r+1}-T_r=i)=P(T_r=j,\ X_{j+1}=0, \ldots, X_{j+i-1}=0,\ X_{j+i}=1)$
$=P(T_r=j)\,P(X_{j+1}=0, \ldots, X_{j+i-1}=0,\ X_{j+i}=1)=P(T_r=j)\,P(T_1=i)$

and the last equality is because $(X_{j+k})_{k\geq 1}$ is distributed like $(X_k)_{k\geq 1}$.

Thereafter, I have to deduce that $T_{r+1}-T_r$ and $T_r$ are independent.... If they are, it is obvious that we get the previous formula. But how to prove they're independent ??
This is almost obvious from the previous formula. Indeed, the equality $P(T_r=j,T_{r+1}-T_r=i)=P(T_r=j)P(T_1=i)$ shows that the distribution of the random variable $(T_r,T_{r+1}-T_r)$ is a product distribution.

You don't even have to say that $T_{r+1}-T_r$ and $T_1$ have the same distribution; this comes as a by-product (anyway, it was almost proved in the previous question).
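If you want to see the product structure of the joint distribution empirically, here is a Monte Carlo sketch (function names mine, not from the thread) estimating the joint probability and the product of the marginals:

```python
import random

def success_times(p, r, rng):
    """Simulate until the (r+1)-th success; return (T_r, T_{r+1})."""
    n, successes, t_r = 0, 0, None
    while successes < r + 1:
        n += 1
        if rng.random() < p:
            successes += 1
            if successes == r:
                t_r = n
    return t_r, n

rng = random.Random(1)
p, r, N = 0.5, 2, 200_000
samples = [success_times(p, r, rng) for _ in range(N)]
j, i = 3, 2
joint = sum(t == j and u - t == i for t, u in samples) / N
marg_tr = sum(t == j for t, u in samples) / N
marg_gap = sum(u - t == i for t, u in samples) / N
# If T_r and T_{r+1}-T_r are independent, joint ~ marg_tr * marg_gap.
print(joint, marg_tr * marg_gap)
```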