1. ## Runs

Hi Members,
Let $X_1,.....X_n$be a sequence of independent binary random variables, with each $X_i$ being equal to 1 with probability $p$. A maximum consecutive subsequence of 1's is called a run. For instance,the sequence

1, 0,1,1,1,0,0,1,1,1,0
has 3 runs. With R equal to the number of runs, find E[R], and Var(R).
How to solve this question?

2. ## Re: Runs

do you mean your sequence has a maximum run length of 3?

3. ## Re: Runs

Originally Posted by romsek
do you mean your sequence has a maximum run length of 3?
No, Run length may exceed 3 also.

4. ## Re: Runs

Originally Posted by Vinod
No, Run length may exceed 3 also.
so you mean the number of occurrences of groups of 1 or more consecutive ones?

5. ## Re: Runs

Yes, that's correct.

6. ## Re: Runs

Originally Posted by Vinod
Yes, that's correct.
basically since every string has the same probability, $(1/2)^n$, this is just a very lengthy counting problem.

To get you and idea consider the number of ways to get 1 run.

There will be (n-k+1) ways to get a single length k run and so the total number of single runs is the sum of these which I believe is

$\dfrac {n^2+n} 2$.

The way to count the number of length 2 runs is so complicated it's hard to even come up with a method. Try to imagine how you would do it.

It just gets worse from there.

I don't have an answer for you but some quick googling shows that this is significant enough of a problem to have generated textbooks on it.

7. ## Re: Runs

Originally Posted by Vinod
Let $X_1,.....X_n$be a sequence of independent binary random variables, with each $X_i$ being equal to 1 with probability $p$. A maximum consecutive subsequence of 1's is called a run.
For instance,the sequence 1, 0,1,1,1,0,0,1,1,1,0 has 3 runs. With R equal to the number of runs, find E[R], and Var(R).

Originally Posted by romsek
basically since every string has the same probability, $\color{red}{(1/2)^n}$, this is just a very lengthy counting problem.
That is only true is $p=2^{-1}$

Lets consider the example: the sequence 1, 0,1,1,1,0,0,1,1,1,0 has 3 runs.
The string length is eleven is that particular string occurs with probability $p^7(1-p)^4$
In strings of length eleven, $R=0,~1,~\cdots,~6$

In a string of length $n$ then $R=0,~1,~\cdots,~\left\lceil {\dfrac{n}{2}} \right\rceil$

8. ## Re: Runs

Originally Posted by Plato
That is only true is $p=2^{-1}$

Lets consider the example: the sequence 1, 0,1,1,1,0,0,1,1,1,0 has 3 runs.
The string length is eleven is that particular string occurs with probability $p^7(1-p)^4$
In strings of length eleven, $R=0,~1,~\cdots,~6$

In a string of length $n$ then $R=0,~1,~\cdots,~\left\lceil {\dfrac{n}{2}} \right\rceil$
you're correct as usual.

After a multi-day dialog I forgot the original problem spec.

9. ## Re: Runs

Originally Posted by romsek
basically since every string has the same probability, $(1/2)^n$, this is just a very lengthy counting problem.

To get you and idea consider the number of ways to get 1 run.

There will be (n-k+1) ways to get a single length k run and so the total number of single runs is the sum of these which I believe is

$\dfrac {n^2+n} 2$.

The way to count the number of length 2 runs is so complicated it's hard to even come up with a method. Try to imagine how you would do it.

It just gets worse from there.

I don't have an answer for you but some quick googling shows that this is significant enough of a problem to have generated textbooks on it.
Hi romsek,
I am providing you the answer given by the author of the book for expected value of Runs.
If, for I = 1,...,n, we let $I_i$ equal 1 if a run begins at the data value $X_i$, and let it equal 0 otherwise, then

R = $\displaystyle\sum_{i=1}^n I_i$
Because
E[$I_i$] =P{$X_i$ =1} = p
E[$I_i$] =P{$X_i$ = 0,$X_i$ = 1} =p(1-p) for$i\geq 1$

It follows that

E[R] =$\displaystyle\sum_{i=1}^n E[I_i]$ = p+ ( n - 1) p(1 - p)

Did you get any clue to compute Var(R) ?

10. ## Re: Runs

Hi,
I'm quite certain the following method to compute the variance of R is correct, but I'm only about 90% confident in my algebra. For p=0 or p=1, R is constant. At least the formula show the variance is 0 in this case.

11. ## Re: Runs

Originally Posted by johng
Hi,
I'm quite certain the following method to compute the variance of R is correct, but I'm only about 90% confident in my algebra. For p=0 or p=1, R is constant. At least the formula show the variance is 0 in this case.

Hi,
Your method to compute the Var(R) is correct. But I have not checked out your algebra.While I start to do that, have a look at the answer provided by the author.
To compute Var(R) suppose that n $\geq$ 2 and again use the represeation of Equation R= $\displaystyle\sum_{i=1}^n I_i$ to obtain

Var(R)= $\displaystyle\sum_{i=1}^n Var(I_i) + 2 \sum_{j=2}^n\displaystyle\sum_{i
Because $I_i$ is a Bernoulli random variable, we have
$Var(I_i)=E[I_i](1-E[I_i])$
Further, it is easy to see that $I_i$ and $I_j$ are independent when i <j-1 implying that
Cov( $I_i,I_j$)=0, i< j-1
Moreover,as $I_{j-1} I_j$=0
$Cov(I_{j-1},I_j)=-E[I_{j-1}]E[I_j]=-E[I_{j-1}]p(1-p)$Can You explain this step?I didn't understand how does negative sign arise on R.H.S.?
Hence, when n $\geq$ 2 we obtain
Var(R)= $Var(I_1) +\displaystyle\sum_{j=2}^n Var(I_j) +2 Cov(I_1,I_2) + 2 \displaystyle\sum_{j=3}^n Cov(I_{j-1},I_j)$
=p(1-p)+(n-1)p(1-p)[1-p(1-p)] -2 $p^2$(1-p)- 2(n-2) $p^2(1-p)^2$Here, last two terms are negative, can you explain me, how?

12. ## Re: Runs

Hi,
Better solution than mine. What I was missing was the fact that $I_i$ and $I_j$ are independent for $i<j-1$. For your questions, note again $I_iI_{i+1}$ is identically 0. So $E(I_iI_{i+1})=0$ and so the covariance is $cov(I_iI_{i+1})=E(I_iI_{i+1})-E(I_i)E(I_{i+1})=-E(I_i)E(I_{i+1})<0$
This then makes the 3rd and 4th terms negative. Of course the complete variance is non-negative.