do you mean your sequence has a maximum run length of 3?
Hi Members,
Let $ X_1,.....X_n $be a sequence of independent binary random variables, with each $ X_i $ being equal to 1 with probability $p$. A maximum consecutive subsequence of 1's is called a run. For instance,the sequence
1, 0,1,1,1,0,0,1,1,1,0
has 3 runs. With R equal to the number of runs, find E[R], and Var(R).
How to solve this question?
basically since every string has the same probability, $(1/2)^n$, this is just a very lengthy counting problem.
To get you and idea consider the number of ways to get 1 run.
There will be (n-k+1) ways to get a single length k run and so the total number of single runs is the sum of these which I believe is
$\dfrac {n^2+n} 2$.
The way to count the number of length 2 runs is so complicated it's hard to even come up with a method. Try to imagine how you would do it.
It just gets worse from there.
I don't have an answer for you but some quick googling shows that this is significant enough of a problem to have generated textbooks on it.
That is only true is $p=2^{-1}$
Lets consider the example: the sequence 1, 0,1,1,1,0,0,1,1,1,0 has 3 runs.
The string length is eleven is that particular string occurs with probability $p^7(1-p)^4$
In strings of length eleven, $R=0,~1,~\cdots,~6$
In a string of length $n$ then $R=0,~1,~\cdots,~\left\lceil {\dfrac{n}{2}} \right\rceil $
Hi romsek,
I am providing you the answer given by the author of the book for expected value of Runs.
If, for I = 1,...,n, we let $ I_i $ equal 1 if a run begins at the data value $X_i$, and let it equal 0 otherwise, then
R = $\displaystyle\sum_{i=1}^n I_i $
Because
E[$I_i$] =P{$X_i $ =1} = p
E[$I_i$] =P{$X_i$ = 0,$ X_i$ = 1} =p(1-p) for$ i\geq 1$
It follows that
E[R] =$ \displaystyle\sum_{i=1}^n E[I_i] $ = p+ ( n - 1) p(1 - p)
Did you get any clue to compute Var(R) ?
Hi,
Your method to compute the Var(R) is correct. But I have not checked out your algebra.While I start to do that, have a look at the answer provided by the author.
To compute Var(R) suppose that n 2 and again use the represeation of Equation R= to obtain
Var(R)=
Because is a Bernoulli random variable, we have
Further, it is easy to see that and are independent when i <j-1 implying that
Cov( )=0, i< j-1
Moreover,as =0
Can You explain this step?I didn't understand how does negative sign arise on R.H.S.?
Hence, when n 2 we obtain
Var(R)=
=p(1-p)+(n-1)p(1-p)[1-p(1-p)] -2 (1-p)- 2(n-2) Here, last two terms are negative, can you explain me, how?
Hi,
Better solution than mine. What I was missing was the fact that $I_i$ and $I_j$ are independent for $i<j-1$. For your questions, note again $I_iI_{i+1}$ is identically 0. So $E(I_iI_{i+1})=0$ and so the covariance is $cov(I_iI_{i+1})=E(I_iI_{i+1})-E(I_i)E(I_{i+1})=-E(I_i)E(I_{i+1})<0$
This then makes the 3rd and 4th terms negative. Of course the complete variance is non-negative.