Results 1 to 6 of 6

Math Help - Expected number of matches

  1. #1
    Newbie
    Joined
    Aug 2008
    Posts
    6

    Expected number of matches

    Hi all,

    I need to solve a mathematical problem for an Engineering purpose.
    Imagine I have a string of bits of length N.
    And I have a bit sequence pattern of length M.

    I would like to know what is the expected number of pattern matches in the string. You can assume 0 and 1's are equally likely and there bits are totally uncorrelated.

    An approximation would also help.

    Thanks!
    Follow Math Help Forum on Facebook and Google+

  2. #2
    Flow Master
    mr fantastic's Avatar
    Joined
    Dec 2007
    From
    Zeitgeist
    Posts
    16,948
    Thanks
    5
    Quote Originally Posted by mermeladeK View Post
    Hi all,

    I need to solve a mathematical problem for an Engineering purpose.
    Imagine I have a string of bits of length N.
    And I have a bit sequence pattern of length M.

    I would like to know what is the expected number of pattern matches in the string. You can assume 0 and 1's are equally likely and there bits are totally uncorrelated.

    An approximation would also help.

    Thanks!
    Move the bit sequence pattern along the string. There are n = M - (N - 1) = M - N + 1 positions. Test for a match at each position. The probability of a match at each position is p = \left( \frac{1}{2} \right)^M.

    Let X be the random variable number of matches. Then X ~ Binomial(n, p).

    So E(X) = np = (M - N + 1) \, \left( \frac{1}{2} \right)^M.
    Follow Math Help Forum on Facebook and Google+

  3. #3
    Newbie
    Joined
    Aug 2008
    Posts
    6
    Hi Mr. Fantastic.

    Thank you very much for the help though...

    Isn't what you tell me an upper level for E(matches) ?

    If you have in mind for example that a sequence has the pattern 10101010. Once there is a pattern found in the string. The probability of finding the same pattern just after it is 0, and not 0.5^M .

    So actually there is some sort of correlation to be concerned about.
    Last edited by mermeladeK; August 14th 2008 at 05:44 AM.
    Follow Math Help Forum on Facebook and Google+

  4. #4
    Super Member
    Joined
    Mar 2008
    Posts
    934
    Thanks
    33
    Awards
    1
    Quote Originally Posted by mermeladeK View Post
    Hi all,

    I need to solve a mathematical problem for an Engineering purpose.
    Imagine I have a string of bits of length N.
    And I have a bit sequence pattern of length M.

    I would like to know what is the expected number of pattern matches in the string. You can assume 0 and 1's are equally likely and there bits are totally uncorrelated.

    An approximation would also help.

    Thanks!
    Hi mermeladeK,

    Define
    X_i = <br />
\begin{cases}<br />
1    &\text{if the pattern matches the string at position i}\\<br />
0    &\text{otherwise}<br />
\end{cases}
    for i = 1, 2, \dots ,N-M+1.

    The probability of a match at position i is 2^{-M}, so
    P(X_i = 1) = 2^{-M}
    and
    E(X_i) = 2^{-M}.

    So the expected number of pattern matches is
    E(\sum_{i=1}^{N-M+1} X_i ) = \sum_{i=1}^{N-M+1} E(X_i) = (N-M+1)\, 2^{-M}.

    Here we have used the theorem that E(X+Y) = E(X) + E(Y). It's important to note that this theorem is true even if X and Y are not independent, which is good for us because the X_i's are not independent. (That's why the sum is not binomially distributed, as you noted previously.)

    This is all assuming that the string and pattern consist of independent random variables which are equally likely to be 0 or 1. If you have additional information, e.g. if you know that the pattern is 10101010 (as in your reply to Mr. Fantastic), that's a different problem.
    Follow Math Help Forum on Facebook and Google+

  5. #5
    Newbie
    Joined
    Aug 2008
    Posts
    6
    Hi awkward,

    Thanks for your helpful comment.
    Actually yes, the sequence patern I was looking at was 10001011 and 01110100 (inverse). That is to say if any of these paterns were
    found it would mean there is a match.

    After you clarifications I deduce that:
    E(X_i)!=2^-M when the conditional probabilites are taken in account. When there is no more probable pattern, the computation of E(X_i) still gives 2^-M.
    But with a specific case this might change, am I right?

    If that is it, I know how to compute it now.
    Follow Math Help Forum on Facebook and Google+

  6. #6
    Super Member
    Joined
    Mar 2008
    Posts
    934
    Thanks
    33
    Awards
    1
    Quote Originally Posted by mermeladeK View Post
    Hi awkward,

    Thanks for your helpful comment.
    Actually yes, the sequence patern I was looking at was 10001011 and 01110100 (inverse). That is to say if any of these paterns were
    found it would mean there is a match.

    After you clarifications I deduce that:
    E(X_i)!=2^-M when the conditional probabilites are taken in account. When there is no more probable pattern, the computation of E(X_i) still gives 2^-M.
    But with a specific case this might change, am I right?

    If that is it, I know how to compute it now.
    Well, I guess now that I've thought about it, the actual pattern is irrelevant to the final answer, since the probability of a match is still 2^{-M}.
    Follow Math Help Forum on Facebook and Google+

Similar Math Help Forum Discussions

  1. Expected number problem
    Posted in the Statistics Forum
    Replies: 5
    Last Post: August 12th 2011, 04:24 AM
  2. Expected number of bounces
    Posted in the Advanced Statistics Forum
    Replies: 5
    Last Post: November 30th 2009, 03:26 PM
  3. Expected number of collisions
    Posted in the Statistics Forum
    Replies: 0
    Last Post: November 29th 2009, 04:46 PM
  4. 10 football matches.......
    Posted in the Discrete Math Forum
    Replies: 3
    Last Post: October 30th 2009, 09:51 AM
  5. Matches
    Posted in the Discrete Math Forum
    Replies: 4
    Last Post: June 5th 2008, 12:08 AM

Search Tags


/mathhelpforum @mathhelpforum