Math Help - Sequence Probability

  1. #1
    Newbie cryptocrow's Avatar
    Joined
    Jun 2008
    Posts
    11

    Sequence Probability

    Hi,

    I would like to compute a probability for the following problem:

    Variables:
    + a sequence defined by an alphabet of four characters
    + a specific short sequence s1
    + a specific short sequence s2
    + a gap of non-specific characters between s1 and s2 of length g

    For example:
    ...2301210[3203]2013220[3123]00132010...
    + s1 = 3203 (the first bracketed sequence)
    + s2 = 3123 (the second bracketed sequence)
    + g = 7 (the length of the gap between them, here 2013220)

    What is the probability that s1 and s2 occur with a gap that does not exceed 100?
    Someone suggested using the Poisson distribution, but I'm not sure...
    Last edited by cryptocrow; April 6th 2009 at 05:45 PM.

  2. #2
    Newbie cryptocrow's Avatar
    Joined
    Jun 2008
    Posts
    11
    As in the example, if s1 (or s2) has length n, then the probability that n randomly generated characters spell out s1 is 4^-n, correct? I'm just not sure how to take into account the gap of characters between s1 and s2...
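
    A minimal sketch of that arithmetic, assuming every character is drawn independently and uniformly from the four-character alphabet (an assumption; the thread only says the sequence is random). The function names are made up for illustration. For one fixed gap length the gap characters are unconstrained, so they contribute a factor of 1 and only the lengths of s1 and s2 matter:

```python
from fractions import Fraction

ALPHABET_SIZE = 4  # four-character alphabet, as in the problem statement


def match_probability(length: int) -> Fraction:
    """Probability that `length` independent, uniform characters equal one fixed string."""
    return Fraction(1, ALPHABET_SIZE ** length)


def fixed_gap_probability(len_s1: int, len_s2: int) -> Fraction:
    """Probability that s1 sits at one given position and s2 follows after one given gap.

    The gap characters may be anything, so each contributes probability 1.
    Assumes independent, uniformly distributed characters.
    """
    return match_probability(len_s1) * match_probability(len_s2)


if __name__ == "__main__":
    print(match_probability(4))         # 4^-4
    print(fixed_gap_probability(4, 4))  # 4^-8
```

    With the example lengths (4 and 4), a match at one given position with one given gap length has probability 4^-8 = 1/65536. This covers a single placement only, not the chance of seeing such a pair somewhere in a long sequence.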

  3. #3
    MHF Contributor

    Joined
    Aug 2008
    From
    Paris, France
    Posts
    1,174
    Something's unclear in your problem: which sequence are you considering? Is it finite or infinite? What length? Do you consider the first occurrence of s1 (followed by a gap, and then s2), or just any one? Or all?

    For instance, if there is s1, a large gap, then s2, and later s1, a small gap and finally s2, does that count?

  4. #4
    Newbie cryptocrow's Avatar
    Joined
    Jun 2008
    Posts
    11

    Question

    The random sequence of characters is finite and of a given length. For example, the sequence may be 3,000,000 characters long.

    Let an occurrence be defined as ...s1 then gap then s2... where 0 <= gap length <= 100. So you are quite correct in your definition.

    Alternatively:
    + Let the string of characters S = s1 + gap + s2.

    Given:
    + s1 = 3203
    + s2 = 3123
    + gap = any sequence of characters (0 <= length of gap <= 100)
    + then S = s1 + gap + s2
    + therefore the following are possible S sequences:
    - ...3203031221013120101231323133210123103123...
    - ...32032012133101001102330101203123123...
    - ...320303123...
    - ...32033123...
    - ...320302233121013202103123...
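
    The definition of S can be checked mechanically. A rough sketch, with a hypothetical helper name `find_occurrences`, that scans a string for every s1 followed by s2 within the allowed gap (s1 = 3203, as in the example sequences):

```python
def find_occurrences(seq: str, s1: str, s2: str, max_gap: int = 100):
    """Return (start_of_s1, gap_length) for every s1 ... s2 pair with gap <= max_gap."""
    hits = []
    start = seq.find(s1)
    while start != -1:
        gap_start = start + len(s1)
        # Try every allowed gap length after this copy of s1.
        for gap in range(max_gap + 1):
            if seq.startswith(s2, gap_start + gap):
                hits.append((start, gap))
        # Move on by one character so overlapping copies of s1 are not missed.
        start = seq.find(s1, start + 1)
    return hits


if __name__ == "__main__":
    for example in ("32033123", "320303123", "320302233121013202103123"):
        print(example, find_occurrences(example, "3203", "3123"))
```

    A sequence contains an S exactly when this list is non-empty. One sequence can hold several overlapping occurrences, which matters when deciding whether to count occurrences or only ask whether at least one exists.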

    Then, what is the probability of an S occurring in a random sequence of n characters?
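
    One possible way to attack this, following the Poisson suggestion from the first post. This is a sketch under assumptions, not a settled answer: characters are taken to be independent and uniform over the four symbols, and the Poisson step treats the individual placements as rare and roughly independent, which is only approximately true because placements overlap. The expected count itself follows from linearity of expectation. All function names are made up for illustration, and the small simulation is there only to sanity-check the estimate on short sequences.

```python
import math
import random
import re


def expected_pairs(n, len_s1, len_s2, max_gap=100, alphabet_size=4):
    """Expected number of (start, gap) placements where s1 and s2 both match."""
    p_match = alphabet_size ** -(len_s1 + len_s2)  # gap characters are unconstrained
    placements = 0
    for gap in range(max_gap + 1):
        span = len_s1 + gap + len_s2
        placements += max(0, n - span + 1)  # start positions where the span fits
    return placements * p_match


def poisson_estimate(n, len_s1, len_s2, max_gap=100, alphabet_size=4):
    """Approximate P(at least one S) as 1 - exp(-expected count)."""
    lam = expected_pairs(n, len_s1, len_s2, max_gap, alphabet_size)
    return 1.0 - math.exp(-lam)


def simulate(n, s1, s2, max_gap=100, trials=1000, seed=0, alphabet="0123"):
    """Monte Carlo estimate of P(at least one S) for random sequences of length n."""
    pattern = re.compile(re.escape(s1) + ".{0,%d}" % max_gap + re.escape(s2))
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        seq = "".join(rng.choices(alphabet, k=n))
        if pattern.search(seq):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(expected_pairs(3_000_000, 4, 4))
    print(poisson_estimate(3_000_000, 4, 4))
    print(poisson_estimate(500, 4, 4), simulate(500, "3203", "3123"))
```

    For n = 3,000,000 with s1 and s2 of length 4, the expected number of placements comes out at roughly 4,600, so under these assumptions at least one S is all but certain. The question only becomes delicate for much shorter sequences or longer s1 and s2.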

    Thanks for pointing out anything that is unclear in the problem... I'm having a tough time trying to define the problem in the first place...
    Last edited by cryptocrow; April 8th 2009 at 10:30 AM.
