# Repeats and strings

• Mar 25th 2013, 11:21 AM
baxy77bax
Repeats and strings
Ok so i don't know how to prove the following:

Let S be a string and |S| its length.
Let [i..j] be an interval such that 0 < i <= |S| and 0 < j <= |S|. You can also refer to [i..j] as a substring of S.
Let [i..j] be a maximum repeat such that [i..j] is repeated and [i..j+1] and [i-1..j] are not.
Let A be an alphabet and every character in S is an element of A and every character in S is repeated at leas twice.
Let p be a non-negative integer.

Prove that the length of p number of overlapping maximum repeats in S is smaller then p number of concatenated maximum repeats .

so ma max repeat is

Code:

`|------------------------|`
Prove that
Code:

```|-------------------|         |---------------------| |<---------------------------->|```
is shorter than

Code:

```|-----------------||---------------------| |<-------------------------------------->|```
Now this is the case for p = 2 but p in what i want to prove is arbitrary.

How would you do this? would you just show that for non-overlapping intervals \$\displaystyle \cap_{i=1}^{p} \mbox{Interval} = 0\$ and therefore non-overlaps are longer then overlaps or would you first show that for p=2 overlapped intervals are smaller then two concatenated ones and then by induction on p proved that this holds for all p.

What confuses me if i choose to prove by induction on p how would i do it? how would i start, what would be my induction hypothesis and what would my cases be ?? This second question is much more important if someone could help me with it, because i have different examples that i would like to do but there are not much examples on how to apply induction in cases as the one described above.

Thank you

baxy

PS
i know that this is probably trivial but pleas help because this will clear many other confusions and i will be able to make my questions more precise and clear.
• Mar 25th 2013, 01:20 PM
emakarov
Re: Repeats and strings
Quote:

Originally Posted by baxy77bax
Let S be a string and |S| its length.
Let [i..j] be an interval such that 0 <= i <= |S| and 0 <= j <= |S|. You can also refer to [i..j] as a substring of S.
Let [i..j] be a maximum repeat such that [i..j] is repeated and [i..j+1] and [i-1..j] are not.
Let A be an alphabet and every character in S is an element of A and every character in S is repeated at leas twice.
Let p be a non-negative integer.

Prove that the length of p overlapping maximum repeats in S is smaller then p concatenated maximum repeats .

The problem statement is not sufficiently clear to me. First, if string indices start with 0, then the last index is |S| - 1, so the restriction on i, j should be 0 <= i, j < |S|. Next, what exactly do you mean by saying that a substring is repeated? What is the definition of the length of p substrings?
• Mar 25th 2013, 01:58 PM
baxy77bax
Re: Repeats and strings
you are right m it should be 0 < i and 0<j , my mistake.

Quote:

Next, what exactly do you mean by saying that a substring is repeated?
i mean that it appears at leas twice in S such that a substring [i..j+1] and [i-1..j] appear onli once. so S= abbabba , then abba is a maximum repeat it appears at position 1 and position 4.

Quote:

What is the definition of the length of p substrings?
well i don't know how to express myself, this is a tailored example that i thought if i saw how it is done i could do something about the rest of my problems. so if i have a case : abbababbacbabbdabab then abba[1] \$\displaystyle \cup\$ babb[5] is longer then abba[1] \$\displaystyle \cup\$ abab[4]. Note that both abba and babb are maximum repeats and they are of teh same size. the point i want to make is that if they overlap they are shorter the if they conc. this is obvious and trivial but how to show that if i fave p such intervals, that overlap that they are shorter then if i had conc. p concesitive max repeats. Note also if max repeats are of the different sizes a pair of max repeats that is conc. vill always be longer then a pair of overlapping max repeats since otherwise one or both max repeats in conc. case are not maximum. also note that since every character in S is repeated at leas twice. there will always be max repeats stacked one next to eachother.

so you will always have max repeats like this:

Code:

```              |------------------|                     d       |---------------|             c |-----------------||---------------------|           a                    b |<-------------------------------------->|```
so i want to show that if i pick a and c or a and d they will always be shorter then if i pick a and b. but the main point is how to shot this when i have more the 2 intervals, not like in the previous case where i only have two.

Thank you