# Thread: A number of different substrings within a string

1. ## A number of different substrings within a string

Let $S$ be a string $|S|$ its length. Let $\Sigma$ be an alphabet and $|\Sigma|$ its size. Both $|S|,|\Sigma| \in \mathbb{Z}^{+}$. Let $S[i,i+k-1]$ be a substring and $|S[i,i+k-1]|=k$ its size.

how does one prove that the highest number of different substrings of size $k$ is $|\Sigma|^{k}$

it is obvious but what i am looking for is the reasoning that gets me from $S[i,i+k-1]$ to $|\Sigma|^{k}$.

thank you

2. ## Re: A number of different substrings within a string

You write "substring" and talk about "string S" but there doesn't appear to be any requirement that your "S[i, i+ k-1]" be a substring of S. In particular, there is no reference to "S" or its length in $\displaystyle |\Sigma|^k$. $\displaystyle |\Sigma|^k$ is the number of distinct strings of length k from alphabet $\displaystyle \Sigma$. That has nothing to do with being a substring of S.

3. ## Re: A number of different substrings within a string

ok let me try to clarify

$S$ is a string
$|S|$ its length
$S[i]$ a character at position $i \in [1,|S|]$ s.t. $S[i] \in \Sigma$
$\Sigma$ is an alphabet
$|\Sigma|$ its size
$S[j,j+k-1]$ is a substring of $S$ s.t. $j\in [1,|S|-k+1]$
$k= |S[j,j+k-1]|$

did i forget something?

When you create a string of any size greater or equal to k then the number of different k-mers you get is smaller or equal to $|\Sigma|^{k}$ because as u put it $|\Sigma|^{k}$ is the number of distinct strings of length of length k that one can make form $\Sigma$. so u cannot have more then that. what connectioons am i missing to get to that point ??

Lemma as stated:

Given any string there cannot be more then $|\Sigma|^{k}$ distinct substrings of size k.

Proof : ??

4. ## Re: A number of different substrings within a string

YOU have already said that $\displaystyle |\sigma|^k$ is the total number of strings of length k. Clearly the number of strings of length k, that happen to be substrings of S, can't be larger than the number of all strings of length k. The number of substrings of S of length k is less than or equal to $\displaystyle |\Sigma|^k$.