1. ## [SOLVED] DNA Statistics

A DNA sequence of length 9 is built from 4 possible bases. Find the number of unique sequences.

Thanks!

2. You must explain a bit more about DNA molecules.
Are they a fixed length? Your example has length seven. But you use the exponent nine.
Can all the letters be the same? What are the requirements for the distribution of the letters?

Maybe you can give us a clear description.

3. Thank you so much for replying!

They have a fixed length of 9.
Yes, the letters can all be the same.
There are no requirements for the distribution.

The only requirement is to find the number of unique sequences without double counting (i.e., a sequence and the same sequence flipped around count as one).

4. Thank you for explaining.
The risk of overcounting comes from the palindromes.
Say we had “AATGCGTAA”, which is a palindrome.
If we reverse it we get the same string, so there is no need to divide by 2.

Now for each non-palindrome such as “ATCTTCGGA” there is also its reverse, “AGGCTTCTA”, and we want to count the pair only once.

Now there are $4^5$ palindromes of length nine made of four letters, because the first five letters determine the last four.
Then there are $4^9 - 4^5$ non-palindromes.

So, if I understand this correctly, there are $4^5 + \frac{4^9 - 4^5}{2} = 131{,}584$ distinct DNA sequences.

Again, I may not fully understand your example.
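The count above can be checked with a short script. This is only a sketch under the assumptions stated in this thread: four letters, fixed length, and a string and its reverse counted as one. The function names `count_up_to_reversal` and `brute_force` are made up for illustration.

```python
from itertools import product


def count_up_to_reversal(n, k=4):
    """Closed-form count of length-n strings over k letters,
    where a string and its reverse are treated as the same."""
    # A palindrome is fixed by its first ceil(n/2) letters.
    palindromes = k ** ((n + 1) // 2)
    # Each non-palindrome pairs up with its (different) reverse.
    non_palindromes = k ** n - palindromes
    return palindromes + non_palindromes // 2


def brute_force(n, alphabet="ACGT"):
    """Enumerate every string and keep one representative per
    {string, reversed string} pair."""
    seen = set()
    for letters in product(alphabet, repeat=n):
        s = "".join(letters)
        # The smaller of s and its reverse is the representative.
        seen.add(min(s, s[::-1]))
    return len(seen)


if __name__ == "__main__":
    print(count_up_to_reversal(9))
    print(brute_force(9))
```

Both calls should agree with $4^5 + \frac{4^9 - 4^5}{2}$; the brute-force version only loops over $4^9$ strings, so it runs quickly.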

5. Thanks!

But I know that the answer is supposed to be huge (more than 10 million?).

6. Originally Posted by Starry
But I know that the answer is supposed to be huge (more than 10 million?).
That is impossible for the problem as you put it.
Look, there are only $4^9 = 262{,}144$ possible strings of length nine made from the four letters $A, C, G, T$, even before identifying reversals.
There must be more to it. Longer strings?
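To get a feel for the "longer strings" question, here is a rough sketch that searches for the shortest length whose count passes a given target. It assumes the same rules as above (four letters, a string and its reverse counted once); the 10 million target is just the figure quoted, and the helper names are hypothetical.

```python
def count_up_to_reversal(n, k=4):
    """Length-n strings over k letters, a string and its reverse counted once."""
    palindromes = k ** ((n + 1) // 2)
    # Same value as palindromes + (k**n - palindromes) / 2.
    return (k ** n + palindromes) // 2


def smallest_length_exceeding(target, k=4):
    """Smallest n for which the count is strictly greater than target."""
    n = 1
    while count_up_to_reversal(n, k) <= target:
        n += 1
    return n


if __name__ == "__main__":
    n = smallest_length_exceeding(10_000_000)
    print(n, count_up_to_reversal(n))
```

Under these assumptions the count first passes 10 million at length 13, so a length-nine problem cannot produce an answer that large.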

PS