# Thread: Prob of a letter bag from a set of letters

1. ## Prob of a letter bag from a set of letters

Say I have a set of n letters (say n = 80, for example)
(some letters may occur more than once in this set)

I randomly choose r letters from the n (say r = 5)
What is the probability of seeing the letter bag
'aabcd'? (It's a bag, so order doesn't matter; 'adcab' is also ok.)
Note that there are 2 a's in this bag.

The general form of the bag is:
k_a of letter a
k_b of letter b
..
all total to k

Assuming that there are n_a of letters a
n_b of letters b
n_c of letters c
and so on ..
from the set of n letters

My naive answer is
[n_aCk_a * n_bCk_b * ... ] / nCr

And I know this is not correct
Can anybody help me get the correct answer for this?

Thanks

2. ## Re: Prob of a letter bag from a set of letters

Hey lettertha.

After you pick a letter out, do you put it back (sampling with replacement) or do you keep it (sampling without replacement)?

If you replace it, then you have a multinomial distribution. If you don't replace it, you have a multivariate hypergeometric distribution.

You can look at the derivations of these distributions to understand where the combinatorial formulas come from.

3. ## Re: Prob of a letter bag from a set of letters

Originally Posted by lettertha
Say I have a set of letters totaling n (say 80 for ex)
(some letters may occur more than once in this set)
I randomly choose r letters from the n (say r = 5)
What is the probability of seeing the letter bag
'aabcd' (it's a bag so order doesn't matter, 'adcab' is also ok)
Note that there are 2 a's in this bag.
The general form of the bag is:
k_a of letter a
k_b of letter b

Totally disregarding reply #2, your question is totally meaningless.

You must tell us about the distribution of the letters in the bag.

For example, in your example the bag could contain each letter of the English alphabet once and 54 "X's".

You must supply more detail. As written your post is meaningless.
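
A rough Python sketch of why the composition matters, assuming draws without replacement. `bag_probability` is a made-up helper, and the second pool (16 each of a to e) is purely hypothetical, chosen only because it also has 80 letters.

```python
from collections import Counter
from math import comb, prod
import string

def bag_probability(pool, bag):
    """Chance that len(bag) letters drawn without replacement from `pool`
    form exactly the multiset `bag` (both given as strings)."""
    pool_counts = Counter(pool)
    bag_counts = Counter(bag)
    n = sum(pool_counts.values())
    r = sum(bag_counts.values())
    ways = prod(comb(pool_counts[ch], k) for ch, k in bag_counts.items())
    return ways / comb(n, r)

# Pool from the post above: each English letter once, plus 54 X's (80 letters).
pool_one = string.ascii_lowercase + "X" * 54
# Hypothetical alternative with the same total: 16 each of a, b, c, d, e.
pool_two = "abcde" * 16

p_one = bag_probability(pool_one, "aabcd")
p_two = bag_probability(pool_two, "aabcd")
print(p_one)  # 0.0, because pool_one holds only a single 'a'
print(p_two)
```

Same n, same r, same target bag, yet one pool makes 'aabcd' impossible and the other does not.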

4. ## Re: Prob of a letter bag from a set of letters

OK, here are the details:

I have in my collection (big letter bag):
n_a number of a's
n_b number of b's
n_c number of c's

all of them sum up to 'n'
There is NO replacement.

I will draw for 'r' rounds, one letter in each round.

What is the prob of seeing a letter bag (small letter bag)
which has:
k_a of the letters a's
k_b of the letters b's
...
(assuming that k_a <= n_a; k_b <= n_b; etc...)
assuming that all these k_* sum up to k, with k = r as in the example below

I am hoping to get a combinatoric formula for the prob of seeing the small letter bag

Ex:
Say I have in the big bag: 5 a's; 6 b's; 8 c's
I want to find the prob of seeing 2 a's, 2 b's, 3 c's after randomly drawing 7 letters from the big bag
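
For reference, this example can be worked out exactly in a few lines of Python, assuming the product-of-combinations formula from the first post applies to drawing without replacement with all 7 drawn letters accounted for. The variable names are just for illustration.

```python
from fractions import Fraction
from math import comb

big_bag = {"a": 5, "b": 6, "c": 8}   # contents of the big bag
wanted = {"a": 2, "b": 2, "c": 3}    # the small bag we hope to see

n = sum(big_bag.values())   # 19 letters in total
r = sum(wanted.values())    # 7 letters drawn

# Ways to pick the wanted count of each letter: C(5,2) * C(6,2) * C(8,3)
numerator = 1
for letter, k in wanted.items():
    numerator *= comb(big_bag[letter], k)

# Divide by the number of ways to pick any 7 of the 19 letters: C(19,7)
probability = Fraction(numerator, comb(n, r))
print(probability, float(probability))  # 700/4199, roughly 0.1667
```

That is 10 * 15 * 56 = 8400 favourable selections out of 50388, or about a 1 in 6 chance.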

5. ## Re: Prob of a letter bag from a set of letters

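With replacement this would be the multinomial distribution; without replacement, as in your case, it is the multivariate hypergeometric distribution: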

Multinomial distribution - Wikipedia, the free encyclopedia

Take a look at the derivation to understand the combinatorial results involved.

6. ## Re: Prob of a letter bag from a set of letters

Hmm,
This is interesting

Thanks a lot to 'chiro'
It looks like the problem is definitely a 'Multivariate hypergeometric distribution' since there is no replacement
Hypergeometric distribution - Wikipedia, the free encyclopedia

So my initial naive answer is actually correct (when the k's sum to r), as it matches the Probability Mass Function formula
I guess I underestimated my combinatorial skills too much
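
As a sanity check, a quick simulation in Python can be compared against the formula for the 5 a's / 6 b's / 8 c's example. This is only a sketch: `simulate`, the seed, and the trial count are arbitrary choices, and the estimate will only agree with the exact value to a couple of decimal places.

```python
import random
from collections import Counter
from math import comb

def simulate(pool, target, trials, seed=0):
    """Estimate the chance that len(target) letters drawn without replacement
    from `pool` form exactly the multiset `target`."""
    rng = random.Random(seed)
    letters = list(pool)
    r = len(target)
    target_counts = Counter(target)
    hits = sum(
        Counter(rng.sample(letters, r)) == target_counts
        for _ in range(trials)
    )
    return hits / trials

pool = "a" * 5 + "b" * 6 + "c" * 8
estimate = simulate(pool, "aabbccc", trials=100_000)
exact = comb(5, 2) * comb(6, 2) * comb(8, 3) / comb(19, 7)
print(estimate, exact)
```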