# Thread: Require assistance with a distribution problem

1. ## Require assistance with a distribution problem

Hi all. I have stumbled across this website by simply googling maths forums - boy am I glad this site is here.

I have a recurring problem with a series of projects I am working on, and this problem is, unfortunately for me, probability related. I studied data mining, PDEs, and calculus, and only completed the very basic level stats subjects at Uni, so I'm lost when probability questions get more than a little complex.

The type of problem I have is this:
Code:

There are p holes
There are p balls, q of which are red, r of which aren't red
Of the p holes, s of those holes are 'special'
Of all the possible distributions of the p balls amongst the p holes, what is the chance that n of the 'special' holes will hold red balls?
If that's too much to wrap your head around, here's a numerical example:
Code:

30 holes
10 red balls, 20 white balls
5 holes are 'special'
Of all possible distributions, what are the chances that:
0 special holes will contain red balls
Only 1 special hole will contain a red ball
Only 2 special holes will contain red balls
.....
All 5 special holes will contain red balls
I'm totally at a loss with this sort of problem...

2. ## Probabilities and combinations

First, some assumptions: I assume all 30 holes must have one and only one ball in them, and that there's no bias about which holes are special (in other words, a special hole has the same chance of having a red ball as a non-special hole).

Next, we need to know the probability of a red ball appearing in a specific hole. It is 10 / 30 = 1/3. The probability of a white ball appearing in a hole is therefore 2/3, since every hole must have one and only one ball in it. In what follows I treat each hole as independent of the others, which is only approximately true because the total number of red balls is fixed at 10.

Now let's assume we select five holes at random as being "special." The chance of having no red balls in any of the holes is the same as the chance of having a white ball in every one of them, which is:

2/3 * 2/3 * 2/3 * 2/3 * 2/3 = 13.2%.

Note that there's only one way for this to happen.

What about 1 red (R) ball and 4 white (W) balls, which I'll denote as RWWWW? (This means the first hole has a red ball, the next four have white balls). Using the probabilities above, we get the following:

1/3 * 2/3 * 2/3 * 2/3 * 2/3 = 6.6%.

Note, however, there are five ways we could get one red ball and four white balls: RWWWW, WRWWW, WWRWW, WWWRW and WWWWR. So the probability of exactly one red ball showing up in five randomly-selected holes is 6.58% * 5 = 32.9%.

It gets a little more complicated with 2 red balls and 3 white. First, the probability of RRWWW:

1/3 * 1/3 * 2/3 * 2/3 * 2/3 = 3.3%.

But how many ways can we get 2 red balls and 3 white balls in five holes? This is calculated using "combinations." The formula for combinations is N! / [ K! * (N - K)! ], where N (in this case) is the number of special holes, K is the number of red balls we want to see, and ! denotes factorial (for example, 5! = 5*4*3*2*1). So in this case, we have 5! / [2! * 3! ] = 10 ways we can get 2 red balls and 3 white balls into 5 holes. And 10 * 3.29% = 32.9%. So there's approximately a 1-in-3 chance of randomly selecting 5 holes and finding exactly 2 red balls in them.
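As a quick check on that count of 10, here is a minimal Python sketch (not part of the original post). It evaluates the combinations formula directly, compares it with the standard library's `math.comb`, and lists the arrangements by brute force. The helper name `arrangements` is made up for illustration.

```python
from itertools import combinations
from math import comb, factorial


def arrangements(holes, reds):
    """List every R/W pattern with `reds` red balls in `holes` holes."""
    patterns = []
    for red_positions in combinations(range(holes), reds):
        pattern = "".join("R" if i in red_positions else "W" for i in range(holes))
        patterns.append(pattern)
    return patterns


N, K = 5, 2
# N! / [K! * (N - K)!], written out by hand
by_formula = factorial(N) // (factorial(K) * factorial(N - K))
patterns = arrangements(N, K)

print(by_formula, comb(N, K), len(patterns))
print(patterns)
```

All three counts should agree, and the printed list shows each way of placing 2 red balls among the 5 holes.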

Similarly:
Probability of RRRWW = 1/3 * 1/3 * 1/3 * 2/3 * 2/3 = 1.65%.
Number of ways to get 3 red and 2 white = 5! / [3! * 2! ] = 10
Probability of 3 red and 2 white in 5 holes = 1.65% * 10 = 16.5%

4 Red and 1 white:
Probability of RRRRW = 1/3 * 1/3 * 1/3 * 1/3 * 2/3 = 0.82%.
Number of ways to get 4 red and 1 white = 5! / [4! * 1! ] = 5
Probability of 4 red and 1 white in 5 holes = 0.82% * 5 = 4.1%

5 Red and 0 white:
Probability of RRRRR = 1/3 * 1/3 * 1/3 * 1/3 * 1/3 = 0.4%.
Number of ways to get 5 red and 0 white = 5! / [5! * 0! ] = 1
Probability of 5 red and 0 white in 5 holes = 0.4% * 1 = 0.4%
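
The case-by-case working above can be reproduced in a few lines. This is a rough Python sketch, assuming (as in the working) that each of the 5 holes independently has a 1/3 chance of holding a red ball; the function name `binomial_pmf` is just an illustrative choice. Exact fractions are used so rounding doesn't creep in.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n_special = 5
p_red = Fraction(10, 30)  # chance of red in any one hole

table = {k: binomial_pmf(k, n_special, p_red) for k in range(n_special + 1)}
for k, prob in table.items():
    print(f"{k} red: {prob} = {float(prob):.1%}")

total = sum(table.values())
print("total:", total)
```

Each probability comes out as a fraction over 243 (that is, 3^5), which is where figures such as 32/243 = 13.2% come from.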

Note that the above encompasses all possibilities, so the sum of the probabilities on all of the above options =
13.2% + 32.9% + 32.9% + 16.5% + 4.1% + 0.4% = 100.0%
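
One caveat worth flagging: because there are exactly 10 red balls and each hole gets one ball, the holes are not strictly independent, so the figures above are an approximation. A different technique, the hypergeometric distribution, counts the placements exactly. The sketch below is one possible way to compute it in Python; the function name `hypergeom_pmf` and its parameter names are my own, not from the thread.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(n, total, red, special):
    """P(exactly n of the special holes hold red balls), one ball per hole."""
    white = total - red
    # Choose n of the red balls and (special - n) of the white balls
    # to fill the special holes, out of all ways to pick `special` balls.
    return Fraction(comb(red, n) * comb(white, special - n), comb(total, special))


exact = {n: hypergeom_pmf(n, total=30, red=10, special=5) for n in range(6)}
for n, prob in exact.items():
    print(f"{n} red: {float(prob):.1%}")
```

For the 30-hole example this gives roughly 10.9%, 34.0%, 36.0%, 16.0%, 2.9% and 0.2% for 0 through 5 red balls, so the independent-hole figures are close but slightly overstate the extreme cases.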

I hope this helps.

- Steve J

3. N! / [ K! * (N - K)! ]
That's the part I was neglecting. Thanks Steve. As I said, this branch of maths isn't my strong suit, and something like that is easily overlooked when you've only done it a handful of times.

Give me data mining software and empirical data any day, hehe. Thanks again.
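
For anyone who prefers the empirical route, the numerical example can also be checked by simulation. The following is only a sketch in Python: it assumes the special holes are holes 0 to 4 (which holes are special makes no difference if placement is random), uses a fixed seed so runs are repeatable, and the function name `simulate` is hypothetical.

```python
import random


def simulate(trials, total=30, red=10, special=5, seed=0):
    """Estimate P(n special holes hold red balls) by random placement."""
    rng = random.Random(seed)
    counts = [0] * (special + 1)
    for _ in range(trials):
        # Pick which holes receive the red balls; holes 0..special-1 are "special".
        red_holes = rng.sample(range(total), red)
        hits = sum(1 for hole in red_holes if hole < special)
        counts[hits] += 1
    return [c / trials for c in counts]


estimates = simulate(20_000)
for n, est in enumerate(estimates):
    print(f"{n} red: {est:.1%}")
```

With enough trials the estimates should settle near the exact counting answer, and they give a sanity check that doesn't rely on any formula.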