# Expected number of collisions

• Dec 1st 2009, 04:23 AM
MJ0
Expected number of collisions
I posted this thread in the 'pre-university probability and statistics'-section and so far there have not been any replies. Probably I should have posted it it in the advanced section in the first place.

I have had a hard time trying to solve the following problem. I would be pleased if anyone can help me on this (this problem's description actually originates from a hashing problem, known in computer science, which I have tried to simplify by describing a more general understandable situation):

There are m open boxes, each of which can contain at most one ball. There is a ball throwing machine that fires a total of n balls towards the boxes, one by one. For each box, the probability that when the machine throws a ball, the ball falls onto that specific box, is 1/m (equal for each box). The thrown ball will always fall onto one of the boxes. However, the probability that the ball will actually fall into any box is less than 1 after the first throw (provided that n > 1), because a box which is already filled with a ball (from the previous throw(s)), will prevent the thrown ball from falling into that box. Call this a collision. Assume that a thrown ball, which collides with one of the balls in a box, simply disappears.

I need a formula with which I can calculate the expected number of collisions, given m boxes and n balls.

• Dec 1st 2009, 05:00 AM
Laurent
Quote:

Originally Posted by MJ0
I posted this thread in the 'pre-university probability and statistics'-section and so far there have not been any replies. Probably I should have posted it it in the advanced section in the first place.

I have had a hard time trying to solve the following problem. I would be pleased if anyone can help me on this (this problem's description actually originates from a hashing problem, known in computer science, which I have tried to simplify by describing a more general understandable situation):

There are m open boxes, each of which can contain at most one ball. There is a ball throwing machine that fires a total of n balls towards the boxes, one by one. For each box, the probability that when the machine throws a ball, the ball falls onto that specific box, is 1/m (equal for each box). The thrown ball will always fall onto one of the boxes. However, the probability that the ball will actually fall into any box is less than 1 after the first throw (provided that n > 1), because a box which is already filled with a ball (from the previous throw(s)), will prevent the thrown ball from falling into that box. Call this a collision. Assume that a thrown ball, which collides with one of the balls in a box, simply disappears.

I need a formula with which I can calculate the expected number of collisions, given m boxes and n balls.

Let $\displaystyle N$ be the (random) number of collisions.
Since the $\displaystyle m$ boxes are identical in this problem, we have:

$\displaystyle E[N]=mP(\mbox{there is a collision in the first box}).$

And a collision happens in the first box iff at least two balls fall onto it, iff neither 0 nor 1 ball exactly fall onto it.

0 ball in the first box happens with probability $\displaystyle \left(\frac{m-1}{m}\right)^n$
1 ball in the first box happens with probability $\displaystyle n\left(\frac{m-1}{m}\right)^{n-1}$
Thus,

$\displaystyle P(\mbox{there is a collision in the first box})=1-\left(\frac{m-1}{m}\right)^n-n\left(\frac{m-1}{m}\right)^{n-1}.$

and finally

$\displaystyle E[N]=m\left(1-\left(\frac{m-1}{m}\right)^n-n\left(\frac{m-1}{m}\right)^{n-1}\right).$
• Dec 1st 2009, 06:37 AM
MJ0
Quote:

Originally Posted by Laurent
1 ball in the first box happens with probability $\displaystyle n\left(\frac{m-1}{m}\right)^{n-1}$

The n-multiplier allows the probabilty to be more than 1(?) Correct me if I'm wrong, but that should not be possible. Also this formula does not work on an example I worked out with 2 boxes and 2 balls.

If I denote box number x with [x] and ball number x, or the x-th ball thrown, with (x), I have the following set of combinations with 2 boxes and 2 balls:

1. [1](1) [2](2)
2. [1](2) [2](1)
3. [1](1)(2) [2]
4. [1] [2](1)(2)

Note that, for example the combination [1](1)(2), means that the first ball thrown (1) followed by the second ball thrown (2) both land on box [1] (actually, (2) disappears because of the collision, but this does not matter here, as just the combinations are examined).

In this case there are 4 combinations, two of which exactly have 1 ball landing in box 1 (combination 1 and 2).

Your formula (created for the situation when looking at a specific box, exactly one ball will fall into that box) is supposed to give 2/4 (= 1/2) here. Instead your formula gives 2*(1/2)^1 = 1.

If you work out an example with 3 boxes and 3 balls, 12/24 combinations (also 1/2) will have exactly 1 of the balls landing in [1]. I don't know how to derive a good formula for this. Maybe the rest of your solution also has to be modified, as in light of the examples I gave you may have a different view about the problem description now. As I said, I may not have given a fully clear view of the problem in the first post.
• Dec 1st 2009, 07:42 AM
Laurent
Quote:

Originally Posted by MJ0
The n-multiplier allows the probabilty to be more than 1(?) Correct me if I'm wrong, but that should not be possible. Also this formula does not work on an example I worked out with 2 boxes and 2 balls.

Indeed, it should be $\displaystyle n\frac{1}{m}\left(\frac{m-1}{m}\right)^{n-1}$ (n choices for the one ball to fall into box 1, and probability $\displaystyle \frac{1}{m}\left(\frac{m-1}{m}\right)^{n-1}$ for the first ball to be in box 1 and the others in other boxes) Does it look better?