1. ## urn question

An urn contains $2n$ balls, of which $r$ are red. The balls are randomly removed in $n$ successive pairs. Let $X$ denote the number of pairs in which both balls are red. Find $E[X]$.

I'm thinking it would be a Bernoulli distributions, therefore:

$P(X=1) = \left( \frac{r}{2n} \right) \left( \frac{r-1}{2n-1} \right)
$

$P(X=2) = \left( \frac{r}{2n} \right) \left( \frac{r-1}{2n-1} \right) + \left( \frac{r-2}{2n-2} \right) \left( \frac{r-3}{2n-3} \right)$

$\vdots$

$P(X=k) =\left( \frac{r}{2n} \right) \left( \frac{r-1}{2n-1} \right) + \left( \frac{r-2}{2n-2} \right) \left( \frac{r-3}{2n-3} \right) + \ ... \ + \left( \frac{r-k+1}{2n-k+1} \right) \left( \frac{r-k}{2n-k} \right)$

I'm not quit sure what to do next...

2. Hello,

Yup I know it's an old thread... ^^

I'd say that your approach is not correct. Because you're considering that you get the "double red" sucessively. While you don't take into account the case :
1st draw : double red.
2nd draw : red & other
3rd draw : red & red

and also, you haven't considered that X=2 means that there are exactly two "double red".

and by the way, X has to stop at $\left\lfloor \frac r2\right\rfloor$

I don't know how to explain it more clearly, but what I can say is that I think (almost sure - with a probability 99% ) your reasoning is not correct.

Two possible approaches I can think of :
- cumulative density function
- conditional expectation

But I don't know if one of them leads somewhere... as it is not that easy to represent the problem But that's a good way of bumping, eh ?

3. Originally Posted by lllll
An urn contains $2n$ balls, of which $r$ are red. The balls are randomly removed in $n$ successive pairs. Let $X$ denote the number of pairs in which both balls are red. Find $E[X]$.
Nice bump, Moo!

It is indeed difficult to find the distribution of $X$: there are correlation between pairs, etc.

But all we need is $E[X]$. The linearity of the expectation does not require r.v. to be independent, hence it should be possible to simplify the problem this way...

We have: $X=Y_1+\cdots+Y_n$ where $Y_i$ is 1 if the i-th pair is "red-red", and 0 else. Hence $E[X]=E[Y_1]+\cdots+E[Y_n]$. And the pairs have the same distribution, even though they're highly correlated, hence $E[X]=nE[Y_1]=nP(\mbox{two balls from a pair taken at random are red})$.

Finally, $E[X]=n\frac{r}{2n}\frac{r-1}{2n-1}=\frac{r(r-1)}{2(2n-1)}$ if $r\geq 2$, and $0$ else.

NB: the linearity of expectation acts like magic here!

4. If I interpret the question correctly, we are removing the balls without replacement. So the pairs do not have the same distribution since the probability of picking a pair of red balls changes with each removal.

5. Originally Posted by cl85
If I interpret the question correctly, we are removing the balls without replacement. So the pairs do not have the same distribution since the probability of picking a pair of red balls changes with each removal.
The balls are indeed removed without replacement, and that's why the pairs are strongly correlated. However, the pairs are identically distributed. This may seem paradoxal, and I have noticed this is a common probabilistic misconception in everyday life.

----
A few months ago, while playing Scrabble with my parents, I came upon the following situation. There were still many tiles (letters) in the bag. My mother had just played, but she forgot to take new tiles from the bag. Then my father played, and took new tiles. My mother interrupted him: "Wait, I should have picked my tiles before you. Please put your new tiles back, so we come back to the previous situation and I can pick my tiles fairly."

I protested, claiming it was the complete opposite: it would not have mattered at all if my mother had taken her tiles after my father or before (provided there are sufficiently many tiles in the bag). While, since my father had seen his new tiles, the distribution was about to be altered if he put them back in the bag. My mother protested: "But there were more tiles before, it can't be the same. For instance there may be no "E" anymore while there could be a few before."

Yet it is the same. And this is exactly the same problem here with the pairs of balls. Btw, my mother's protest matches yours: "There are less red balls at the end"... Right. But only conditionally to the previous pairs!! Otherwise, there is no difference, paradoxically.
----

First, an example. Take the balls one by one from the box. What is the probability that the last ball is red? It seems smaller than $\frac{r}{2n}$... Let's compute it elementarily. It is the probability that among the first $2n-1$ picked balls, there were $r-1$ red balls (and all the others), hence it is:

$\frac{{r\choose r-1}{2n-1\choose 2n-1}}{{2n\choose 2n-1}}=\frac{r}{2n}.$

Exactly like for the first ball.
----

Consider the set $E=\{1,\ldots,n\}$, where $n\ge p+q$. Pick, without replacement, a random subset $A$ of $p$ elements (uniformly), and then a random subset $B$ of $q$ elements.

Claim: the distribution of $(A,B)$ is unchanged if we pick $B$ before $A$.

What about a computational proof to get convinced (of how natural this is)? Let $x_1,\ldots,x_p,y_1,\ldots,y_q$ be distinct elements of $E$. Then:

$P(A=\{x_1,\ldots,x_p\},B=\{y_1,\ldots,y_q\}) = \frac{1}{{n\choose p}}\frac{1}{{n-p\choose q}}=\frac{p!q!(n-p-q)!}{n!}$

is obviously independent of the order of the choices: it is symmetric in $p$ and $q$.

I let you derive a general case with $k$ random disjoint subsets; any order gives the same distribution. In particular, the order of the choices of the pairs in the initial problem is irrelevant for their distribution, hence they have same distribution.

,

,

### An urn contains 2n balls, of which r are red. The balls are randomly removed in

Click on a term to search for related topics.