# Thread: How many times do I need to subsample to sample all?

1. ## How many times do I need to subsample to sample all?

Let's say I have 10 million unique balls and I plan to subsample 1 million of the balls without replacement multiple times. Each time I go to subsample 1 million new balls, however, the balls will be replaced. How many times do I need to subsample to have subsampled all 10 million of the unique balls?

2. ## Re: How many times do I need to subsample to sample all?

Originally Posted by yahoo123
Let's say I have 10 million unique balls and I plan to subsample 1 million of the balls without replacement multiple times. Each time I go to subsample 1 million new balls, however, the balls will be replaced. How many times do I need to subsample to have subsampled all 10 million of the unique balls?
You can't give a definite answer to that. Say you have $n$ balls and select $k$ each time. Think of a particular ball, say the only red ball in the batch. The number of ways to draw a sample that misses the red ball is $\binom {n-1} k$, out of $\binom n k$ possible samples, so the probability of missing it in a single sample is $\frac {\binom {n-1} k}{\binom n k} = \frac {n-k} n$. Because the balls are replaced between samples, the samples are independent, so after $m$ samples the probability of missing the red ball every time is $\left( \frac {\binom {n-1} k}{\binom n k}\right ) ^m = \left( 1 - \frac k n \right) ^m$, which may be very small but is not zero. To look at an example with smaller numbers, say $n=5$ and you sample $k = 1$ ball each time. The probability of missing the red ball in a single sample is $\frac 4 5$. After $m$ trials the probability of not having drawn the red ball is $\left( \frac 4 5\right) ^ m$. That number is never zero, so there is no number of trials that will guarantee success.
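
Although no number of samples guarantees success, the formula above does let you pick a number of samples that makes failure as unlikely as you like. Here is a minimal Python sketch of that idea. The function names are made up for illustration, and the step from "one particular ball" to "any ball at all" uses the union bound $P(\text{some ball missed}) \le n \left(1 - \frac k n\right)^m$, which is an addition to the argument above and gives only an upper bound, not the exact probability.

```python
import math


def miss_probability(n, k, m):
    """Probability that one particular ball is absent from all m samples.

    Uses C(n-1, k) / C(n, k) = (n - k) / n and assumes independent samples.
    """
    return ((n - k) / n) ** m


def any_missed_upper_bound(n, k, m):
    """Union-bound upper limit on P(at least one ball is never sampled)."""
    return min(1.0, n * miss_probability(n, k, m))


def samples_for_bound(n, k, eps):
    """Smallest m for which the union bound n * ((n - k) / n)**m is <= eps."""
    return math.ceil(math.log(n / eps) / -math.log((n - k) / n))


if __name__ == "__main__":
    n, k = 10_000_000, 1_000_000
    m = samples_for_bound(n, k, 0.01)
    print(m, any_missed_upper_bound(n, k, m))
```

With the numbers from the question ($n$ = 10 million, $k$ = 1 million), this gives 197 samples to push the bound below 1%. The true number needed for a 1% failure chance can only be the same or smaller, since the union bound overestimates the failure probability.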

Thank you!