The first step in solving questions of this sort is recognizing which probability distribution you need to use. Each basic probability distribution (binomial, negative binomial, hypergeometric, etc.) is developed by proposing a generalized type of experiment.

For instance, here is how the binomial distribution is developed (this excerpt is from Devore's probability and statistics book):

"There are many experiments that conform either exactly or approximately to the following list of requirements:

1. The experiment consists of a sequence of n smaller experiments called trials, where n is fixed in advance of the experiment.

2. Each trial can result in one of the same two possible outcomes (dichotomous trials), which we generically denote by success (S) and failure (F).

3. The trials are independent, so that the outcome on any particular trial does not influence the outcome on any other trial.

4. The probability of success P(S) is constant from trial to trial; we denote this probability by p."

So, if the experiment you are considering (that is, the particular problem you are working on) fits the experiment outlined for the binomial distribution, then the binomial probability function is the one you'll want to use.
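Once a problem has been matched to the binomial experiment, computing a probability is mechanical. Here is a minimal sketch in Python using the standard binomial formula P(X = k) = C(n, k) p^k (1 - p)^(n - k). The function name `binomial_pmf` and the coin-flip numbers are my own illustration, not something from Devore's book.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k): exactly k successes in n independent trials,
    each with success probability p."""
    if not 0 <= k <= n:
        return 0.0  # impossible outcomes have probability zero
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# Hypothetical example: 10 fair coin flips, probability of exactly 3 heads.
# n is fixed in advance (10), each flip is heads/tails, flips are
# independent, and p = 0.5 on every flip, so all four requirements hold.
print(binomial_pmf(3, 10, 0.5))  # 120 / 1024 = 0.1171875
```

If any of the four requirements fails (for example, n is not fixed in advance), this function is the wrong tool and a different distribution is needed.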

I suggest looking at how each probability distribution is developed, in particular the negative binomial distribution, since each distribution corresponds to a different experiment. All these problems require is recognizing which probability distribution corresponds to the experiment in your problem.
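To make the contrast concrete, here is a rough sketch of the negative binomial case. The experiment differs from the binomial one in that the number of trials is not fixed: you keep going until the r-th success. I am assuming the convention where X counts the failures before the r-th success, which gives P(X = x) = C(x + r - 1, r - 1) p^r (1 - p)^x; some books count total trials instead, so check which one your text uses. The function name and the numbers are hypothetical.

```python
import math

def negative_binomial_pmf(x, r, p):
    """P(X = x): exactly x failures occur before the r-th success,
    with independent trials and constant success probability p.
    Assumed convention: X counts failures, not total trials."""
    if x < 0:
        return 0.0  # a negative number of failures is impossible
    return math.comb(x + r - 1, r - 1) * p**r * (1 - p)**x

# Hypothetical example: flip a fair coin until the 3rd head appears.
# Probability that exactly 2 tails show up before that 3rd head.
print(negative_binomial_pmf(2, 3, 0.5))  # 6 * 0.125 * 0.25 = 0.1875
```

The trials themselves look the same as in the binomial case (two outcomes, independent, constant p); what changes is the stopping rule, and that is what tells you which distribution to pick.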

EDIT: If anyone sees any fallacious ideas in my post, please inform me.