The probability of any single event can be expressed as the product of the probabilities of all the conditions required for that event, provided those conditions are independent of each other. For example, if you want to roll snake eyes, the probability of doing so is found by multiplying the probabilities of the two required conditions: rolling a one on the first die (P=1/6) and rolling a one on the second die (P=1/6). This product is (1/6)(1/6) = 1/36. So, the probability of rolling snake eyes is 1/36.
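
As a quick sanity check on the snake eyes arithmetic, here is a minimal Python sketch. It assumes two fair six-sided dice; the variable names are only illustrative. It multiplies the two probabilities and then cross-checks the result by counting all the outcomes.

```python
from fractions import Fraction
from itertools import product

# Multiply the probabilities of the two required conditions.
p_one = Fraction(1, 6)  # assumption: a fair six-sided die
p_snake_eyes = p_one * p_one

# Cross-check by counting every outcome of two dice.
outcomes = list(product(range(1, 7), repeat=2))
favourable = [o for o in outcomes if o == (1, 1)]
p_counted = Fraction(len(favourable), len(outcomes))

print(p_snake_eyes, p_counted)
```

Both numbers should agree, since only one of the equally likely outcomes is a pair of ones.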

The same sort of thing goes on here. But remember that this is a series, and so we're not looking for the total probability yet, but only the probability for each trial.

Let's say that r is the stopping number. In other words, you only start considering secretaries after you've interviewed at least r-1 secretaries. Let's also say that j-1 is the number of secretaries you've considered so far. That makes each secretary you interview the jth secretary.

Now that we've assigned the variables, let's ask ourselves, what is the probability that the secretary I'm interviewing, the jth secretary, is the most qualified? Well, there's only one best out of the n secretaries, so the probability of some random secretary being the best is:

1 / n

That's our first condition. The other condition which must be satisfied is that we even get to consider this jth secretary. Remember, once we're past the first r-1 secretaries, we stop at the first one who is the best so far. For us to have moved on to the jth secretary means that the best of the j-1 secretaries seen so far was one of those first r-1; otherwise we would already have stopped and wouldn't be considering the jth secretary. And the probability that the best so far was one of the first r-1 is:

(r-1) / (j-1)
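
The claim that this second probability equals (r-1) / (j-1) can be checked by brute force for a small case. The sketch below is only an illustration: the function name is made up, it assumes every interview order is equally likely, and it needs 2 <= r <= j <= n. It counts the orderings in which the best of the first j-1 candidates sits among the first r-1.

```python
from fractions import Fraction
from itertools import permutations

def prob_best_so_far_in_first(n, r, j):
    """Fraction of orderings where the best of the first j-1 candidates
    is among the first r-1 (assumes 2 <= r <= j <= n)."""
    hits = total = 0
    for order in permutations(range(n)):  # higher number = better candidate
        seen = order[: j - 1]             # the j-1 candidates interviewed so far
        total += 1
        if seen.index(max(seen)) < r - 1:
            hits += 1
    return Fraction(hits, total)

print(prob_best_so_far_in_first(6, 3, 5), Fraction(3 - 1, 5 - 1))
```

The enumeration visits all n! orderings, so it is only practical for small n.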

So, since these are the only two conditions which need be met, we know that the probability of the jth secretary being the best, and of us picking them, is:

(1/n) * ((r-1) / (j-1))

Next we add up the probabilities for each trial. This is done by converting to a series:

P(r) = ∑ (1/n) * ((r-1) / (j-1))

from j=r to n
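
To make the series concrete, here is a small Python sketch that adds up (1/n) * ((r-1) / (j-1)) for j = r to n using exact fractions. The function name and the choice of n = 10 are assumptions for illustration, and the formula only makes sense for r >= 2, since r = 1 would put a zero in the denominator.

```python
from fractions import Fraction

def success_probability(r, n):
    """P(r) = sum over j = r..n of (1/n) * (r-1)/(j-1), for 2 <= r <= n."""
    return sum(Fraction(1, n) * Fraction(r - 1, j - 1) for j in range(r, n + 1))

n = 10  # illustrative pool size
# Try every stopping number and keep the one with the highest P(r).
best_r = max(range(2, n + 1), key=lambda r: success_probability(r, n))
print(best_r, float(success_probability(best_r, n)))
```

Trying every value of r like this is a brute-force way to see which stopping number does best for a given n.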

I know I didn't explain that last step very well. Quite frankly, I'm not sure how to put it into words. Maybe Mr. Fantastic can help out with an explanation. Hopefully you can just see how it works, though.

This becomes

P(r) = ((r-1) / n) * ∑ (1 / (j-1))

from j=r to n

This can be done because r and n are both constants, and we can pull out constants. j is a variable, however, so we must keep any term which includes a j inside the sigma summation.

And then

P(r) = x * ∫ (1/t) * dt

integrating t from x to 1.

I have no clue about that one. Mr. Fantastic?
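
One way to get a feel for this last step is to compare the sum with the integral numerically. The sketch below rests on an assumption not spelled out above: that x stands for r/n, the fraction of secretaries skipped, and that n is large. Under that reading the integral x * ∫ (1/t) dt from x to 1 works out to -x * ln(x). The function names are illustrative.

```python
import math

def success_probability(r, n):
    """Exact sum: ((r-1)/n) * sum over j = r..n of 1/(j-1)."""
    return (r - 1) / n * sum(1 / (j - 1) for j in range(r, n + 1))

def integral_approximation(x):
    """x * integral of (1/t) dt from x to 1, which evaluates to -x * ln(x)."""
    return -x * math.log(x)

n = 10_000  # large n, so the sum should sit close to the integral
for r in (1_000, 3_679, 5_000):
    x = r / n  # assumption: x is the fraction of candidates skipped
    print(r, success_probability(r, n), integral_approximation(x))
```

The two columns should agree to several decimal places, which suggests the sum is acting like a Riemann sum for the integral with t playing the role of (j-1)/n.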