# Thread: Efficient solution to "risky drivers" problem

1. ## Efficient solution to "risky drivers" problem

I have a solution to a practice problem that took me a long time, and it doesn't really match any of the multiple-choice answers.

50% of drivers are low-risk, 30% are moderate-risk, and 20% are high-risk. For any given group of 4 drivers, what are the odds that the number of high-risk drivers is at least 2 greater than the number of low-risk drivers? (I called this outcome "success".)

What is the fastest way to do this? My approaches seem tedious and make me suspect I'm missing something.

At first I thought I should use means and standard deviations and overlay bell curves, but then I thought that was going to give me a lot of fractional drivers, which don't make any sense.

Then I decided I'd use the binomial probability function. First, I calculated the probability of each of the five possible numbers of high-risk drivers. I found:
PH(0) =0.410 (failure)
PH(1) =0.410 (failure)
PH(2) =0.154 (see below)
PH(3) =0.026 (success)
PH(4) =0.002 (success)
After each number, I noted whether that outcome fulfilled the condition.
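The five values above can be reproduced with a few lines of Python. This is only a sketch, assuming the four drivers are independent and each is high-risk with probability 0.2; the names `binomial_pmf` and `high_pmf` are made up for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n_drivers = 4
p_high = 0.2  # share of high-risk drivers from the problem statement

# P(H = k) for k = 0..4 high-risk drivers in the group
high_pmf = [binomial_pmf(k, n_drivers, p_high) for k in range(n_drivers + 1)]
for k, prob in enumerate(high_pmf):
    print(f"P(H={k}) = {prob:.4f}")
```

To four decimals the values are 0.4096, 0.4096, 0.1536, 0.0256 and 0.0016, which match the rounded figures listed above.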

For the 2 high-risk scenario, I calculated a 0.625 chance that each of the remaining drivers would be low-risk. (This is intuitive based on the 3 to 5 ratio of moderate- to low-risk drivers among those who are not high-risk, but I'm not sure I have the math proof down.)
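A short justification for the 0.625, assuming the drivers are independent, so that knowing a particular driver is not high-risk only rescales that driver's own probabilities:

$P(L \mid \text{not } H) = \frac{P(L)}{P(L) + P(M)} = \frac{0.5}{0.5 + 0.3} = 0.625$

By the same reasoning, $P(M \mid \text{not } H) = 0.3 / 0.8 = 0.375$, and the two remaining drivers can be treated as independent draws with these probabilities.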

(I know the below notation is probably not right, but maybe you can make sense of it.)
P(L=2|H=2) =0.391 (failure)
P(M=2|H=2) =0.141 (success)
P(L=1 and M=1|H=2) =0.469 (failure)

I added the probabilities of my three success conditions and found:
0.026 + 0.002 + (0.154 × 0.141) = 0.050
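The same sum can be checked without intermediate rounding. This is a sketch of the conditioning approach described above, assuming independent drivers and reading "success" as at least 2 more high-risk than low-risk drivers; the helper `binom` and the variable names are illustrative only.

```python
from math import comb

p_low, p_mod, p_high = 0.5, 0.3, 0.2
n = 4

def binom(k, n, p):
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Given a driver is not high-risk, the chance that driver is moderate-risk
p_mod_given_not_high = p_mod / (p_low + p_mod)  # 0.375

# H = 2 succeeds only if both remaining drivers are moderate-risk
p_h2_success = binom(2, n, p_high) * p_mod_given_not_high**2
# H = 3 and H = 4 always succeed
p_h3 = binom(3, n, p_high)
p_h4 = binom(4, n, p_high)

total = p_h2_success + p_h3 + p_h4
print(f"{total:.4f}")  # prints 0.0488
```

With unrounded inputs the total is 0.0488, so the 0.050 above differs only through rounding of the intermediate values.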

I'm doubting this is the correct answer, because it is only moderately close to one of the choices provided (0.06). (Let me know if I need to show more work; I am pretty sure I got the binomial probability distribution function correct, but I may have gotten mixed up in one of the exponents.)

2. In reply to Boris B:
Use the Multinomial distribution

RonL

3. Wikipedia's formula on this looks pretty much like that in my book:

$P(X_1 = x_1, \ldots, X_k = x_k) = \frac{n!}{x_1! \cdots x_k!} \, p_1^{x_1} \cdots p_k^{x_k}$

but I'm still not getting it. I think I've just forgotten what the ellipses mean. The ellipses with commas around them (left side of equal sign) seem to mean "this formula is correct for x1 when you plug that in, and correct for all other values of x up to xk when you plug in those values".

But it looks like for any value of x, all the values of x get plugged in. I think this is true because the ellipses on the right side of the equal sign don't have commas, meaning "multiply all of the values of x together" and "multiply all the values of {p raised to the appropriate power of x} together". This yields an equal probability for each value of x, which is obviously wrong.

Going from one value of x to another, what changes on the right side of the equal sign?

4. What I did on paper (before the above post) was assume that commas were missing around the ellipses on the left side of the equal sign. Thus ...

Trying to calculate the probability that the number of high-risk drivers equals 3, I used:
$(n! / x) p^x = (24/3) (0.2^3) = 0.0640$

I was calculating that as one of the possible ways for the success condition to be fulfilled (i.e., if there are 3 high-risk drivers in n=4, there will always be at least 2 more high- than low-risk drivers). But I'm pretty sure 6.4% is too high. Anyway, I just included that on the off chance I was right about the commas.
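For comparison, here is the standard binomial calculation for exactly 3 high-risk drivers out of 4, assuming independent drivers with a 0.2 chance each of being high-risk:

$P(H = 3) = \frac{4!}{3! \, 1!} (0.2)^3 (0.8)^1 = 4 \times 0.008 \times 0.8 = 0.0256$

This agrees with the 0.026 from the first post. The 0.0640 figure comes out too high because it divides by x rather than by the factorials x!(n-x)!, and it leaves out the factor for the one driver who is not high-risk.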

5. Okay, I figured this out. I didn't realize p(0,0,4) was the probability of a complete outcome, corresponding to "no low-, no medium-, and four high-risk drivers". So you're not supposed to do the calculation once for each time there is something between the commas on the right-hand side! Much more efficient than I had thought.
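A sketch of that reading of the formula, where each call evaluates one complete outcome (low, moderate, high) at once. It assumes independent drivers and takes "success" to mean at least 2 more high-risk than low-risk drivers; `multinomial_pmf` is an illustrative helper, not a library function.

```python
from math import factorial

def multinomial_pmf(counts, probs):
    """Probability of one complete outcome, e.g. counts = (low, mod, high)."""
    n = sum(counts)
    coef = factorial(n)
    for c in counts:
        coef //= factorial(c)  # n! / (x_1! * ... * x_k!)
    prob = float(coef)
    for c, p in zip(counts, probs):
        prob *= p**c
    return prob

probs = (0.5, 0.3, 0.2)  # low, moderate, high
n = 4

total = 0.0
for low in range(n + 1):
    for mod in range(n + 1 - low):
        high = n - low - mod
        if high >= low + 2:  # success condition
            pr = multinomial_pmf((low, mod, high), probs)
            print(f"p({low},{mod},{high}) = {pr:.4f}")
            total += pr

print(f"total = {total:.4f}")  # prints total = 0.0488
```

Only four outcomes satisfy the condition: (0,0,4), (0,1,3), (0,2,2) and (1,0,3), with probabilities 0.0016, 0.0096, 0.0216 and 0.0160.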