ok, so i have 3 sets, each with 4 elements. These 3 sets are subsets of the universal set which is 60 elements.
what i need to figure out is the correct formulae to calculate the percentage of combinations containing 9 elements from the universal that contain at least one element from each of the 3 subsets.
originally i tried to go with C(4,1)*C(4,1)*C(4,1)*C(57,6)/C(60,9) - which was obviously wrong because i got an answer somewhere along the lines of 1.05. Which sucks. because logically i shouldn't get a probability of 1 until C(4,1)*C(4,1)*C(4,1)*C(57,54)/C(60,57). Because in all other cases there is at least one set that has all four of one of the subsets outside the selection.
So i hit the books for a bit and found the hyper geometric distributions algorithm - which the only version i found that deals with anything similar to my problem seems to require the subsets to be equal - every example i've seen is in any case (or at least not overlapping - C(57,6) obviously does overlap with each instance of C(4,1), and that is probably my problem here.) unless operating in the singular. And looks exactly like my algorithm. so nuts. it''s as follows (in case i've got it wrong)
h(x; N, n, k) = [ kCx ] [ N-kCn-x ] / [ NCn ]
N is the total set size eg 60.
n number of objects drawn from the set eg from 60 select 9
l number of possible successes eg 4 for a particular subset
x is how many of selected items must be successes eg 2 of the 5 selected balls must be red.
which is the same algorithm i was using but for the singular, and you just need to calculate for x=0 and subtract the result from 1 to get the probability of at least one member of a particular subset being in the selection. so my problem is i've been looking at this stuff for too long and only keep finding the same stuff i've already tried/figured out, solve easily for the one subset but put all three together and it falls apart. If someone could explain to me how i've gone wrong or perhaps direct me to some further reading material on the subject, that would be grand. At this point i think the solution might be to change the C(4,1)*C(4,1)*C(4,1)*C(57,6)/C(60,9) to C(4,1)*C(4,1)*C(4,1)*C(48,6)/C(60,9) and a recursive algorithm to go through all the permutations of C(4,0<x<5) and then sum the probabilities together. I think i'm wrong and there is a simpler/more elegant solution.
also, sorry for wall of text but if i don't work through the problem progressively like this i tend to end up with a mess that makes no sense at all to anyone but me.


LinkBack URL
About LinkBacks



