Hey CSHowe.
Your expansion should have four terms, as you pointed out: each of the two summations runs independently over two index values, so there are 2 × 2 = 4 terms in total.
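To make the expansion concrete, here is a minimal Python sketch of the four terms, using the counts from your table. The variable names (`counts`, `p_w`, `p_d`, `terms`, `info_gain`) are my own, and I am assuming that row Y corresponds to D=1 and row N to D=0, so treat it as an illustration rather than a definitive implementation.

```python
from math import log2

# Counts from the 2x2 table, keyed as (w, D).
# Assumption: row Y is D=1 (disease present), row N is D=0.
counts = {(1, 1): 3, (0, 1): 92, (1, 0): 8, (0, 0): 743}
total = sum(counts.values())

# Marginal probabilities P(w=k) and P(D=j), obtained by summing the joint counts.
p_w = {k: sum(c for (w, d), c in counts.items() if w == k) / total for k in (0, 1)}
p_d = {j: sum(c for (w, d), c in counts.items() if d == j) / total for j in (0, 1)}

# One term per cell of the table: P(w=k,D=j) * log2(P(w=k,D=j) / (P(w=k) P(D=j))).
terms = {}
for (k, j), c in counts.items():
    p_joint = c / total
    terms[(k, j)] = p_joint * log2(p_joint / (p_w[k] * p_d[j]))

# The statistic is the sum of the four terms.
info_gain = sum(terms.values())

for (k, j), t in terms.items():
    print(f"w={k}, D={j}: {t:+.6f}")
print(f"I(w,D) = {info_gain:.6f}")
```

Individual terms can be negative (when a cell is less likely than independence would predict), but the sum of all four cannot be. Note that a cell with a zero count would need special handling, since log2(0) is undefined; by convention such a term is taken to contribute 0.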
Hi all,
This is probably a somewhat elementary question, but I've been working on it for some time without any luck.
I have a 2x2 contingency table:
      1     0
Y     3    92
N     8   743
The columns 1 and 0 refer to a word (w=1 means word present; w=0 means word absent). The rows Y and N refer to a disease (Y, i.e. D=1, means disease present; N, i.e. D=0, means disease absent).
I need to calculate the information gain using the statistic (sorry for poor formatting):
$$I(w,D) = \sum_{j \in \{0,1\}} \sum_{k \in \{0,1\}} P(w=k, D=j) \log_2 \frac{P(w=k, D=j)}{P(w=k)\,P(D=j)}$$
The probability calculations are straightforward, but I am unsure how the statistic should look when expanded out. Should there be four separate terms, one for each combination: (w=1,D=1) + (w=1,D=0) + (w=0,D=1) + (w=0,D=0)?
Any assistance gratefully received!