The sum is ultimately over the set of pairs . You can imagine this set as a matrix. It does not matter whether you sum it rows first or columns first; you have to compute for each pair and then add them all together.
In the discrete case, is the summing notation used here interpreted such that I have to sum with regard to the sample space of Y, and then sum the result of that with respect to the sample space of X? Or do I consider both sample spaces and just sum once? The former sounds like a lot of work.
Thanks!
The sum is ultimately over the set of pairs . You can imagine this set as a matrix. It does not matter whether you sum it rows first or columns first; you have to compute for each pair and then add them all together.