I'm studying computer science and hoped someone could help me with a problem.
I want to test different hypotheses against sets of data. Y will always be ranked.
X | Y
1.2 | 1
3.4 | 2
2.9 | 3
X | Y
3.9 | 1
2.4 | 2
9.7 | 3
n is likely to be between 10 and 30 for each set, and Y will have tied ranks. If X is ranked, it may also have tied ranks.
The hypotheses will need to be ordered by their likelihood of being correct. I decided to achieve this by finding the greatest average correlation coefficient.
Ideally I would like to try this in two different ways: one where both X and Y are ranked, and another that takes the distribution of X into account.
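To make this concrete, here is a rough Python sketch of what I have in mind. It is only an illustration under some assumptions: the function names are my own, the two toy tables above are treated as two data sets for a single hypothesis, Kendall's tau-b and Spearman's coefficient (with average ranks for ties) cover the "both ranked" way, and Pearson's coefficient on the raw X values covers the second way.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation (undefined if x or y is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's coefficient: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall's tau-b, the variant of tau that adjusts for ties."""
    concordant = discordant = only_x_tied = only_y_tied = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: not counted anywhere
            if dx == 0:
                only_x_tied += 1
            elif dy == 0:
                only_y_tied += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + only_x_tied) * (untied + only_y_tied))
    return (concordant - discordant) / denom


# The two toy data sets from the tables above, as (X, Y) pairs of lists.
data_sets = [
    ([1.2, 3.4, 2.9], [1, 2, 3]),
    ([3.9, 2.4, 9.7], [1, 2, 3]),
]

# Plain arithmetic mean of each coefficient over the data sets.
averages = {
    "kendall_tau_b": mean(kendall_tau_b(x, y) for x, y in data_sets),
    "spearman": mean(spearman(x, y) for x, y in data_sets),
    "pearson": mean(pearson(x, y) for x, y in data_sets),
}

for name, value in averages.items():
    print(f"{name}: {value:.3f}")
```

The last step, the plain arithmetic mean of the coefficients, is exactly the part I am unsure about.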
The problem, however, is that I don't know how the coefficients are distributed. From looking at the formulas, I would guess that Kendall's tau is linearly distributed and Spearman's rank coefficient is not. So is finding an average meaningful? If not, is there an alternative statistical approach I could take?
The same goes for PMCC (Pearson's product-moment correlation coefficient) if I want to take the distribution of X into account.
I hope all this made sense. Let me know if I've used something out of place or just confused myself (I'm no mathematician).