Hi, I'm trying to compare the performance of 4 optimisation algorithms on over 30 fitness functions. Due to the nature of the algorithm variations I am interested both in the best performer overall and in pairwise comparisons between algorithms. The data collected is the minimum error at termination, which is either zero or the best value reached after a fixed number of iterations. Plotting histograms of various fitness functions for a particular optimisation algorithm shows quite a few different patterns, so, along with the values being non-negative, I am treating the data as non-normal.

If I do a Mann-Whitney U test I can get the p-value between two algorithms on one fitness function. How can this be extended to look at all the fitness functions? A Mann-Whitney U for each function, then Holm-Bonferroni correction, select the remaining significant results (at the 5% level?), and then count the number of 'wins' for each of the two algorithms? Is there a procedure that can take all the functions into account at once?
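To make the procedure I have in mind concrete, here is a rough sketch in Python (scipy + statsmodels); the error arrays are just made-up placeholders standing in for my per-run terminal errors:

```python
# Sketch of the per-function Mann-Whitney U + Holm procedure I'm describing.
# The data here are random placeholders, not my real results.
import numpy as np
from scipy.stats import mannwhitneyu
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
n_funcs, n_runs = 30, 25
# errors_a[f], errors_b[f]: terminal errors of the two algorithms on function f
errors_a = [rng.exponential(1.0, n_runs) for _ in range(n_funcs)]
errors_b = [rng.exponential(1.5, n_runs) for _ in range(n_funcs)]

# One Mann-Whitney U test per fitness function
p_vals = [mannwhitneyu(a, b, alternative="two-sided").pvalue
          for a, b in zip(errors_a, errors_b)]

# Holm step-down correction across the family of 30 tests
reject, p_adj, _, _ = multipletests(p_vals, alpha=0.05, method="holm")

# Count 'wins': significant functions where an algorithm's median error is lower
wins_a = sum(r and np.median(a) < np.median(b)
             for r, a, b in zip(reject, errors_a, errors_b))
wins_b = sum(r and np.median(b) < np.median(a)
             for r, a, b in zip(reject, errors_a, errors_b))
print(wins_a, wins_b)
```

Is counting wins after Holm correction like this a defensible procedure, or is there something better?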

If I do a Kruskal-Wallis test I can get the p-value for the hypothesis of the same distribution across all algorithms on a particular fitness function, and I see SPSS (one software option I have access to) will then give adjusted pairwise tests (Dunn's test?). Is there any way of (or value in) extending this to look at all the fitness functions in one go? Or should I do the above with 6 pairwise comparisons across the whole set?
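Again to be concrete, this is the Kruskal-Wallis part as I understand it, sketched in Python with placeholder data (one omnibus test per fitness function across the four algorithms; the Dunn post-hoc step would follow only where the omnibus test is significant):

```python
# Sketch: Kruskal-Wallis omnibus test across all four algorithms,
# run separately for each fitness function. Placeholder data only.
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(1)
n_funcs, n_runs, n_algos = 30, 25, 4
# results[f][k]: terminal errors of algorithm k on fitness function f
results = [[rng.exponential(1.0 + 0.2 * k, n_runs) for k in range(n_algos)]
           for _ in range(n_funcs)]

# One Kruskal-Wallis test per fitness function
kw_pvals = [kruskal(*groups).pvalue for groups in results]
n_sig = sum(p < 0.05 for p in kw_pvals)
print(n_sig, "of", n_funcs, "functions show some difference at the 5% level")
```

That still leaves me with 30 separate omnibus results rather than one overall answer, which is really the heart of my question.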

I must admit my stats, although from a maths degree, is a bit rusty, and we didn't cover this exactly. As this is for a journal publication I want to get it right; besides, I just want to get it right anyway and understand what's going on.

On a side note - I'm really struggling to get the test results out of SPSS (do I need to use OMS?) for further manipulation and reformatting. Is it worth switching to something like R?

Thanks for any help you can give,

Joe