Do you have the data for all the classes? It seems to me that your first question can be answered just by looking at all the data.
OK, this may be a little tricky to explain so bear with me.
Imagine a 'class' that contains samples. Each sample may, or may not, be related to other samples in different classes.
I have a number of bar charts, one for each 'class', that show how many samples are related to other samples in a different class.
Data looks like this
Class 1 data:
and so on
What the above tells you is that for class 1, there are 345 samples that are not related to any samples in a different class, 34 samples that are related to 2 samples in a different class and so on ...
I need to find a method that tells me which class has samples that are related to the greatest number of samples from other classes. I would also like to get a general feeling, for each graph, for how the number of related samples is distributed.
I considered standard deviation but have a feeling that it won't tell me much.
Thanks a lot for any help
Hi, melysion.
For the method, you could compare the mean number of related samples. For example, suppose this is the data.

Code:
Class 1         Class 2
0       50      0      100
1       25      1       60
2       15      2       20
3       10      3       20
N      100      N      200
Sum     85      Sum    160
Mean   .85      Mean   .80

Class 1 has 100 samples which are related to a total of 85 samples from other classes. This is a mean of .85 related samples per sample. Class 2 has more samples and related samples, but the mean of .80 related samples per sample is lower. So Class 1 is on average related to a greater number of samples from other classes.
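If it helps, the arithmetic behind those means can be sketched in a few lines of Python. This is only an illustration using the made-up example counts; the dictionary layout and the function name `mean_related` are my own assumptions, not anything from your data.

```python
# Example frequency tables (made-up numbers from the example):
# key = number of related samples, value = how many samples have that many.
class_1 = {0: 50, 1: 25, 2: 15, 3: 10}
class_2 = {0: 100, 1: 60, 2: 20, 3: 20}


def mean_related(freq):
    """Mean number of related samples per sample for one class (hypothetical helper)."""
    n = sum(freq.values())                               # N: samples in the class
    total = sum(k * count for k, count in freq.items())  # Sum: total related samples
    return total / n


means = {"Class 1": mean_related(class_1), "Class 2": mean_related(class_2)}
for name, m in means.items():
    print(f"{name}: mean = {m:.2f}")

# The class whose samples are, on average, related to the most other samples.
best = max(means, key=means.get)
print("Highest mean:", best)
```

With your real data you would replace the two dictionaries with the counts read off each bar chart.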
To get a general feeling for the distribution, you could combine all the samples and create a bar chart for that.
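As a rough sketch of that combining step, assuming each class is stored as a dictionary of counts like the made-up example numbers (the variable names and the text-only bar chart are just for illustration):

```python
from collections import Counter

# Example frequency tables (made-up numbers from the example):
# key = number of related samples, value = how many samples have that many.
class_1 = {0: 50, 1: 25, 2: 15, 3: 10}
class_2 = {0: 100, 1: 60, 2: 20, 3: 20}

# Add the counts bar by bar to pool all samples into one distribution.
combined = Counter(class_1) + Counter(class_2)
total = sum(combined.values())

# Crude text bar chart: one '#' per 5 samples, plus the share of all samples.
for k in sorted(combined):
    count = combined[k]
    print(f"{k} related: {'#' * (count // 5)} {count} ({count / total:.0%})")
```

Printing the share next to each bar makes it easier to compare the pooled chart against an individual class, since the classes have different numbers of samples.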