Hey pjk.
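You could test the difference between the proportions in your two samples with a two-sample test of proportions (a two-proportion z-test), run separately for each category. Since that means eight tests, you may also want to correct for multiple comparisons.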
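Here is a minimal sketch of what I mean, using the counts from your table. It assumes a pooled two-proportion z-test with a normal approximation and a simple Bonferroni correction; those are my choices for illustration, not the only reasonable ones, and the function and variable names are made up. The column totals are computed from the counts rather than typed in.

```python
import math

# Counts per genomic category, copied from the table in the question.
subset1 = {"a": 8281, "b": 6323, "c": 1397, "d": 2462,
           "e": 2397, "f": 4120, "g": 12756, "h": 12255}
subset2 = {"a": 31225, "b": 7853, "c": 711, "d": 2205,
           "e": 2351, "f": 317, "g": 2659, "h": 2679}


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test for H0: p1 == p2 (hypothetical helper)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


n1 = sum(subset1.values())
n2 = sum(subset2.values())

# Assumption: Bonferroni correction over the eight per-category tests.
alpha = 0.05 / len(subset1)

results = {}
for cat in subset1:
    z, p = two_proportion_z_test(subset1[cat], n1, subset2[cat], n2)
    results[cat] = (z, p)
    verdict = "differs" if p < alpha else "no significant difference"
    print(f"{cat}: z = {z:8.2f}, p = {p:.3g} ({verdict})")
```

With counts this large, even small differences in proportion will tend to come out significant, so it is worth looking at the size of each difference as well as the p-value.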
Hi
I'm comparing genomic distribution counts of two subsets of a dataset that were composed using different selection criteria.
Both subsets contain x values corresponding to the genomic start positions of probes used on an array to assess the status of specific sequences on the genome.
category: subset1 -- subset2
a: 8281 -- 31225
b: 6323 -- 7853
c: 1397 -- 711
d: 2462 -- 2205
e: 2397 -- 2351
f: 4120 -- 317
g: 12756 -- 2659
h: 12255 -- 2679
total: 49991 -- 50000
The table above lists the number of counts per genomic category (a-h) for the two subsets. Here, subset1 represents a set of start positions of randomly selected probes, and subset2 a set of probes that were selected according to a specific criterion.
I now want to determine whether the difference in counts is significant for each category individually. Basically, I would like to demonstrate statistically that 31225 does, or does not, differ significantly from 8281, given that the latter number is what would be expected for this category when probes are selected at random.
Thanks in advance,
Pieter