# Thread: how to determine if a signal in several data streams "tends to occur together"?

1. ## how to determine if a signal in several data streams "tends to occur together"?

I have six strings of data; each string consists of bits obtained from a device run over the same 12-hour period (the six experiments were run on six devices simultaneously). Most of the data are 0's, but occasionally (about 5% of the time) there's a 1. I want to test the hypothesis that the strings are synchronized, such that if a "1" occurs in one string, it's more likely to occur in the same position in the other strings (or at least close, within a couple of positions). Basically the idea is to test whether the positions of 1's in the strings are completely independent of each other, or whether there's a significant correlation among them such that a "1" in one string indicates a higher probability of a 1 in the other strings. What statistical test should I use?

2. ## Re: how to determine if a signal in several data streams "tends to occur together"?

I don't know of a standard test specifically designed for this purpose, but here is an idea that may work for you. For each position, count how many of the six strings have a 1 there, then tally the number of positions with 0 ones, 1 one, 2 ones, ..., 6 ones. If the strings are independent (and each has the same probability of a 1), then the number of ones at a position follows a Binomial(n=6, p) distribution, where p is the fraction of ones. Specifically, if L is the length of the strings and $\displaystyle X_i$ is the number of positions with i ones, then
$\displaystyle E(X_i) = L \binom{6}{i} p^i (1-p)^{6-i}$
You can then use a chi-square goodness-of-fit test to compare the expected counts from this formula with the actual counts from your experiment.
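A minimal sketch of the counting step in Python, assuming the six strings are available as equal-length lists of 0/1 values. The toy data and the function names are hypothetical, and p is estimated here as the pooled fraction of ones across all strings, which is one reasonable choice rather than the only one.

```python
from math import comb

def observed_counts(streams):
    """Return X_0..X_k, where X_i is the number of positions with exactly i ones."""
    k = len(streams)
    counts = [0] * (k + 1)
    # zip(*streams) walks the streams position by position.
    for column in zip(*streams):
        counts[sum(column)] += 1
    return counts

def expected_counts(length, k, p):
    """E(X_i) = L * C(k, i) * p^i * (1 - p)^(k - i) under independence."""
    return [length * comb(k, i) * p**i * (1 - p) ** (k - i) for i in range(k + 1)]

# Hypothetical toy data: 6 streams, 10 positions each.
streams = [
    [0, 1, 0, 0, 0, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

length = len(streams[0])
k = len(streams)
# Assumption: estimate p as the overall fraction of ones in all streams.
p = sum(map(sum, streams)) / (k * length)

observed = observed_counts(streams)
expected = expected_counts(length, k, p)

for i, (o, e) in enumerate(zip(observed, expected)):
    print(f"{i} ones: observed {o}, expected {e:.3f}")
```

With real data the expected counts in the upper categories will usually be tiny, so they would need to be pooled before any chi-square comparison.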

3. ## Re: how to determine if a signal in several data streams "tends to occur together"?

| Number of 1's among the 6 channels | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Observed (in 200 trials) | 0.75 | 0.205 | 0.04 | 0.005 | 0 | 0 | 0 |
| Expected (binom.dist) | 0.735 | 0.232 | 0.031 | 0.002 | 0 | 0 | 0 |

Did I do it correctly? The observed and expected rows look very similar, and a chi-squared test (using only categories 0-3, since 4-6 have zeros in the observed row) returns 0.999726516, i.e. essentially exactly as expected. Does that look right?

thanks!

4. ## Re: how to determine if a signal in several data streams "tends to occur together"?

I think you want to multiply the numbers in your table by 200 to get the observed and expected counts of positions before computing the chi-square statistic; the test deals with counts, not proportions. You should also combine the counts for number of ones = 2, 3, 4, 5, and 6 into a single category (2 or greater) before computing the statistic, since the expected counts are so low in those categories. The usual rule of thumb for a chi-square test is that the expected count in each bin should be at least 5.
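As a rough check of this recipe, here is a sketch in Python that uses the proportions posted above, scales them to counts out of 200, pools the categories for 2 or more ones, and computes the chi-square statistic. The degrees of freedom are an assumption: 2 (three bins minus one) treats p as known, while 1 would apply if p was estimated from the same data. Both p-values use exact closed forms for the chi-square tail, so no statistics library is needed.

```python
from math import erfc, exp, sqrt

n_positions = 200
# Proportions as posted in the thread (rounded to three decimals).
observed_prop = [0.75, 0.205, 0.04, 0.005, 0.0, 0.0, 0.0]
expected_prop = [0.735, 0.232, 0.031, 0.002, 0.0, 0.0, 0.0]

def pool_tail(values, start):
    """Keep values[:start] and merge everything from index `start` on into one bin."""
    return list(values[:start]) + [sum(values[start:])]

# Convert proportions to counts, then pool "2 or more ones" into a single bin.
observed = pool_tail([n_positions * x for x in observed_prop], 2)
expected = pool_tail([n_positions * x for x in expected_prop], 2)

# Pearson chi-square statistic on counts.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Upper-tail probabilities of the chi-square distribution (closed forms):
p_df2 = exp(-chi2 / 2)         # 2 degrees of freedom (p treated as known)
p_df1 = erfc(sqrt(chi2 / 2))   # 1 degree of freedom (p estimated from the data)

print("observed counts:", [round(o, 1) for o in observed])
print("expected counts:", [round(e, 1) for e in expected])
print(f"chi2 = {chi2:.3f}, p (df=2) = {p_df2:.3f}, p (df=1) = {p_df1:.3f}")
```

After pooling, the smallest expected count is 6.6, which satisfies the rule of thumb. The statistic comes out near 1.56, giving a p-value of about 0.46 with 2 degrees of freedom and about 0.21 with 1, so neither choice suggests a significant departure from independence for these numbers.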

5. ## Re: how to determine if a signal in several data streams "tends to occur together"?

Right, makes perfect sense. I did it, and the p-value is lower but still not significant (~0.5 or so). Thank you!!