# Thread: How to tell whether distributions are correlated?

1. ## How to tell whether distributions are correlated?

Hi Experts,

I am analyzing real data using fast Fourier transforms (FFT) in Matlab. The FFT magnitude spectrum shows a background noise floor with several sharp spurs rising well above it. I need to figure out conclusively which of these spurs are correlated with which other spurs (if any).

To simplify the problem, let me analyze just two spurs to see whether they are correlated. Let me call them spur1 and spur2. I process the data to obtain three probability density functions (PDFs):

1) I isolate spur1 and do an inverse FFT only on spur1 to obtain its time-domain waveform (a sinusoid of a certain frequency). I take the PDF of this waveform (PDFspur1).

2) I isolate spur2 and do an inverse FFT only on spur2 to obtain its time-domain waveform (a sinusoid of a different frequency). I take the PDF of this waveform (PDFspur2).

3) I isolate both spur1 and spur2 from the rest of the spectrum and take an inverse FFT on the spectrum containing both spur1 and spur2. This results in a real-time waveform whose PDF I'll call PDFspur12.
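The three steps above can be sketched roughly as follows. This is only an illustration in plain Python on a synthetic signal, not the original Matlab processing: the sample count, the spur bins (`BIN1`, `BIN2`), the amplitudes, the noise level, and the helper names are all assumptions, and a naive DFT stands in for Matlab's `fft`/`ifft` to keep the example dependency-free.

```python
import cmath
import math
import random

N = 256              # number of samples (hypothetical)
BIN1, BIN2 = 10, 37  # spectrum bins of spur1 and spur2 (hypothetical)

def dft(x):
    """Naive O(N^2) discrete Fourier transform of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def inverse_dft_real(spectrum):
    """Naive inverse DFT, returning only the real part."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

def isolate(spectrum, bins):
    """Zero every bin except the requested ones and their mirror images."""
    n = len(spectrum)
    keep = {k % n for k in bins} | {(-k) % n for k in bins}
    return [c if k in keep else 0j for k, c in enumerate(spectrum)]

def histogram_pdf(samples, lo, hi, nbins):
    """Histogram estimate of the PDF on [lo, hi] (integrates to 1)."""
    width = (hi - lo) / nbins
    counts = [0] * nbins
    for s in samples:
        i = int((s - lo) / width)
        counts[min(max(i, 0), nbins - 1)] += 1
    return [c / (len(samples) * width) for c in counts]

# Synthetic stand-in for the measured data: two sinusoids plus noise.
rng = random.Random(0)
signal = [1.0 * math.sin(2 * math.pi * BIN1 * t / N)
          + 0.5 * math.sin(2 * math.pi * BIN2 * t / N + 0.7)
          + rng.gauss(0.0, 0.1)
          for t in range(N)]

spectrum = dft(signal)
wave1 = inverse_dft_real(isolate(spectrum, [BIN1]))         # step 1
wave2 = inverse_dft_real(isolate(spectrum, [BIN2]))         # step 2
wave12 = inverse_dft_real(isolate(spectrum, [BIN1, BIN2]))  # step 3

pdf_spur1 = histogram_pdf(wave1, -2.0, 2.0, 40)
pdf_spur2 = histogram_pdf(wave2, -2.0, 2.0, 40)
pdf_spur12 = histogram_pdf(wave12, -2.0, 2.0, 40)
```

Because the inverse transform is linear, `wave12` is simply `wave1 + wave2` sample by sample, so any difference between PDFspur12 and the convolution of the individual PDFs comes from how the two waveforms line up in time, not from the transform itself.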

I want to conclusively determine if these 2 spurs are correlated with each other. How do I do it?

One thought I have is this: if the distributions (PDFs) are statistically independent (that is, uncorrelated), then PDFspur12 should EQUAL PDFspur1 CONVOLVED with PDFspur2. If they are NOT equal, then the spurs are not independent, that is, they are correlated.
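A minimal sketch of this convolution check, under some loud assumptions: the two spurs are modeled as unit-amplitude sinusoids, "independent" is modeled by drawing the two phases independently, "correlated" is modeled as a phase-locked second harmonic, and the waveforms are quantized to integer levels so that the discrete convolution lines up exactly with the histogram of the sum. The names `LEVELS`, `convolution_mismatch`, and the sample count are hypothetical.

```python
import math
import random
from collections import Counter

LEVELS = 20  # quantization steps per unit amplitude (hypothetical choice)

def pmf(values):
    """Empirical probability mass function of integer-valued samples."""
    n = len(values)
    return {v: c / n for v, c in Counter(values).items()}

def convolve(p, q):
    """Discrete convolution of two PMFs stored as dicts."""
    out = {}
    for a, pa in p.items():
        for b, pb in q.items():
            out[a + b] = out.get(a + b, 0.0) + pa * pb
    return out

def total_variation(p, q):
    """Total variation distance: 0 for identical PMFs, 1 for disjoint ones."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def convolution_mismatch(phase_pairs):
    """Distance between the PMF of the summed waveform and the
    convolution of the two individual PMFs."""
    q1 = [round(math.sin(a) * LEVELS) for a, _ in phase_pairs]
    q2 = [round(math.sin(b) * LEVELS) for _, b in phase_pairs]
    combined = pmf([u + v for u, v in zip(q1, q2)])
    predicted = convolve(pmf(q1), pmf(q2))
    return total_variation(combined, predicted)

rng = random.Random(0)
n = 200_000
# Independent case: the two phases are unrelated.
independent = [(rng.uniform(0.0, 2.0 * math.pi), rng.uniform(0.0, 2.0 * math.pi))
               for _ in range(n)]
# Phase-locked case: the second spur is the second harmonic of the first.
locked = [(a, 2.0 * a) for a, _ in independent]

tv_independent = convolution_mismatch(independent)
tv_locked = convolution_mismatch(locked)
print("mismatch, independent phases:", tv_independent)
print("mismatch, phase-locked harmonic:", tv_locked)
```

In this toy model the mismatch for the independent case should be small (sampling noise only), while the phase-locked case should give a clearly larger value. With real data some threshold would have to be chosen to decide what counts as "equal"; that choice is not addressed here.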

I think this is mathematically sound, but I'd appreciate any comments/feedback, especially if you know a better/faster/more conclusive way to determine this. Best regards, -GK

2. Why would you think this is mathematically sound? What are you trying to do?

Why are you not just comparing the frequencies to see if they are both (small) multiples of a common frequency?

RonL
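The frequency comparison suggested here could look something like the sketch below. It assumes "small multiples" means a ratio p/q with both p and q below some limit; the function name `small_ratio`, the limit `max_term`, and the tolerance `rel_tol` are hypothetical choices, not anything specified in the thread.

```python
from fractions import Fraction

def small_ratio(f1, f2, max_term=8, rel_tol=1e-3):
    """Return (p, q) if f1/f2 is within rel_tol of p/q with p, q <= max_term,
    otherwise None. Both frequencies must be positive."""
    if f1 <= 0 or f2 <= 0:
        raise ValueError("frequencies must be positive")
    ratio = f1 / f2
    # Closest fraction whose denominator does not exceed max_term.
    candidate = Fraction(ratio).limit_denominator(max_term)
    if candidate.numerator == 0 or candidate.numerator > max_term:
        return None
    if abs(float(candidate) - ratio) <= rel_tol * ratio:
        return candidate.numerator, candidate.denominator
    return None

print(small_ratio(150.0, 50.0))             # third harmonic of 50
print(small_ratio(75.0, 50.0))              # fractional multiple
print(small_ratio(50.0, 50.0 * 2 ** 0.5))   # no small ratio
```

A match only says the two frequencies are compatible with a common fundamental; it does not by itself show that the spurs come from the same source.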

3. Hi Ron, Good questions.

It turns out that looking at frequency multiples isn't always enough, mainly because there are times when one frequency is a multiple of another but the two are not correlated. Additionally, the multiple could be a fraction, and I don't know in advance what multiple to expect. With dozens of spurs there is always the chance that some just happen to be multiples of each other by coincidence. I've tried to use this aspect of correlation, but frequency multiples alone aren't sufficient to determine correlation.

I would think the scenario described above is sound because if we convolve two independent distributions (PDFs of sine waves, for example), the resulting distribution should be wider than either of the original distributions. If the PDF from spur1 has a width (x-axis) of A and the PDF from spur2 has a width of B, then the convolution, if they're independent, should be a PDF that is A+B wide. However, if the two distributions are correlated, taking the IFFT of both spurs should produce a PDF that is the same (or similar?) width as the wider of the two spurs' individual PDFs. I'm trying to use the knowledge that the convolution of two independent distributions should have an observably larger width than the PDF resulting from the IFFT of two spurs that are correlated.

For example, a square wave has a PDF with two impulses (the impulses are separated by peak-peak amplitude of the square wave). The square wave's spectrum has spurs at odd harmonics. As more spurs are included when taking an IFFT, the PDF approaches the ideal two-impulse shape, while the width of the PDF doesn't really change. Compare this to taking the IFFT of spurs individually, then convolving these PDFs together -- each convolution operation causes the resulting PDF to grow in width. Does that make sense? I'm trying to think about properties I can exploit to determine whether the distributions are correlated or not, and this is my simple-minded approach.
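The square-wave example can be checked numerically with a small sketch. It assumes an ideal square wave of amplitude 1, whose odd harmonics have amplitude 4/(πk), and compares the peak-to-peak width of the phase-locked partial sum with the width the convolution of the individual PDFs would have (the sum of the individual peak-to-peak widths). The names `partial_square`, `rows`, and the grid size are hypothetical.

```python
import math

def partial_square(t, harmonics):
    """Partial Fourier sum of a unit-amplitude square wave at time t."""
    return sum(4.0 / (math.pi * k) * math.sin(k * t) for k in harmonics)

def peak_to_peak(samples):
    return max(samples) - min(samples)

N = 4096  # samples over one period (hypothetical resolution)
ts = [2.0 * math.pi * i / N for i in range(N)]

rows = []
for count in (1, 2, 3, 5):
    harmonics = [2 * m + 1 for m in range(count)]  # 1, 3, 5, ...
    # Width of the PDF when the harmonics are phase-locked (IFFT of all spurs).
    locked_width = peak_to_peak([partial_square(t, harmonics) for t in ts])
    # Width of the convolution of the individual PDFs: individual widths add.
    convolved_width = sum(2.0 * 4.0 / (math.pi * k) for k in harmonics)
    rows.append((count, locked_width, convolved_width))
    print(f"{count} harmonic(s): locked {locked_width:.3f}, "
          f"convolved {convolved_width:.3f}")
```

With one harmonic the two widths coincide at 8/π. As harmonics are added, the phase-locked width stays close to the square wave's peak-to-peak value (slightly above 2 because of the Gibbs overshoot), while the convolved width keeps growing.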

Best regards, -GK