# Thread: Convolution with a Gaussian

1. ## Convolution with a Gaussian

Hello everyone,

I need help with some maths/programming problem...
I try to smoothe a discrete probability function (of a data set) with a Gaussian distribution. I understand that I have to use a convolution:
(g*f)(m) = sum(n=1 -> m) f(n)g(m-n); this is a sum in my case because it's a discrete convolution. The function f(x) is my original probability distribution, the function g(x) is a Gaussian distribution.

I'm getting stuck with how to actually make the convolution work... it seems easy, but I seem to be getting too big numbers.
More detail on my original distribution: it is defined for 0 <= x <= 6 in steps of 0.05 (I've got a concrete number at each step) and it integrates to an area of 82 (that's my total number of data objects). Now I want to convolve it with a Gaussian with sigma = 0.4, and I use
g(i) = 1/(0.4 * sqrt(2.0*3.1415926))*exp(-((x(i)**2)/(2* (0.4**2))))
which should be normalised to 1.

First question now: where does the centre of the peak of the Gaussian have to be - I think it shouldn't matter, should it; but my program spits out different numbers if I shift the peak. I think however that if I leave the peak at zero, then I only define half of the function, since the range I'm looking at goes from 0 to 6?
Even if I do this though (meaning: I only use half of a normalised Gaussian, which then has an area of 0.5), the area of my final convolved function is something like 857, way too high. What could be happening here?
For the actual convolution my code looks as follows; n_z = 121 ( because xmax = 6 in steps of 0.05), g(i) is the above definition, f(j) is my probability distribution, and int in the code is my attempt to calculate the area of the final result.
I'm also not sure about the i-j not being able to be below zero; that was just my thoughts because I only defined both functions for values greater than zero.

c Perform the convolution
do i = 1, n_z
do j = 1, i

c Avoid range violation: i-j cannot be below zero
if (i-j.le.0) then
conv(i) = conv(i) + (f(j) * 0.0)
else
conv(i) = conv(i) + (f(j) * g(i-j))
end if

end do

int = int + (conv(i)*0.05)

end do

Any help appreciated a lot - thank you!!

2. Originally Posted by megana
Hello everyone,

I need help with some maths/programming problem...
I try to smoothe a discrete probability function (of a data set) with a Gaussian distribution. I understand that I have to use a convolution:
(g*f)(m) = sum(n=1 -> m) f(n)g(m-n); this is a sum in my case because it's a discrete convolution. The function f(x) is my original probability distribution, the function g(x) is a Gaussian distribution.

I'm getting stuck with how to actually make the convolution work... it seems easy, but I seem to be getting too big numbers.
More detail on my original distribution: it is defined for 0 <= x <= 6 in steps of 0.05 (I've got a concrete number at each step) and it integrates to an area of 82 (that's my total number of data objects). Now I want to convolve it with a Gaussian with sigma = 0.4, and I use
g(i) = 1/(0.4 * sqrt(2.0*3.1415926))*exp(-((x(i)**2)/(2* (0.4**2))))
which should be normalised to 1.

First question now: where does the centre of the peak of the Gaussian have to be - I think it shouldn't matter, should it; but my program spits out different numbers if I shift the peak. I think however that if I leave the peak at zero, then I only define half of the function, since the range I'm looking at goes from 0 to 6?
Even if I do this though (meaning: I only use half of a normalised Gaussian, which then has an area of 0.5), the area of my final convolved function is something like 857, way too high. What could be happening here?
For the actual convolution my code looks as follows; n_z = 121 ( because xmax = 6 in steps of 0.05), g(i) is the above definition, f(j) is my probability distribution, and int in the code is my attempt to calculate the area of the final result.
I'm also not sure about the i-j not being able to be below zero; that was just my thoughts because I only defined both functions for values greater than zero.

c Perform the convolution
do i = 1, n_z
do j = 1, i

c Avoid range violation: i-j cannot be below zero
if (i-j.le.0) then
conv(i) = conv(i) + (f(j) * 0.0)
else
conv(i) = conv(i) + (f(j) * g(i-j))
end if

end do

int = int + (conv(i)*0.05)

end do

Any help appreciated a lot - thank you!!
You seem to be making hard work out of this. Let $g(x)$ be the pdf of a zero mean Gaussian distribution with SD of 0.4.

Then you want:

$h(x)=\sum_{idx=1}^m f_{idx}g(x-x_{idx})$

where $x_{i}$ is the location of the $i$-th point and $f_i$ is the value there.

But note that we have now gone from a discrete distribution to a continuous one, and:

$\int_{-\infty}^{\infty}h(x) = \sum_{idx=1}^m f_{idx}$

and if you wish to approximate this integral by a sum over a equispaced set of $x_j$'s you have to multiply the sum by the spacing between the points:

$\int_{-\infty}^{\infty}h(x) \approx |x_1-x_2| \sum_{idx} h(x_{idx})$

and the sum should extent over a range containing most of the distribution (say from the smallest $x$ at which $f$ is defines $-3 \sigma$ to the largest such $x$, $+3 \sigma$)

RonL

3. Hmmm... so far so good...

It looks like I'm doing the right approximation of the integral then. I'm still wondering why the numbers I get are way too high - there really isn't much that can go wrong here?

I figured that range violation thing is probably wrong, since the Gaussian should be well defined everywhere. I'll keep trying...

Thanks!

4. I've solved the problem, thanks!

Obviously, using the radius of 0.4 was wrong if I have 121 steps, but stepsize 0.05 (so my range goes from 0 to 6).
I have to use 0.4 / 0.05 as radius, so the ratio stays correct - and now it all works and I've got a beautiful plot!

As I thought... actually more a programming bug than a maths problem!