Math Help - CDF of generalized gaussian distribution

  1. #1
    Newbie
    Joined
    Dec 2008
    Posts
    4

    CDF of generalized gaussian distribution

    What would be the expression for the cumulative distribution function of the generalized Gaussian distribution? The PDF of the distribution is given by:

        f(x)=a e^{-|bx|^c},

    where

        a=\frac{bc}{2\Gamma(\frac{1}{c})}

    and

        b=\frac{1}{\sigma_x} \sqrt{\frac{\Gamma(\frac{3}{c})}{\Gamma(\frac{1}{c})}}

    Thanks
    Last edited by dreamer1; December 11th 2008 at 11:11 AM.

  2. #2
    Member
    Joined
    Jul 2008
    Posts
    138
    \int^{x}_{-\infty}a\exp(-|b t|^c)\, dt

    After some googling, I have not found anything better than that.

    You can write down the expression in terms of the generalized error function, but in the end you still have the same integral at the heart of it. The fact is that, for the general case, you cannot find an expression that contains neither an integral nor an infinite series. For certain values of c (like 1), you could obviously write down a closed-form expression (though I think it might have to be broken up into two expressions, one for x<0 and one for x\geq 0).

    Even if you look at c=2, which is the Gaussian case, the CDF is written in terms of the error function, which is itself an integral expression.
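
    For what it's worth, the integral can be rewritten with the incomplete gamma function. That is still an integral at heart, but it is available as a library routine in most numerical environments. A sketch of the substitution follows; the symbols \gamma(s,z) (lower incomplete gamma function) and P(s,z) (its regularized form) are notation introduced here, not taken from the original post, and b>0 is assumed:

```latex
% Substitute u = (b t)^c on t >= 0, so that dt = \frac{1}{bc} u^{1/c - 1} du:
\int_0^{x} a\, e^{-(bt)^c}\, dt
  = \frac{a}{bc} \int_0^{(bx)^c} u^{\frac{1}{c}-1} e^{-u}\, du
  = \frac{1}{2\Gamma(\frac{1}{c})}\, \gamma\!\left(\tfrac{1}{c}, (bx)^c\right),
  \qquad x \geq 0.

% f is symmetric about 0, so with P(s,z) = \gamma(s,z)/\Gamma(s):
F(x) = \frac{1}{2} + \frac{\operatorname{sgn}(x)}{2}\, P\!\left(\tfrac{1}{c}, |bx|^c\right).

% Check for c = 1, where P(1,z) = 1 - e^{-z} (two pieces, as expected):
F(x) = \begin{cases}
  \frac{1}{2} e^{bx}, & x < 0 \\
  1 - \frac{1}{2} e^{-bx}, & x \geq 0
\end{cases}
```

    For c=2, P(1/2,z) = erf(\sqrt{z}), which recovers the usual error-function form of the Gaussian CDF.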

  3. #3
    Newbie
    Joined
    Dec 2008
    Posts
    4
    Thanks for the reply.
    I also got the same integral:

        F(x)=\int^{x}_{-\infty}f(t)\,dt=\int^{x}_{-\infty}a\exp(-|b t|^c)\, dt

    but didn't know (and still don't) what to do with it.

    Actually, I'm trying to assess the goodness of fit of the empirical data to GGDs with different shape parameters c.
    The Kolmogorov-Smirnov test needs the empirical CDF F_x(t) and the distribution CDF F(t).
    In MATLAB (and in general) it is easy to find the empirical CDF of the given data and evaluate it at each sample, but how do I get the value of the GGD CDF?

  4. #4
    Member
    Joined
    Jul 2008
    Posts
    138
    What you need to do is numerically evaluate the integral. There are a few ways of doing this in MATLAB. There is the "quad" family of functions. I have had problems with them when you want infinite bounds, and they won't necessarily produce a CDF that is monotonically increasing (since the routine approximates the function and then integrates).

    So probably the best way (?) is just to numerically sample the function over reasonable bounds with a small spacing. Then just do a trapezoidal numerical integration.

    So first figure out some reasonable bounds. From your formulation, you seem to know a priori what the standard deviation \sigma_x is.

    So let

     B = 10\, \sigma_x

    By Chebyshev's inequality, you are guaranteed to miss at most 1% of the total area under the curve by using this as a bound. For most distributions, it is substantially better than that. You may want to crank that sucker down to 4 or 5, say, rather than 10.
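
    Spelled out, the 1% figure comes from the following worked bound. The multiplier k and the mean \mu are notation introduced here (for the GGD above, \mu = 0 and B = k\,\sigma_x):

```latex
% Chebyshev's inequality, valid for any distribution with finite variance \sigma_x^2:
P\left(|X-\mu| \geq k\,\sigma_x\right) \leq \frac{1}{k^2}

% k = 10: at most 1/100 = 1\%    of the area lies outside [-B, B]
% k = 5:  at most 1/25  = 4\%
% k = 4:  at most 1/16  = 6.25\%
```

    These are worst-case figures over all distributions; for anything close to Gaussian the area actually missed at k = 4 or 5 is far smaller.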

    Code:
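    x = linspace(-B,B,10000);
    % abs() matches the PDF a*exp(-|b*x|^c); without it, (b*x).^c is wrong
    % for x<0 unless c is an even integer (complex for non-integer c)
    pdf = a*exp(-abs(b*x).^c);
    
    % perform trapezoidal cumulative integration
    cdf = cumtrapz(x,pdf);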
    You will be able to tell how well you did by looking at 1-cdf(end). If that is very small, then chances are you have a good sampling of the pdf. If you don't want your cdf array to be quite that big, you still need to calculate the cdf over a big range with a small spacing (as I have done), and then you can downsample.

    For example:
    Code:
    x_small = -B:0.05:B;
    cdf_small = interp1(x,cdf,x_small,'linear','extrap');
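
    Here is a self-contained sketch of the whole procedure, including the 1-cdf(end) check. The parameter values sigma_x = 1 and c = 1.5 are made up for illustration. The reference curve cdf_ref is an extra cross-check not taken from the posts: it assumes the identity F(x) = 1/2 + (sgn(x)/2) P(1/c, |bx|^c), with P the regularized lower incomplete gamma function, which MATLAB exposes as gammainc.

```matlab
% Example parameters (assumed values, for illustration only)
sigma_x = 1;      % standard deviation of the GGD
c       = 1.5;    % shape parameter

% Constants from the PDF definition f(x) = a*exp(-|b*x|^c)
b = sqrt(gamma(3/c) / gamma(1/c)) / sigma_x;
a = b*c / (2*gamma(1/c));

% Numerical CDF on [-B, B] by cumulative trapezoidal integration
B   = 10*sigma_x;
x   = linspace(-B, B, 10000);
pdf = a*exp(-abs(b*x).^c);
cdf = cumtrapz(x, pdf);

% Reference CDF via the regularized lower incomplete gamma function;
% note that MATLAB's argument order is gammainc(z, s)
cdf_ref = 0.5 + 0.5*sign(x).*gammainc(abs(b*x).^c, 1/c);

tail_err = 1 - cdf(end);             % area missed by truncation and sampling
max_err  = max(abs(cdf - cdf_ref));  % worst disagreement on the grid
fprintf('1-cdf(end) = %.3g, max |cdf - cdf_ref| = %.3g\n', tail_err, max_err);
```

    If both numbers are tiny, the grid is fine enough. If not, increase the number of samples before widening B.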

  5. #5
    Newbie
    Joined
    Dec 2008
    Posts
    4
    Great, the first code snippet was exactly what I needed!

    As for the 1-cdf(end) part, I'm not sure you are correct. The KS test searches for

        \max_t |F_x(t) - F(t)|,

    which is probably somewhere near the middle of the 0-0.5 or 0.5-1 ranges of cdf values.
    In general, if my cdf is anything even close to Gaussian, it should have no problem coming very close to 1 at cdf(end), and I expect 1-cdf(end) to always be very close to 0 (for reasonable parameters of the GGD). Please correct me if I'm wrong.
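
    For reference, this is roughly how the KS statistic can be computed from the sampled cdf. It is only a sketch: Y here is stand-in data drawn with randn, and sigma_x = 1, c = 2 are assumed values, not my real data or parameters.

```matlab
% Stand-in data (assumed); replace with the measured sample
Y = randn(500, 1);

% GGD parameters to test against (assumed; c = 2 is the Gaussian case)
sigma_x = 1;
c       = 2;
b = sqrt(gamma(3/c) / gamma(1/c)) / sigma_x;
a = b*c / (2*gamma(1/c));

% Sampled GGD CDF by cumulative trapezoidal integration
B   = 10*sigma_x;
x   = linspace(-B, B, 10000);
cdf = cumtrapz(x, a*exp(-abs(b*x).^c));

% Evaluate the GGD CDF at the sorted samples
% (interp1 returns NaN for samples outside [-B, B])
t = sort(Y);
n = numel(t);
F = interp1(x, cdf, t, 'linear');

% The empirical CDF is a step function: it equals (i-1)/n just before the
% i-th sorted sample and i/n just after it, so check both sides of each step
D = max(max((1:n)'/n - F), max(F - (0:n-1)'/n));
fprintf('KS statistic D = %.4f\n', D);
```

    Checking both sides of each step matters because the largest gap can occur just before a jump of the empirical CDF rather than just after it.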

    For one last piece of advice on the topic (or a bit off topic): the \chi^2 test needs the distributions' PDFs. I suppose it should be fine to use
    Code:
    [pdf,x]=ksdensity(Y);
    to estimate the pdf of the values in Y?

  6. #6
    Member
    Joined
    Jul 2008
    Posts
    138
    Quote: Originally posted by dreamer1
    As for the 1-cdf(end) part, I'm not sure you are correct.
    Sorry, this was meant to be a check just on how good the numerical integration approximation was. We are discretely sampling the PDF, and then doing Riemann Sums as the approximation to the integral to get the CDF. If we undersampled the PDF then cdf(end) may not be very close to 1. I wasn't referencing the kstest.

    The last part of my post was referring to the fact that maybe you didn't want to have such a finely resolved CDF. If that was the case, then I was showing how you might downsample it.

    I assume your Y is the data? In the \chi^2 tests that I am familiar with, you don't need to do any kernel smoothing of the data. You would just bin the data and "bin" the PDF (take the difference of the CDF values at the endpoints of each bin), and do the \chi^2 test. It doesn't look like ksdensity would be necessary.
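
    A rough sketch of that binning, with the usual caveat that I have not done much of this: Y is stand-in data from randn, and the bin edges and the parameters sigma_x = 1, c = 2 are arbitrary choices for illustration.

```matlab
% Stand-in data (assumed); replace with the measured sample
Y = randn(1000, 1);

% Hypothesized GGD parameters (assumed; c = 2 is the Gaussian case)
sigma_x = 1;
c       = 2;
b = sqrt(gamma(3/c) / gamma(1/c)) / sigma_x;
a = b*c / (2*gamma(1/c));

% Sampled GGD CDF by cumulative trapezoidal integration
B   = 10*sigma_x;
x   = linspace(-B, B, 10000);
cdf = cumtrapz(x, a*exp(-abs(b*x).^c));

% Bin edges; the open-ended outer bins catch the tails
edges = [-Inf, -2:0.5:2, Inf];

% Observed counts in the bins (e_i, e_{i+1}], from cumulative counts
observed = diff(arrayfun(@(e) sum(Y <= e), edges));

% "Bin" the PDF: CDF difference across each bin, with F(-Inf)=0, F(Inf)=1
F_edges  = [0, interp1(x, cdf, edges(2:end-1), 'linear'), 1];
expected = numel(Y) * diff(F_edges);

chi2_stat = sum((observed - expected).^2 ./ expected);
dof = numel(observed) - 1;   % assumes no parameters were fitted to Y
fprintf('chi^2 = %.3f with %d degrees of freedom\n', chi2_stat, dof);
```

    If sigma_x or c were estimated from the same data, the degrees of freedom should be reduced by the number of fitted parameters, and bins with very small expected counts are usually merged.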

    If Y is the CDF that we just calculated, I'm not sure why any kernel smoothing would be necessary either.

    But I should add the disclaimer that I have not done much on this part of statistics. I have used both kstest and \chi^2, but I have never done any kernel smoothing. I would think kernel smoothing would be useful for visualization, but not really for trying to perform hypothesis tests comparing empirical data to a given distribution.

  7. #7
    Newbie
    Joined
    Dec 2008
    Posts
    4
    Sorry, this was meant to be a check just on how good the numerical integration approximation was...
    I misunderstood you. Now it makes sense.

    For the \chi^2 test, you are, of course, right again. Binned data is what is used in the test so the cdf differences will do it.

    Thanks for the help, I think I've finally got things straightened out

  8. #8
    Member
    Joined
    Jul 2008
    Posts
    138
    Excellent