# Thread: Estimating lognormal parameters from a truncated empirical distribution

1. I have a set of 50 values that I assume come from a lognormal distribution. However, the values I have are a biased sample, in the sense that they are all under 1,000. (In other words, I do not have any information on how many values could or should exceed 1,000, or what those values may be.)

I know that if I had a "full sample" I could calculate the mean and variance of my sample and back into the relevant mu and sigma parameters, but is there a way to estimate the full distribution's mu and sigma parameters using some kind of limited expected value and limited variance function?

- Steve J

2. Do you know how the data were/are censored?

CB

3. All the values above the threshold were just completely ignored.

4. It should be possible to compute the conditional mean and variance of truncated log-normal data. It might be easier to do this numerically than analytically, but that is still sufficient to do the fit (though this is not a routine task).

The mean of the truncated distribution is:

$\displaystyle \mu^*=\frac{\int_0^{1000} x p(x)\;dx}{\int_0^{1000} p(x)\;dx}$

where $\displaystyle p(x)$ is the density (in this case the log-normal density) of the un-truncated distribution

and the variance of the truncated distribution is:

$\displaystyle (\sigma^*)^2=\frac{\int_0^{1000} (x-\mu^*)^2 p(x)\;dx}{\int_0^{1000} p(x)\;dx}$

and the task is to find the parameters of the underlying log-normal that when truncated give the same mean and variance as your actual data.

CB

5. Thanks, CB, but I'm still not 100% sure what to do next. I believe the formulas you've outlined above apply to any distribution; I'm not sure what specific form the equations take when the lognormal distribution is censored as described, nor exactly how to derive it.

Again, I appreciate your help, and hope you can help get me the rest of the way there.

- Steve J

6. Do you have the mean and variance of your sample?

The density of the log-normal distribution is:

$\displaystyle p(x,\mu,\sigma^2)=\frac{1}{x\sigma\sqrt{2\pi}}e^{-\frac{(\ln(x)-\mu)^2}{2\sigma^2}}$

So now we can (numerically at least) determine the mean and variance of the truncated log-normal using the formulas I posted earlier.
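
A minimal sketch of that moment-matching fit, in Python with only the standard library. Instead of numerical quadrature it uses the closed form for the partial moments of a lognormal, $\displaystyle \int_0^c x^k p(x)\;dx = e^{k\mu+k^2\sigma^2/2}\,\Phi\left(\frac{\ln c-\mu}{\sigma}-k\sigma\right)$, where $\Phi$ is the standard normal CDF, and solves the two equations with a small damped Newton iteration. The function names, the starting guess, and the tolerances are illustrative assumptions, not something from the posts above.

```python
import math
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal CDF


def truncated_lognormal_moments(mu, sigma, cutoff):
    """Mean and variance of lognormal(mu, sigma) conditioned on X < cutoff."""
    a = (math.log(cutoff) - mu) / sigma
    mass = _PHI(a)  # P(X < cutoff), the denominator of both ratios
    m1 = math.exp(mu + 0.5 * sigma ** 2) * _PHI(a - sigma) / mass
    m2 = math.exp(2.0 * mu + 2.0 * sigma ** 2) * _PHI(a - 2.0 * sigma) / mass
    return m1, m2 - m1 ** 2


def fit_by_moments(sample_mean, sample_var, cutoff, tol=1e-10, max_iter=200):
    """Find (mu, sigma) whose truncated mean and variance match the sample's."""
    # Assumed starting guess: the usual untruncated method-of-moments values.
    s2 = math.log(1.0 + sample_var / sample_mean ** 2)
    mu, log_sigma = math.log(sample_mean) - 0.5 * s2, 0.5 * math.log(s2)

    def residual(m, ls):
        mean, var = truncated_lognormal_moments(m, math.exp(ls), cutoff)
        return math.log(mean / sample_mean), math.log(var / sample_var)

    h = 1e-6  # finite-difference step for the Jacobian
    for _ in range(max_iter):
        r1, r2 = residual(mu, log_sigma)
        norm = math.hypot(r1, r2)
        if norm < tol:
            break
        a1, a2 = residual(mu + h, log_sigma)
        b1, b2 = residual(mu, log_sigma + h)
        j11, j21 = (a1 - r1) / h, (a2 - r2) / h
        j12, j22 = (b1 - r1) / h, (b2 - r2) / h
        det = j11 * j22 - j12 * j21
        d_mu = (-r1 * j22 + r2 * j12) / det
        d_ls = (r1 * j21 - r2 * j11) / det
        # Halve the Newton step until the residual norm decreases.
        step = 1.0
        while step > 1e-8:
            try:
                n1, n2 = residual(mu + step * d_mu, log_sigma + step * d_ls)
                if math.hypot(n1, n2) < norm:
                    break
            except (ValueError, ZeroDivisionError, OverflowError):
                pass
            step /= 2.0
        mu += step * d_mu
        log_sigma += step * d_ls
    return mu, math.exp(log_sigma)


if __name__ == "__main__":
    # Round trip with made-up parameters (not the poster's data).
    mean, var = truncated_lognormal_moments(5.0, 1.0, 1000.0)
    print(fit_by_moments(mean, var, 1000.0))
```

In practice `sample_mean` and `sample_var` would be the mean and variance of the 50 observed values and `cutoff` would be 1000. With so few observations the sample variance is noisy, so the fitted parameters are probably best treated as rough estimates.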

An alternative is to compute the mean and variance of the log of your data and fit a truncated normal distribution.
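
A rough sketch of the log-scale alternative, again in standard-library Python. For a normal distribution truncated above at $b=\ln 1000$, with $\beta=(b-\mu)/\sigma$ and $\lambda=\phi(\beta)/\Phi(\beta)$, the truncated mean is $\mu-\sigma\lambda$ and the truncated variance is $\sigma^2(1-\beta\lambda-\lambda^2)$. The code matches these to the sample mean and variance of the logs with a simple fixed-point iteration. The iteration scheme and the function names are assumptions made for illustration; the iteration is not guaranteed to converge when the truncation is heavy.

```python
import math
from statistics import NormalDist, fmean, pvariance

_STD = NormalDist()  # standard normal, used for phi and Phi


def fit_from_log_moments(log_mean, log_var, log_cutoff, iterations=200):
    """Match truncated-normal mean/variance to the moments of the logged data."""
    # Start from the naive estimates that ignore truncation.
    mu, sigma = log_mean, math.sqrt(log_var)
    for _ in range(iterations):
        beta = (log_cutoff - mu) / sigma
        lam = _STD.pdf(beta) / _STD.cdf(beta)
        # Invert the truncated variance formula for sigma, then the mean for mu.
        sigma = math.sqrt(log_var / (1.0 - beta * lam - lam * lam))
        mu = log_mean + sigma * lam
    return mu, sigma


def fit_truncated_normal_to_logs(values, cutoff, iterations=200):
    """Convenience wrapper: take logs of the raw values, then fit."""
    logs = [math.log(x) for x in values]
    return fit_from_log_moments(fmean(logs), pvariance(logs),
                                math.log(cutoff), iterations)
```

The returned `mu` and `sigma` are the parameters of the underlying lognormal directly, since the log of a lognormal variable is normal. A maximum-likelihood fit of the truncated normal would be another reasonable choice here.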