Maximum Likelihood Estimation error calculations

Aug 2016
3
0
Moscow
I have a question about calculating standard error in MLE.
For the simplicity, I will describe the idea on a simple sample, but in real life I have a problem, which can be solved only with MLE.

Consider we have 1000 integer numbers (so-called outcomes) generated with Poisson distribution with some lambda /let's say lambda_true=10/.
There are various ways to estimate the lambda:

1) Build histogram indicating frequency of each outcome. Then fit it with Poisson. The initial data contains enough data to have a good fit. /Let's say, the estimation will be something like lambda_fit=9.8+-0.3/ Fine!

2) Build MLE function depending on a single variable lambda_MLE:

P = p1(lambda_fit)*p2(lambda_fit)*...*p1000(lambda_fit),

where pi(lambda_fit) is probability to observe outcome pi at given value lambda_fit.
Then we maximize P (of logP, it doesn't matter). /Let's say, We get something like lambda_fit=10.1/

Then how do we calculate an error? We read something like this: http://www.stat.umn.edu/geyer/old03/5102/notes/fish.pdf
and then calculate second derivative d^2(P)/d(lambda)^2, then divide it by n=1000 to get Fisher information I(lambda), then calculate var:
var(lambda_MLE)=-1/I(lambda).
The problem is that i get huge var(lambda) like lambda_MLE itself /for example, lambda_MLE = 10.1 +- 12/

I get this much error, because MLE function is very unsensitive to the change of lambda. It is, because most each particular outcome has low probability in Poisson, and thus P is multiplication of something like 0.1x0.05x0.1x0.2x... for lambda = 10. As the average probability is ~10%, lambda shift doesn't make it much lower or higher - the sensitivity is low, the second derivative is low, the var is high.

To conclude: we have nice low-error lambda when fitting histogram, but we have large error of lambda when calculate error through Fisher information.
The question is: what am I doing wrong? I believe that in correct way these two method must give the same answer not only for estimated lambda, but also for it's error. How to match these results?
 

chiro

MHF Helper
Sep 2012
6,608
1,263
Australia
Hey Vurdilla.

I think you should show your working for deriving the MLE based estimator and the distribution that it has for more information.

When you get an estimator you have get its test statistic which is a distribution corresponding to your MLE (which is a random variable as it is a function of random variables itself).

For lots of information in a sample, you can use the Central Limit Theorem.
 
Aug 2016
3
0
Moscow
Dear chiro, thanks for the rapid response.
Well, let's track it on the sample.
Consider Poisson distribution with lambda=5. In this file, I've calculated probabilities of various outcomes for this distribution:
https://www.dropbox.com/s/0zyu5xq4l0ydvqb/poissonTest.xlsx?dl=0
"Amount of samples" column shows amount of outcomes written counted and written in another file (just if I register it by measurements). File with outcomes is here:
https://www.dropbox.com/s/ao0b4p9ntf22n7z/outcomes.txt?dl=0

Then I did two things:
(1) Loaded histogram (columns B and E in Excel file) to Origin and fitted the histogram with Poisson distribution [y=y0+e^(-r) * r^x / (x!)] woth zero shift y0. The results are: lambda = 4.997
+-0.002, which is very nice.

(2) Calculated numerically logP for lambda from 2.5 to 7.5, the data is here:
https://www.dropbox.com/s/jvp002h9i8xnwun/logP.txt?dl=0

lambda_max = 4.98193, which is fine.
logP(lambda=4.98193) = -2188.74 which seems reasonable (average probability ~0.11)

Then opened the data in Origin again, built a plot and fitted it with second order polynomian, which gives the second derivative
d^2 logP / dLambda^2 = -102
Thus Fishers information is
I(lambda) = -1/N * d^2 logP / dLambda^2 = 0.102, as N=996.

Thus
var(lambda) = 1/I(lambda) ~ 10
This value doesn't make any sense!

You can see all the initial and derivative data now. Can you tell what am I doing wrong?
 
Aug 2016
3
0
Moscow
PS IPS I've used pretty typical MLE P for Poisson:
P(lambda) = p1*...*pN,
pi(lambda, ni) = lambda^ni * exp(-lambda) / ni!,
where ni is i-th outcome.