# Thread: Poisson - Finding and Removing outlier

1. ## Poisson - Finding and Removing outlier

Hi !

Sometimes I have a lot of zeroes, sometimes not.
This is for practical sampling with a lot of figures, so I'd
rather have an easy enough approach than a perfect but complicated solution.
Instruments sometimes give unexpected values that are quite far off.

Do I have to solve
a) Upper bound: P(x > cutoff) = 0.003 (the same level as the 3-standard-deviation rule for a normal distribution)?
Is there a faster/better rule of thumb here?
It's for practical use, so it doesn't have to be perfect.

b) And how do I set the lower bound? That seems even trickier!

c) When I find an outlier using a rule I have set up,
how do I handle it? Do I remove it from my sample set completely
as soon as I discover it, or do I keep it if it
is within the expected level of probability?

Hope you can help!
Thanks!
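
For parts a) and b), one simple option is to read both cutoffs directly off the Poisson distribution instead of borrowing a normal rule. The sketch below is only an illustration under stated assumptions: `poisson_pmf` and `poisson_cutoffs` are hypothetical helper names, Lambda is assumed to be known (or estimated from data believed to be clean), and the per-tail level of 0.003 is the figure from the question. For reference, the one-sided tail beyond 3 standard deviations of a normal distribution is about 0.00135.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), computed in log space for stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def poisson_cutoffs(lam, tail=0.003):
    """Return (lower, upper) with P(X < lower) <= tail and P(X > upper) <= tail.

    Counts below `lower` or above `upper` would be flagged as suspicious.
    `tail` is the probability allowed in EACH tail (an assumption, adjust as needed).
    """
    if lam <= 0 or not 0 < tail < 0.5:
        raise ValueError("need lam > 0 and 0 < tail < 0.5")
    cdf = 0.0      # holds P(X < k) at the top of each loop iteration
    lower = None
    k = 0
    while True:
        p = poisson_pmf(k, lam)
        # Largest admissible lower cutoff: first k where P(X <= k) exceeds tail.
        if lower is None and cdf + p > tail:
            lower = k
        cdf += p   # now P(X <= k)
        # Smallest upper cutoff: first k where P(X > k) drops to tail or below.
        if 1.0 - cdf <= tail:
            return lower, k
        k += 1

print(poisson_cutoffs(3.0))   # -> (0, 9)
```

With Lambda = 3 the lower cutoff comes out as 0, so no count can be flagged as "too low" at this level. A lower bound only becomes informative once Lambda is large enough that a zero is itself improbable.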

2. ## Re: Poisson - Finding and Removing outlier

Hey laban1.

Can you outline your sample properties? What is your (assumed) distribution? What are you using for outlier detection (Cook's distance, or something else)?

3. ## Re: Poisson - Finding and Removing outlier

- Poisson (as in the title). Lambda = np is fairly stable within each of the studied areas,
but Lambda can vary from very small to very large between areas.

"What are you using for outlier detection? (Cooks distance, something else maybe)?"
Cook's distance? I don't know. I suppose it could be a multiple of the standard deviation? Hence my question.

4. ## Re: Poisson - Finding and Removing outlier

There is an attribute in Poisson modelling called over-dispersion.

I think you should check it out and use your favourite software package, such as R or SAS, to estimate the over-dispersion coefficient and incorporate it into your analysis.

You could throw away the 0's if you have good enough justification to throw them out, but if that is not the case then take a look at over-dispersion and consider looking at other similar methods to account for this skewed behaviour in the Poisson.

Overdispersion - Wikipedia, the free encyclopedia

I can't tell you what to do with the data in terms of throwing out outliers or censoring data, but I do know that for your kind of problem, an over-dispersion analysis is a good first step.
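
As a quick first look before reaching for R or SAS, the variance-to-mean ratio of the counts gives a rough feel for over-dispersion: under a pure Poisson model the variance equals the mean, so the ratio should sit near 1. The following is a minimal sketch, not a formal test; `dispersion_index` is a hypothetical helper name and the sample counts are made up for illustration.

```python
from statistics import mean, variance

def dispersion_index(counts):
    """Sample variance divided by sample mean (about 1 for Poisson data)."""
    m = mean(counts)
    if m == 0:
        raise ValueError("all counts are zero; the index is undefined")
    return variance(counts) / m

# Hypothetical hourly counts from one area (illustrative only).
hourly_counts = [2, 4, 3, 0, 5, 3, 2, 4, 3, 1]
print(round(dispersion_index(hourly_counts), 2))
```

A ratio well above 1 suggests more spread than a Poisson model allows, which could come from genuine over-dispersion or from a few faulty readings; the ratio alone cannot tell those apart.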

5. ## Re: Poisson - Finding and Removing outlier

Hi!

I have basic knowledge of over-dispersion. My question was more about outliers.

Any input?

7. ## Re: Poisson - Finding and Removing outlier

You will have to decide whether the outliers are justified to be thrown out or whether you have to use something like over-dispersion.

We don't know enough about the context of the data to answer that for you.

8. ## Re: Poisson - Finding and Removing outlier

Hi! What do you need to know?

9. ## Re: Poisson - Finding and Removing outlier

You should decide first of all whether the outliers should stay or be thrown out.

To do this, you need to figure out whether the outliers are representative of the data and what you are trying to answer or if they are not.

10. ## Re: Poisson - Finding and Removing outlier

Originally Posted by chiro
We don't know enough about the context of the data to answer that for you.
Originally Posted by laban1
Hi! What do you need to know?
From the post just before yours, it appears chiro is asking for the context of the data. The more information you provide about the data, the better our understanding of it will become. The question "what do you need to know?" is difficult to answer from this end. We have no access to the data. We have no feel for what the data looks like, where it came from, how it is being evaluated, the circumstances of how it is obtained, potential expected causes for outliers (and possible methods for detecting them), potential dependencies that could be examined to rule out outliers, the level of accuracy you want in your findings, etc.

Did you read up on Cook's Distance as chiro suggested? You could also look at Identifying outliers. Check and see if any of those methods seem suitable for your model. The section below titled "Working with Outliers" might also be of interest. You asked if you should delete outlier data. That practice is frowned upon.

11. ## Re: Poisson - Finding and Removing outlier

Sorry for the late feedback

Cook's distance and all that seems pretty advanced for me!
Even choosing a method, as you point out, is not straightforward.

I'm considering extending my measuring time so that instead of getting Lambda = 3
for one hour, I could sum 4 hours and get Lambda = 12. I would then have an approximately normal distribution
and could apply the "3 std rule".
I have quite a lot of data, and I expect the process to be fairly stable (Lambda stable) over time.

Would that be an option to consider?
Pros and cons?

Thanks!

I'm measuring water flow during the time of least expected flow, the hour between 03:00 and 04:00 at night.
I have different flow meters, and I expect there are calibration problems as well.
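
One way to weigh the "sum 4 hours and use 3 std" idea is to compare the rule's nominal tail probability with the exact Poisson tail at the same cutoff. The sketch below assumes the counts really are Poisson with a known Lambda; `poisson_sf` and `three_sigma_check` are hypothetical helper names. The cutoff used is Lambda + 3*sqrt(Lambda), since the standard deviation of a Poisson variable is sqrt(Lambda).

```python
import math

def poisson_sf(k, lam):
    """P(X > k) for X ~ Poisson(lam), with k a non-negative integer."""
    cdf = sum(
        math.exp(i * math.log(lam) - lam - math.lgamma(i + 1))
        for i in range(k + 1)
    )
    return max(0.0, 1.0 - cdf)

def three_sigma_check(lam):
    """Return the upper 3-sigma cutoff and the exact Poisson tail beyond it."""
    upper = lam + 3.0 * math.sqrt(lam)
    # Counts are integers, so X > upper is the same event as X > floor(upper).
    return upper, poisson_sf(math.floor(upper), lam)

NORMAL_TAIL = 0.00135  # approximate one-sided P(Z > 3) for a standard normal
for lam in (3.0, 12.0):
    upper, tail = three_sigma_check(lam)
    print(f"Lambda={lam:g}: cutoff={upper:.2f}, exact tail={tail:.5f}, nominal={NORMAL_TAIL}")
```

At both Lambda = 3 and Lambda = 12 the exact upper tail is noticeably larger than the nominal normal value, so the rule would flag more good readings than expected. Another point to weigh is that a summed 4-hour figure mixes a single faulty hour with three normal ones, which makes that hour harder to spot.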

12. ## Re: Poisson - Finding and Removing outlier

Again, you have to decide whether it's justified to take out the outliers or not.

If you can't justify it, then you will probably have to resort to something like over-dispersion or a more general form of model.

13. ## Re: Poisson - Finding and Removing outlier

Any input on this?

14. ## Re: Poisson - Finding and Removing outlier

You may not necessarily get normality by using a higher rate.

You should look at asymptotic results, particularly with regard to the deviance statistic (which is asymptotically chi-square distributed), if you have a big enough sample.
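
For a single area with one common rate, the deviance mentioned here can be computed by hand. This is a sketch under the assumption that all counts share the same Lambda, estimated by the sample mean; `poisson_deviance` is a hypothetical helper name. The statistic is D = 2 * sum(y * ln(y / ybar) - (y - ybar)), where a zero count contributes nothing to the log term, and it is compared with a chi-square distribution on n - 1 degrees of freedom.

```python
import math

def poisson_deviance(counts):
    """Return (D, df) for counts modelled as i.i.d. Poisson with mean ybar."""
    n = len(counts)
    ybar = sum(counts) / n
    if ybar == 0:
        raise ValueError("all counts are zero; the deviance is undefined")
    total = 0.0
    for y in counts:
        if y > 0:
            total += y * math.log(y / ybar)  # y*ln(y/ybar); taken as 0 when y == 0
        total -= (y - ybar)                  # these terms sum to zero overall
    return 2.0 * total, n - 1

# Hypothetical counts (illustrative only).
D, df = poisson_deviance([2, 4, 3, 0, 5, 3, 2, 4, 3, 1])
print(f"deviance={D:.2f} on {df} degrees of freedom")
```

A deviance far above its degrees of freedom points to a poor Poisson fit. If SciPy is available, `scipy.stats.chi2.sf(D, df)` would give an approximate p-value, though the chi-square approximation tends to be rough when the mean count is small.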

15. ## Re: Poisson - Finding and Removing outlier

Originally Posted by chiro
You may not necessarily get normality by using a higher rate.