New here, and I hope I'm posting in the correct place!
Anyway, I just need some help dealing with outliers.
What I am trying to accomplish is to find an outlier in a series of values, but only an outlier that is very large.
As an example, I have the values: 2.1, 2.3, 2.5, 2.6, 2.8, 2.9, 2.9
Now, the IQR is very small. The problem is that if the next number is, say, 20, that is actually acceptable for what I'm dealing with. Even though it is a statistical outlier, it is an acceptable jump in value (because these values represent power consumption).
So I'm trying to find a way to treat the value 20 NOT as an outlier but as an acceptable value. Yet if the value were, say, 100, then it should be called an outlier.
Am I taking the wrong approach by using the IQR? Any suggestions?
Hope this makes sense!
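For reference, here is a minimal sketch of the usual 1.5 x IQR rule applied to the sample values above. The 1.5 multiplier and the quartile method (Python's `statistics.quantiles` with `method="inclusive"`) are assumptions, since the post does not say how the IQR was computed; other quartile conventions shift the fences slightly. The name `tukey_fences` is made up for illustration.

```python
import statistics

def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences using the k * IQR rule (k=1.5 is an assumption)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Sample values from the post
baseline = [2.1, 2.3, 2.5, 2.6, 2.8, 2.9, 2.9]
lower, upper = tukey_fences(baseline)

# Check the two candidate readings mentioned in the post
flagged = {x: not (lower <= x <= upper) for x in (20, 100)}
print(f"fences: {lower:.3f} to {upper:.3f}")
print(flagged)
```

With these assumptions the upper fence comes out at roughly 3.5, so both 20 and 100 are flagged, which is exactly the problem described.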
Have you considered a transformation? Are there any negative values?
ln(2.1) = 0.742
ln(20) = 2.996
ln(100) = 4.605
No, none of them are or will be negative. I may get some zeros, but that's it.
I might have a look at transformations, cheers.
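One thing to watch with zeros: ln(0) is undefined, so a plain log transform fails on those readings. A rough sketch of one common workaround, ln(1 + x), is below. Using `math.log1p` here is an assumption and not something suggested in the thread, and the readings are illustrative only. Note that it also changes the transformed values for small x (ln(1 + 2.1) is about 1.131, not 0.742).

```python
import math

# Illustrative readings, including a zero (hypothetical, not from the thread)
readings = [0.0, 2.1, 20.0, 100.0]

# math.log(0.0) would raise ValueError; log1p(x) = ln(1 + x) is defined at zero
shifted_logs = [math.log1p(x) for x in readings]

for x, y in zip(readings, shifted_logs):
    print(f"ln(1 + {x}) = {y:.3f}")
```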
Even with the log transformation, ln(20) and ln(100) will still be outliers under the IQR rule. Another transformation might work, but you might also want to consider something other than the interquartile range.
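A quick check of this on the sample values from the first post, as a sketch only. It assumes the 1.5 x IQR upper fence and the `method="inclusive"` quartile convention of `statistics.quantiles`; `upper_fence` is a hypothetical helper name.

```python
import math
import statistics

def upper_fence(values, k=1.5):
    """Upper fence Q3 + k * IQR (k=1.5 is an assumption)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 + k * (q3 - q1)

# Sample values from the first post, moved to the natural-log scale
baseline = [2.1, 2.3, 2.5, 2.6, 2.8, 2.9, 2.9]
log_baseline = [math.log(x) for x in baseline]
log_fence = upper_fence(log_baseline)

# Compare ln(20) and ln(100) against the fence computed on the log scale
flags = {x: math.log(x) > log_fence for x in (20, 100)}
print(f"log-scale fence: {log_fence:.3f} (about {math.exp(log_fence):.2f} in original units)")
print(flags)
```

Back-transformed, the fence is still well below 20, so the log transform alone does not separate an acceptable 20 from an unacceptable 100.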