Intersection between histograms?

Hi all,

I'm not sure how best to describe this briefly; "intersection between histograms" is the closest phrase I can think of, but here's what I'm looking for.

I have a software tool that calculates the visibility of an object in different weather conditions. The tool looks at an image and produces a "level of visibility", a decimal number anywhere from 0 to 6. The higher the number, the more visible the object in the image. To make the tool produce a binary decision (I can see it or I cannot see it), I use a value called the "threshold". For any image, if the "level of visibility" is greater than the "threshold", I can see the object; otherwise I cannot (e.g. if the "level of visibility" is 4.4 and my "threshold" is 4.1, I can see it).

Now I am writing training software for the tool. I have a set of "truth data" in which I have classified each image as either visible or not. Using the truth data, I am trying to determine the optimal "threshold" for a set of images. I have seen that the calculated "level of visibility" is not always above the "threshold" for images that are in truth visible, and not always below it for images that are in truth not visible.

So, in mathematical terms: I have two sets of decimal numbers between 0 and 6. Set A has mostly numbers closer to 0, but some all the way up to 6. Set B has mostly numbers closer to 6, but some all the way down to 0. Set A has a lower average than Set B. I need to find a single value X that maximizes the following quantity: (the number of values in Set A that are less than X) + (the number of values in Set B that are greater than X).
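To make the quantity above concrete, here is a minimal Python sketch of a brute-force search for X. It is only an illustration under stated assumptions, not the tool's actual trainer: the function name `best_threshold` and the sample numbers are made up. It relies on the fact that the count can only change when X crosses a data value, so it is enough to try one candidate in each gap between adjacent distinct values, plus one below and one above all the data.

```python
from bisect import bisect_left, bisect_right


def best_threshold(set_a, set_b):
    """Return (x, score) maximizing count(a < x) + count(b > x).

    Hypothetical helper for illustration; ties go to the lowest candidate.
    """
    if not set_a and not set_b:
        raise ValueError("need at least one value")
    a = sorted(set_a)
    b = sorted(set_b)
    values = sorted(set(a) | set(b))

    # Candidates: one point below all data, the midpoint of every gap
    # between adjacent distinct values, and one point above all data.
    candidates = [values[0] - 1.0]
    candidates += [(lo + hi) / 2 for lo, hi in zip(values, values[1:])]
    candidates.append(values[-1] + 1.0)

    best_x, best_score = None, -1
    for x in candidates:
        below_a = bisect_left(a, x)             # values in A strictly less than x
        above_b = len(b) - bisect_right(b, x)   # values in B strictly greater than x
        score = below_a + above_b
        if score > best_score:
            best_x, best_score = x, score
    return best_x, best_score


# Made-up sample data: A is mostly low, B is mostly high.
set_a = [0.5, 1.0, 1.5, 2.0, 4.5]
set_b = [1.2, 3.5, 4.0, 5.0, 5.5]
print(best_threshold(set_a, set_b))  # (2.75, 8)
```

With this sample, 8 of the 10 values land on the correct side of X = 2.75. Any X in the same gap (between 2.0 and 3.5 here) gives the same count, so the maximizing X is generally an interval rather than a single number; the midpoint is just one convenient choice.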

I don't think I'm looking for the median of either set, and definitely not the average. I suspect that what I might be looking for is the median of the intersection of the two sets, but I can't see why yet.

Please help!

Re: Intersection between histograms?

By the sound of it, this is a classifier training problem, but I think you either have not quite worked out what you want or have not presented it clearly.

CB

Re: Intersection between histograms?

Thanks CaptainBlack,

I think you're right. I'm not sure how to explain the problem clearly (which usually means I don't fully understand it). For now I've sidestepped the issue entirely and implemented my trainer in a different way.

If anyone happens to think they understand what I'm asking, please comment.