# Thread: Calculating Probability of Value with Historical Data.

1. ## Calculating Probability of Value with Historical Data.

Hello All,

Yes, I am a first-time poster, but I really need some help.

I'm a programmer and I designed a report for monitoring a set of alarms with boolean values (true/false). I know basic math, and I've stuck to "divide anomaly by total = probability". But I think this equation requires more variables than that. Like I said, I only know very basic math.

These alarms are recorded down to the millisecond of the boolean change, stored in a historical database, and stand until cleared back to the original position. Let's say 0 is false and 1 is true. 1 is good, so 0 is bad.

I'm not being simplistic to insult anybody, I just really want to make this clear for anyone reading it.

Alright so here is how the program proceeds.

I have 3 months' worth of data, and I want the majority of it to be in the 1 boolean position. BUT extracting and calculating the data millisecond by millisecond from the database is impossible (or at least very processor-intensive), not to mention that on a larger scale this would make monitoring 2000 other alarms inconceivable! (Princess Bride.)

So, I pulled back the algorithm to extract the data from the database on a minute-by-minute basis only. The algorithm looks at the closest minute and rounds the seconds/milliseconds to it (13:13:13.124 becomes 13:13:00).

Alright, now I've got a minute-by-minute record of the boolean values for 3 months' worth of data. I totalled the minutes for the 3-month period and scanned the data log for the start and end of each false run. Like this:

9:01 - true
9:02 - false (anomaly minutes + 1)
9:03 - false (anomaly minutes + 1)
9:04 - false (anomaly minutes + 1)
9:05 - false (anomaly minutes + 1)
9:06 - true (anomaly minutes + 1) - To show that the alarm technically had an extra minute of false status.

Alright, so I divide the anomaly minutes (AM) by the total minutes (TM) to get the probability (P): AM/TM = P. And here's the problem: every single time, the value is >= 96.4%.
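The counting rule above could be sketched in Python roughly as follows. The function name `anomaly_fraction` and the `(label, status)` pair layout are assumptions made for illustration, not the actual report code. The extra minute counted on the first true reading after a false run mirrors the log shown above.

```python
def anomaly_fraction(samples):
    """Return (anomaly_minutes, total_minutes, fraction) for per-minute samples.

    samples: list of (label, status) pairs, one per minute, status True/False.
    A minute counts as an anomaly when the status is False, and also when it
    is the first True reading right after a False one (the "extra" minute).
    """
    anomaly_minutes = 0
    previous = True  # assumption: the alarm was in the good state before the log
    for _label, status in samples:
        if not status or not previous:
            anomaly_minutes += 1
        previous = status
    total_minutes = len(samples)
    return anomaly_minutes, total_minutes, anomaly_minutes / total_minutes


log = [
    ("9:01", True),
    ("9:02", False),
    ("9:03", False),
    ("9:04", False),
    ("9:05", False),
    ("9:06", True),
]
am, tm, p = anomaly_fraction(log)
print(am, tm, round(p, 3))  # 5 6 0.833
```

On this six-line sample the rule gives 5 anomaly minutes out of 6.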

By pulling back to hours or half hours I sacrifice accuracy and meaningful data. But as it stands, my report spits out this 96.4 percent figure over and over, like a programmable case of Tourette's syndrome.

Am I approaching this problem from the wrong direction and completely missing some variable? Some step I skipped?

I appreciate any help you can give me and thanks for sticking through this painful act of crappy mathematical equations.

And finally, nice to meet you!

2. ## Re: Calculating Probability of Value with Historical Data.

In your example you have 5 anomaly minutes out of 6 minutes total, for a ratio of 83.3%. This does overstate the number of anomaly minutes because of the way you round the start and stop times to the nearest minute and then throw in an extra minute. The alarm might have actually been anywhere from 3 to 5 minutes long. For example, if the alarm came on at 9:02:29 you round down to 9:02; and if it goes off at 9:05:31 you round up to 9:06. So the actual time of the alarm may be only 3:02, yet you've rounded it all the way up to 5. The longest it could have possibly been is from 9:01:31 to 9:06:29, or 4:58 (just shy of 5 minutes). As long as you collect data on a lot of alarms the rounding will wash out, but you shouldn't be adding that extra minute. Without it, this alarm would be reported as 4 minutes rather than 5, and 4 is the average of the two extremes.
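To see where the 3 to 5 minute range comes from, here is a small Python sketch. It assumes both timestamps were rounded to the nearest minute, so each logged time can be off by up to 30 seconds either way. `duration_bounds` is a hypothetical helper name, and the date used is arbitrary.

```python
from datetime import datetime, timedelta

HALF_MINUTE = timedelta(seconds=30)


def duration_bounds(recorded_on, recorded_off):
    """Bounds on the real alarm length when both times were rounded to the
    nearest minute (each recorded time is within 30 s of the real one)."""
    recorded = recorded_off - recorded_on
    shortest = max(recorded - 2 * HALF_MINUTE, timedelta(0))
    longest = recorded + 2 * HALF_MINUTE
    return shortest, recorded, longest


on = datetime(2011, 8, 12, 9, 2)   # logged as 9:02 (date is arbitrary)
off = datetime(2011, 8, 12, 9, 6)  # logged as 9:06
shortest, recorded, longest = duration_bounds(on, off)
print(shortest, recorded, longest)  # 0:03:00 0:04:00 0:05:00
```

The logged difference of 4 minutes sits in the middle of the possible range, which is why dropping the extra minute gives an unbiased figure once many alarms are averaged.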

3. ## Re: Calculating Probability of Value with Historical Data.

Hmm... Interesting thoughts. Alright, so this is the only way I could think of implementing your suggestion programmatically. Let me know if you think I'm on the right track.

So I have 7 entries.

1: 8/12/11 9:00:29 True
2: 8/13/11 3:31:00 True
3: 8/14/11 14:58:49 False
4: 8/15/11 16:32:04 False
5: 8/16/11 2:29:15 False
6: 8/17/11 8:26:38 True
7: 8/18/11 17:08:56 True

Calculating on a second by second basis while not sacrificing processing would just be more multiplication than anything.

Between True to True Variables, I just multiply the minutes times 60 when it's x to y values and subtract the difference when it's x to y values.

For example, with True to False variables, I timestamp the false variable for the later equation, and subtract the true variable's time from the false variable's time to get the length of the True status.

Between the False to True variables, I timestamp the true variable and subtract the false variable timestamp to get the difference, which I add onto the False Variable Total.

So.....

Begin Timestamp = BTS
End Timestamp = ETS
True Timestamp = TTS
False Timestamp = FTS
True to True = T2T
True to False = T2F
False to True = F2T
False to False = F2F
False Total = FT
Total = TTL
Probability = PB

Getting Total: ETS-BTS = TTL
Getting False: ((T2T*60) + (T2F{FS-TS}) - (F2T{TS-FS}+(F2F*60))) = FT
Getting Probability: FT/TTL = PB

On the right path?

Thanks so much for the good tips!

4. ## Re: Calculating Probability of Value with Historical Data.

I think it's much simpler, so maybe I just don't understand what you're trying to do. The duration of the FALSE signal is F2T-T2F. So the percentage time during which you have FALSE is estimated by (F2T-T2F)/TTL. Also, I don't understand what your variables FS and FT are?

5. ## Re: Calculating Probability of Value with Historical Data.

If I subtract the T2F, I'm subtracting the length of time that the variable was in true status. If T2F = 9:02:03 (true) - 9:02:30 (false), that's 27 seconds of true status. That's why FS comes in: it marks the start of the false variable. And FT is just the false total.

6. ## Re: Calculating Probability of Value with Historical Data.

Ahh... I see what I did: I mixed up the False Time Stamp and the True Time Stamp...

Here's the revised equation.

Getting Total: ETS-BTS = TTL
Getting False: ((T2T*60) + (T2F{FTS-TTS}) - (F2T{TTS-FTS}+(F2F*60))) = FT
Getting Probability: FT/TTL = PB

7. ## Re: Calculating Probability of Value with Historical Data.

I still don't get what you're trying to do, sorry. Let me give you an example and then you can tell me what you think the answer ought to be.

at t=0, status = true
t= 1:05, status changes to false
t = 3:10, status changes to true
t = 5:00, end of data collection

I would think you would calculate (F2T-T2F)/TTL = (3:10-1:05)/5:00 = 125 seconds/300 seconds = 0.417. Is that not what you're looking for?

Also - your formula seems to have mixed up units, which is a hint that something's wrong. T2T*60 is in units of seconds, but T2F(FTS-TTS) is units of seconds squared.
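The calculation in this example could be sketched in Python like this. The event format (seconds since the start, new status) and the name `false_fraction` are assumptions for illustration. The idea is to add up each false interval as the time it turned true minus the time it turned false, then divide by the total time.

```python
def false_fraction(events, end_time):
    """Fraction of [first event, end_time] spent in the False state.

    events: list of (time_in_seconds, status) sorted by time; the first
    entry gives the starting status. Repeated readings of the same status
    are allowed and simply ignored.
    """
    start_time, status = events[0]
    false_since = start_time if not status else None
    false_total = 0
    for t, new_status in events[1:]:
        if status and not new_status:      # true -> false: remember when
            false_since = t
        elif not status and new_status:    # false -> true: close the interval
            false_total += t - false_since
            false_since = None
        status = new_status
    if false_since is not None:            # still false when the data ends
        false_total += end_time - false_since
    return false_total / (end_time - start_time)


events = [(0, True), (65, False), (190, True)]  # 1:05 = 65 s, 3:10 = 190 s
print(round(false_fraction(events, end_time=300), 3))  # 0.417
```

Every term here is in seconds, so the units stay consistent: seconds of false time divided by seconds of total time.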

8. ## Re: Calculating Probability of Value with Historical Data.

Hmm, you are right, it seems I'm overcomplicating it.

T: 3:00 - true
T: 3:01 - false - save false timestamp
....
T: 3:04 - true - save true timestamp

TTS - FTS = false time period/ total time

... Right? Thanks so much

9. ## Re: Calculating Probability of Value with Historical Data.

> Originally Posted by rich2600
> TTS - FTS = false time period/ total time
>
> ... Right? Thanks so much

Yes - that's how I would do it.
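For reference, here is a Python sketch of the same approach applied to the seven entries from post 3. It assumes the dates are month/day/year with a 24-hour clock, that the status holds between logged readings, and that the first and last entries mark the start and end of the observation window. The variable names are made up for this sketch.

```python
from datetime import datetime

FMT = "%m/%d/%y %H:%M:%S"  # assumed: month/day/year, 24-hour clock

entries = [
    ("8/12/11 9:00:29", True),
    ("8/13/11 3:31:00", True),
    ("8/14/11 14:58:49", False),
    ("8/15/11 16:32:04", False),
    ("8/16/11 2:29:15", False),
    ("8/17/11 8:26:38", True),
    ("8/18/11 17:08:56", True),
]
readings = [(datetime.strptime(ts, FMT), status) for ts, status in entries]

false_seconds = 0.0
false_since = None  # timestamp of the first False reading in the current run
for when, status in readings:
    if not status and false_since is None:
        false_since = when                      # save false timestamp (FTS)
    elif status and false_since is not None:
        false_seconds += (when - false_since).total_seconds()  # TTS - FTS
        false_since = None
if false_since is not None:                     # run still open at the end
    false_seconds += (readings[-1][0] - false_since).total_seconds()

total_seconds = (readings[-1][0] - readings[0][0]).total_seconds()
print(false_seconds, total_seconds, round(false_seconds / total_seconds, 3))
# 235669.0 547707.0 0.43
```

Only the first false reading and the next true reading matter for each run; the repeated false entries in between do not change the total.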