I know that this may seem like the old "average of an average" terrible idea but I really don't think it is. Stick with me!
I have a very large set of data in a database table and I am tracking how many records in each field are blank or null over time by month. So let's say I have data something like this:
I have pre-aggregated these millions and MILLIONS of records into a table that looks like this below. It's absolutely not feasible to compile the statistics at reporting time because of the size of the data. So the rates would be as such:
Month Field1 Field2 Field3
1 NULL A 100
1 NULL NULL 110
1 NULL B 120
2 NULL C 100
2 NULL D NULL
2 NULL E 120
2 NULL F 150
Hopefully that makes some sense.
Month 1 Month 2
Field1 - 100% 100%
Field2 - 33% 0%
Field3 - 0% 25%
So what I want to do is visually represent each field over time as a line graph. I'm simplifying a bit here. I actually have about 20 fields and will have 20 lines on this graph. Most of the rates are 0%, a few are 100%, some are in between. The point of the graph is not to monitor the actual percentages per se, but rather to monitor trends. I'm more interested in instances where Field1 is around 25% every month for 10 consecutive months and then all of a sudden it's 80%. I want to graphically see that as a big spike.
The problem is that it will be visually difficult to follow each line over time and look for spikes or dips individually. If I could "normalize" each of them so that they all kind of "fit" on a similar line, like starting out at the same point, and then going up or down from there, that would really help. For instance, you see this frequently if you compare a stock's performance to the Dow and NASDAQ composite. Check out this link.
Google Inc: NASDAQ:GOOG quotes & news - Google Finance
So can anyone tell me how to take a table of all of these percentages look "normalized" so that it's visually easy to spot abnormalities in trends for any given field?
Thanks for sticking with me!
Re: Comparing averages
Are you aware of the field of Time Series? Time series analyses look at varying kinds of normalization and including things like Moving Averages.
If you aren't that familiar I suggest you get a resource and look into the kinds of transformations and trend features present in the discipline (which is a part of statistics and probability)