# Thread: Comparing two sets of results

1. ## Comparing two sets of results

This ought to be an easy question but I am struggling. I have a table of data (let's say x and y) calculated to high precision. I have another table of data calculated differently using lower-precision maths (actually using an 8-bit processor). Significant rounding errors are present in the latter, as might be expected. x is always the independent variable.

To compare the two data sets I take the differences of each dependent variable and plot these in Excel against the independent variable. The errors (differences) appear in the 5th, 6th+ d.p. etc.

If the errors in the second calculation are truly random the mean of the differences should be 0. Sometimes the mean is close to zero, sometimes there is a definite +/- bias. Occasionally there is a trend in the differences of the form f(x) = mx + c yielding an intercept and a gradient.
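As a rough sketch of the quantities described above, the following computes the mean of the differences and a least-squares line d = mx + c through them. The function name and the sample data are made up for illustration; the RMS and maximum absolute difference are extra figures that the post does not mention but which are often quoted alongside a bias.

```python
import math

def summarise_differences(x, y_ref, y_test):
    """Summarise d = y_test - y_ref: mean, RMS, max |d| and the line d = m*x + c."""
    d = [t - r for t, r in zip(y_test, y_ref)]
    n = len(d)
    mean_d = sum(d) / n
    mean_x = sum(x) / n
    rms = math.sqrt(sum(v * v for v in d) / n)
    max_abs = max(abs(v) for v in d)
    # Ordinary least-squares fit of the differences against x.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (di - mean_d) for xi, di in zip(x, d))
    m = sxy / sxx
    c = mean_d - m * mean_x
    return {"mean": mean_d, "rms": rms, "max_abs": max_abs,
            "slope": m, "intercept": c}

# Made-up illustration data: a small constant offset plus a slight trend in x.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y_ref = [1.0, 2.0, 3.0, 4.0, 5.0]
y_test = [yr + 1e-5 + 2e-6 * xi for xi, yr in zip(x, y_ref)]
summary = summarise_differences(x, y_ref, y_test)
print(summary)
```

A slope and intercept close to zero would suggest no systematic trend; a clearly non-zero intercept points to a constant bias.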

What I need to know is how to represent the results. If the first calculation is the "standard" then the second (lower-precision) method needs to be quoted to some degree of accuracy compared to the first. At the moment I have been content with the mean of the differences (which should always be zero but seldom is) and the correlation coefficient of those differences against x (which should also be zero, showing no relation whatsoever, but seldom is).

This is not about reducing those differences but about how best to quote the accuracy with which one set of data matches a "standard". Any suggestions?

Thanks,
Ric

2. ## Re: Comparing two sets of results

Hey ricm.

One quick question: is this a complex process in which rounding happens many times at intermediate steps, or is the rounding done only once at the end of the process?

3. ## Re: Comparing two sets of results

The process is a series of calculations where rounding up and down occurs at every stage. Occasionally errors will seem to cancel, at other times reinforce each other. Two parts of the calculation are very sensitive to rounding errors and, given that both are at the start of the process, errors will propagate throughout subsequent calculations. The problem is one of performing calculations near the precision limits of the hardware and maths library.

This is the nature of the beast. What I am aiming for is to quote the 8-bit processed results to some degree of accuracy against a standard.

Hope this helps.
Ric

4. ## Re: Comparing two sets of results

So would it be safe to say that the error will have some kind of symmetric distribution (like a normal) around zero, and that you want to check whether you actually get this, as well as get a good idea of what a particular interval (corresponding to some probability) around zero would be?

The first thing would be to get a distribution of the residuals and from what you have posted, see how these compound to give the eventual differences.

Once you get a distribution with a big enough sample size, you can look at any interval you want, whether it's symmetric about zero or even whether it's the tightest interval for a particular probability.
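To make the two kinds of interval concrete, here is a minimal sketch that works directly on the sorted residuals. The function names and the sample values are hypothetical, and the quantile convention (ranks rounded outward) is only one of several reasonable choices.

```python
import math

def central_interval(residuals, prob=0.95):
    """Interval between the (1-prob)/2 and (1+prob)/2 empirical quantiles,
    with the ranks rounded outward so the interval errs on the wide side."""
    s = sorted(residuals)
    n = len(s)
    lo = s[math.floor((1 - prob) / 2 * (n - 1))]
    hi = s[math.ceil((1 + prob) / 2 * (n - 1))]
    return lo, hi

def shortest_interval(residuals, prob=0.95):
    """Narrowest window of the sorted residuals holding at least prob of the points."""
    s = sorted(residuals)
    n = len(s)
    k = max(1, math.ceil(prob * n))  # number of points the window must cover
    best = min(range(n - k + 1), key=lambda i: s[i + k - 1] - s[i])
    return s[best], s[best + k - 1]

# Made-up residuals (in arbitrary units) with one large positive outlier.
residuals = [-3, -1, 0, 0, 1, 1, 2, 10]
print(central_interval(residuals, 0.75))
print(shortest_interval(residuals, 0.75))
```

When the residuals are skewed, the shortest interval will not be centred on zero, which is one way to see the asymmetry numerically.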

What I would instead recommend you do if you can is to use a Monte-Carlo approach.

If you have the algorithm, then what you should do is simulate, say, 10,000 or more sets of initial values, drawing each input from some distribution, and then generate a distribution for the output. If you do this, you will get a much better idea of what is going on, and you can capture the results at each step of the way.
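A minimal sketch of that idea follows. Everything specific in it is an assumption made for illustration: the three-step `algorithm` is invented, rounding to five decimal places after every step merely stands in for the low-precision arithmetic, and the inputs are drawn uniformly from 0 to 10.

```python
import math
import random
import statistics

def quantise(v, places=5):
    """Stand-in for low-precision arithmetic: round to a fixed number of decimals."""
    return round(v, places)

def algorithm(x, q=lambda v: v):
    """Hypothetical three-step calculation; q is applied after every step."""
    a = q(math.sin(x))
    b = q(a * x)
    return q(b + math.sqrt(x))

def monte_carlo(n=10_000, seed=1):
    """Return the n differences (rounded result minus full-precision result)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n):
        x = rng.uniform(0.0, 10.0)  # assumed input distribution
        diffs.append(algorithm(x, quantise) - algorithm(x))
    return diffs

diffs = monte_carlo()
print("mean difference :", statistics.fmean(diffs))
print("std deviation   :", statistics.stdev(diffs))
print("max |difference|:", max(abs(d) for d in diffs))
```

Swapping in the real calculation for `algorithm` and the real rounding behaviour for `quantise` would give the distribution of differences for the actual process.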

5. ## Re: Comparing two sets of results

> So would it be safe to say that the error will have some kind of symmetric distribution (like a normal) around zero, and that you want to check whether you actually get this, as well as get a good idea of what a particular interval (corresponding to some probability) around zero would be?

For the most part, yes, this is what I would expect. However, there are occasions when the distribution is not symmetric. Perhaps two examples will illustrate what the data is doing. Attachment 1 (IMG_0001) shows the differences between two approaches and the "standard" (a calculation performed in Excel). There is little difference between the two sets of results except where the integer and fractional components are dealt with. (Where precision needs to be maintained in the nth decimal place for numbers that are very large, floating point arithmetic soon fails, and it is often better either to perform integer maths or to separate out the integer and fractional components and deal with each separately.) Attachment 1 shows clearly the difference between the two approaches, but BOTH show no significant bias and both are reasonably evenly distributed about the mean = 0.

Attachment 2 (IMG_0002) shows a different, later calculation. Once again two approaches are compared with the "standard" by plotting the differences. Here the "double" method shows the greatest departure from mean = 0, but the integer approach isn't that much better either. (My problem is that, with such poor statistical knowledge, "much better" equates to little more than a quick visual inspection.)

By a series of "trial and error" approaches I am able to squeeze the last drops of useful precision out of the chain of algorithms. Then I move on to the next stage of the calculation and repeat the exercise, and so on until I reach the end. The intermediate values have no merit other than highlighting the results of different approaches and, from that, choosing the best before moving on. The final values can be compared similarly with the values from the "standard". At that point I know there is nothing further to be gained by varying any of the algorithms within the constraints of this particular technology. The question is then: "How accurate are these final results compared to the 'standard'?" That standard could equally be an almanac (for these are astronomical calculations).

So, if I take the differences (is this what you mean by 'residuals'? If so, this is what I have been doing), choose an appropriate width for each band, count the frequency in each band and plot those counts, I will obtain a distribution. Do I then compare this with an expected normal distribution and quote the departure of the former from the latter?
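The banding step described above could look something like the sketch below, which counts the differences in each band and sets them beside the count a normal distribution with the same mean and standard deviation would give. The band width, the function names and the sample differences are all made up, and nothing here guarantees that a normal shape is the right yardstick for these errors.

```python
import math
import statistics

def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def band_counts(diffs, band_width):
    """Return (band_start, observed, expected_if_normal) for each band."""
    mu = statistics.fmean(diffs)
    sigma = statistics.stdev(diffs)
    n = len(diffs)
    first = math.floor(min(diffs) / band_width)
    last = math.floor(max(diffs) / band_width)
    rows = []
    for k in range(first, last + 1):
        lo, hi = k * band_width, (k + 1) * band_width
        # A difference d belongs to band k when lo <= d < hi.
        observed = sum(1 for d in diffs if math.floor(d / band_width) == k)
        expected = n * (normal_cdf((hi - mu) / sigma) - normal_cdf((lo - mu) / sigma))
        rows.append((lo, observed, expected))
    return rows

# Made-up differences, expressed in units of the 6th decimal place.
diffs = [-2.1, -0.7, -0.3, 0.1, 0.4, 0.6, 1.2, 2.5]
rows = band_counts(diffs, 1.0)
for lo, observed, expected in rows:
    print(f"{lo:6.1f}  observed={observed}  expected={expected:.2f}")
```

Plotting the observed and expected columns side by side gives a visual check; with real data a much larger sample than this is needed before the comparison means much.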

To the Monte Carlo approach. This is something I have not heard of before but I see that it is very clever. I am not sure it is appropriate in my case. I hope, having explained things in more detail, that it is clear I am not trying to determine whether or not a particular hypothesis is a good fit but that, after running a series of optimisations, I end up with a set of values that cannot be improved further. I know that at that stage the error (whatever that is) will be least and that my final value ought to be quotable to within a certain degree of accuracy to another. There is then no point in quoting to a precision greater than that accuracy. I feel intuitively this is where I am heading but my ignorance is letting me down.

Thanks for your input - much appreciated!

Ric

6. ## Re: Comparing two sets of results

You can look at the final output, but one incentive for using a Monte-Carlo approach is that you can look at each stage and assess how each operation introduces more variation and exactly the nature of that variation.

So you do a simulation, and let's say your algorithm has five main parts. What you do is store all the random vectors generated for the individual simulations at each stage of the process.

You might find the first two introduce a standard amount but the third part introduces a huge amount of variation, so then you focus on the third part.

So you look at the third part and assess the distribution. In doing this you see that particular rounding does particular things to some variables, and from that you can decide what constraints you want to place on that stage and how they affect the variation in the final result.

This is something to consider if you need to specify constraints for a complicated algorithm that is involved (many processes and operations and complex flow-control).

You can even write some simple code to flag where the variation becomes abnormal, using simple decision boundaries on the variance of the distribution.
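A small sketch of such a check is shown below, assuming the differences from the reference have been recorded after each stage. The stage names, the sample values and the rule itself (flag a stage whose variance is more than four times that of the previous stage) are illustrative choices, not a recommended threshold.

```python
import statistics

def flag_stages(stage_diffs, ratio_limit=4.0):
    """stage_diffs maps stage name -> differences recorded after that stage,
    in processing order. A stage is flagged when its variance exceeds
    ratio_limit times the variance of the stage before it."""
    flagged = []
    prev_var = None
    for name, diffs in stage_diffs.items():
        var = statistics.variance(diffs)
        if prev_var is not None and prev_var > 0 and var > ratio_limit * prev_var:
            flagged.append(name)
        prev_var = var
    return flagged

# Made-up per-stage differences: the third stage is deliberately much noisier.
stage_diffs = {
    "stage1": [1e-6, -1e-6, 2e-6, -2e-6],
    "stage2": [1.5e-6, -1e-6, 2e-6, -2.5e-6],
    "stage3": [2e-5, -3e-5, 1e-5, -2e-5],
}
flagged = flag_stages(stage_diffs)
print(flagged)
```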

This also lets you answer your original question from both a collective (whole-process) and a component-wise (stage-by-stage) point of view.