# Thread: Unsure if I have used the correct test

1. ## Unsure if I have used the correct test

I studied this a while ago and am a bit rusty with remembering exactly how to get the results I need. I basically have two sets of data and I want to see if there is any relationship to an increase in one if the other increases. Do I use a regression formula or a different test?

I have attached the data, graph and my conclusion. Please could someone have a look at it and either confirm I'm correct or shoot me down in flames and tell me I have got it totally wrong, then point me in the right direction. I used the data analysis in Excel rather than working the equations out myself.

Any help is greatly appreciated.

2. The regression approach is what I would have used, only I'd be careful when using Excel (it doesn't have a reputation for sound statistical calculations). You may also wish to mention in your conclusions that the accuracy of the model could be greatly improved by extra data; 10 data points is not a whole lot of data.
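One way to cross-check the Excel output is to compute the least-squares line by hand. The sketch below is only an illustration: it uses the ten data points quoted later in this thread, and it assumes the jackpot starting total is the predictor and the contribution is the response, which the original post does not state explicitly. The variable names are made up for the example.

```python
# (contribution, jackpot starting total) pairs as quoted later in the thread.
data = [
    (88385, 28383), (58365, 36949), (71006, 45247), (96025, 48976),
    (75226, 64880), (100266, 131099), (109887, 138474), (108119, 142141),
    (102729, 154863), (127109, 155973),
]

# Assumption: starting total is x (predictor), contribution is y (response).
xs = [starting_total for _, starting_total in data]
ys = [contribution for contribution, _ in data]
n = len(data)

mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

slope = sxy / sxx                      # least-squares slope
intercept = mean_y - slope * mean_x    # line passes through the means
r = sxy / (sxx * syy) ** 0.5           # Pearson correlation coefficient

print(f"slope     = {slope:.4f}")
print(f"intercept = {intercept:.1f}")
print(f"r         = {r:.4f}, r^2 = {r * r:.4f}")
```

If these figures agree with what Excel reports for slope, intercept and R squared, the spreadsheet calculation is probably fine for this data set.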

3. Originally Posted by MikeChch
I studied this a while ago and am a bit rusty with remembering exactly how to get the results I need. I basically have two sets of data and I want to see if there is any relationship to an increase in one if the other increases. Do I use a regression formula or a different test?

I have attached the data, graph and my conclusion. Please could someone have a look at it and either confirm I'm correct or shoot me down in flames and tell me I have got it totally wrong, then point me in the right direction. I used the data analysis in Excel rather than working the equations out myself.

Any help is greatly appreciated.
How you do this depends on what you know.

You start with the null hypothesis that there is no connection between the contribution and the starting total.

Now you have a data set $(ST_i, CT_i),\ i = 1, \dots, n$, and you need to construct a test to see if the null hypothesis can be rejected (I would use a non-parametric test myself, but what you use depends on what you know).

Example: The data below is sorted by starting total:

Code:
Contribution	Jackpot starting total

88385	28383
58365	36949
71006	45247
96025	48976
75226	64880
100266	131099
109887	138474
108119	142141
102729	154863
127109	155973
If there is no relation between the two columns, then the number $N$ of entries in the first five positions of the first column that are less than 100000 is a binomial RV, $N \sim B(5,0.5)$. We actually observe $5$, so we can ask what the probability of observing $5$ or more is: $P(N\ge 5)=\sum_{n\ge 5} b(n,5,0.5) = b(5,5,0.5)=0.5^5\approx 0.031$.

CB
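The tail probability in the post above can be reproduced with a few lines of Python. This is just a sketch of that one calculation, using the data as listed; the function name `binomial_tail` and the variable names are invented for the example, and the cut-off of 100000 and the success probability of 0.5 are taken straight from the post.

```python
from math import comb

def binomial_tail(k, n, p=0.5):
    """P(N >= k) for N ~ B(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

# (contribution, jackpot starting total) pairs from the table above.
data = [
    (88385, 28383), (58365, 36949), (71006, 45247), (96025, 48976),
    (75226, 64880), (100266, 131099), (109887, 138474), (108119, 142141),
    (102729, 154863), (127109, 155973),
]

# Sort by starting total and look at the five lowest starting totals.
rows = sorted(data, key=lambda row: row[1])
first_five = rows[:5]

# Count how many of those five contributions fall below 100000.
observed = sum(1 for contribution, _ in first_five if contribution < 100000)

p_value = binomial_tail(observed, 5)
print(f"observed = {observed}, P(N >= {observed}) = {p_value:.5f}")
```

With more than 200 data points the same idea carries over: change the number of rows examined and the cut-off (for example the median contribution) to suit the larger data set.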

4. Thank you both for your replies. Just to make sure...the conclusions I came to are correct? If so, I can run the same test with other data I have which should give me more accuracy because I'll have over 200 data points.

5. In my opinion the conclusions you came to were correct, although it's not clear to me how you came to those conclusions from reading what you wrote. 200 data points should give you a more accurate indication of the true nature of the trend, certainly more so than 10 points, but of course you may find that your conclusions change when you re-analyse the data.

6. By the way your data look more like there is a threshold effect rather than a linear correlation, which considering what we are talking about is probably consistent with the psycho-physics involved.

7. Originally Posted by CaptainBlack
By the way your data look more like there is a threshold effect rather than a linear correlation, which considering what we are talking about is probably consistent with the psycho-physics involved.

Could you explain please. Unsure what you mean by the threshold effect.

8. Originally Posted by MikeChch
Could you explain please. Unsure what you mean by the threshold effect.
The contribution is more or less constant for all jackpot seed totals except for a step change at the magic jackpot seed of 100000.

CB
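A quick way to look at the step described above is to split the ten data points at a starting total of 100000 and compare the average contribution on each side. This is only a rough sketch: the split point is the figure named in the post, the variable names are made up, and with five points per group it illustrates the idea rather than testing it formally.

```python
# (contribution, jackpot starting total) pairs from earlier in the thread.
data = [
    (88385, 28383), (58365, 36949), (71006, 45247), (96025, 48976),
    (75226, 64880), (100266, 131099), (109887, 138474), (108119, 142141),
    (102729, 154863), (127109, 155973),
]

THRESHOLD = 100000  # jackpot seed value named in the post above

# Contributions for starting totals below and at/above the threshold.
below = [c for c, starting_total in data if starting_total < THRESHOLD]
above = [c for c, starting_total in data if starting_total >= THRESHOLD]

mean_below = sum(below) / len(below)
mean_above = sum(above) / len(above)
step = mean_above - mean_below  # size of the jump between the two groups

print(f"mean contribution below threshold: {mean_below:.1f} (n={len(below)})")
print(f"mean contribution above threshold: {mean_above:.1f} (n={len(above)})")
print(f"step: {step:.1f}")
```

If a threshold effect is the right picture, the two group means differ clearly while the contributions within each group show little trend against the starting total.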