Math Help - t-test

  1. #1
    Newbie
    Joined
    Aug 2008
    Posts
    19

    t-test

    I am trying to calculate t-test results for a rather large dataset. The data have been log transformed because they are positively skewed. The two samples I am trying to compare are independent and of different sizes, so I am using Welch's approximation. I get the equation below:

    (xbar1-xbar2)/sqrt((s1/n1)+(s2/n2))

    (5.74-4.32)/sqrt((0.50/21936158)+(0.16/65008))

    where xbar is the sample mean, s is the sample variance (the square of the standard deviation), and n is the number of observations in the sample.

    The variance of the log-transformed data is obviously quite small, but the number of observations is large. I am therefore getting a very large negative number for the t-test result. This can't be right, can it? I think I must be misinterpreting one of the parameters, but I'm not sure which. When I try to calculate the degrees of freedom I get a ridiculous number too.
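
    As a rough check, the Welch statistic and the Welch-Satterthwaite degrees of freedom can be evaluated directly from the summary figures quoted above. This is only a sketch: the function name welch_t is illustrative, and it assumes 0.50 and 0.16 are variances on the log scale, as described. With the figures exactly as typed, the statistic comes out positive and in the hundreds, which is what a standard error that shrinks with sqrt(n) produces when n is in the millions.

```python
from math import sqrt


def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    var1 and var2 are sample variances (not standard deviations).
    """
    v1 = var1 / n1  # squared standard error of the first mean
    v2 = var2 / n2  # squared standard error of the second mean
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Figures quoted in the post (assumed to be on the log scale).
t, df = welch_t(5.74, 0.50, 21936158, 4.32, 0.16, 65008)
print(f"t = {t:.1f}, df = {df:.0f}")
```

    The degrees of freedom always fall between the smaller sample size minus one and n1 + n2 - 2, so a value in the tens of thousands is expected here rather than a sign of a mistake.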

    Thanks for any assistance

  2. #2
    Grand Panjandrum
    Joined
    Nov 2005
    From
    someplace
    Posts
    14,972
    Thanks
    4
    How large are the samples and how large the skew?

    CB

  3. #3
    Newbie
    Joined
    Aug 2008
    Posts
    19

    t-test

    I should add some background. I am looking at the distribution of a species throughout Europe. I have an area where the species is present, and an area where it is not. The two areas are independent as it is only presence / absence. Over each area I also have a precipitation distribution. I want to compare the precipitation distribution for the area where the species is present with that for the area where it is not.

    So I assume that in this case I have two populations. One is where the species is present, which is represented by the number of pixels, 65088. The second is where the species is not present, which is 15641137 pixels. I have a precipitation distribution for each. The first is positively skewed (1.23). I haven't measured the skewness of the second yet, but it is also positively skewed.

    I want to state whether the two precipitation distributions are significantly different. So I have log transformed them, and I was going to compare the means and the kurtosis. I thought that a t-test would also help me assess the significance of the difference between them. Is this fair?

    I have tried the t-test on the population, and I can't seem to get a sensible answer. Do I instead only need to take a representative sample? If so, how would I calculate the sample size I need?

  4. #4
    Newbie
    Joined
    Aug 2008
    Posts
    19
    One more point: if I am taking samples, is it better to make them equal in size, even though the populations and variances differ, or to make each sample proportional to its population?

    Thanks

  5. #5
    Master Of Puppets
    pickslides's Avatar
    Joined
    Sep 2008
    From
    Melbourne
    Posts
    5,234
    Thanks
    27
    Quote Originally Posted by newbie2008 View Post
    I have tried the t-test on the population, and I can't seem to get a sensible answer. Do I instead only need to take a representative sample? If so, how would I calculate the sample size I need?
    Yes, you run the t-test on a representative sample drawn from the population. The test has assumptions and conditions, so you should make sure your data do not violate them.

    If you are unsure of your distribution type you may want to employ a distribution-free test like the Wilcoxon signed-rank test.

    You wouldn't do a t-test on the entire population, because if you have the entire population you can calculate the means, standard deviations etc. directly.
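
    For two independent groups, the rank-based test usually used is the Mann-Whitney U (Wilcoxon rank-sum) test, since the signed-rank version is designed for paired data. Below is a minimal sketch using only the Python standard library, with a tie-corrected normal approximation and no continuity correction. The function name and the data are made up for illustration; they are not values from this thread.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (U for x, z score, approximate p-value). Ties get average
    ranks and the variance is tie-corrected. Fails if every value is tied.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool the observations, tagging each with its group (0 = x, 1 = y).
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y],
                    key=lambda pair: pair[0])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Sorted positions i..j-1 hold ranks i+1..j; use their average.
        avg_rank = (i + 1 + j) / 2.0
        size = j - i
        tie_term += size ** 3 - size
        rank_sum_x += avg_rank * sum(1 for k in range(i, j)
                                     if pooled[k][1] == 0)
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_x - mean_u) / sqrt(var_u)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return u_x, z, p


# Hypothetical log-precipitation values for two independent areas.
present = [4.1, 4.3, 4.0, 4.6, 4.2, 4.4]
absent = [5.6, 5.9, 5.2, 6.1, 5.7, 5.8, 5.5]
u, z, p = mann_whitney_u(present, absent)
print(u, z, p)
```

    The normal approximation is only reasonable for moderately large groups; for small samples an exact test from a statistics package would be the safer choice.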

    The sample size required depends on the confidence level you require.

    There are many online calculators that can determine this. Here's one:

    Sample Size Calculator - Confidence Level, Confidence Interval, Sample Size, Population Size, Relevant Population - Creative Research Systems
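
    If the aim is to estimate a mean rather than a proportion, the usual formula is n = (z * sd / E)^2, where E is the margin of error you will accept and z is the normal quantile for the chosen confidence level. A minimal sketch follows; the function name is illustrative, the standard deviation of 0.4 is the square root of the 0.16 variance quoted earlier in the thread, and the margin of 0.01 log units is purely an assumed example.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sd, margin, confidence=0.95):
    """Observations needed so the confidence interval for a mean has
    half-width `margin`, assuming a known sd and a normal approximation."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sd / margin) ** 2)


# sd = sqrt(0.16) from the thread; the 0.01 margin is a hypothetical choice.
n_99 = sample_size_for_mean(sd=0.4, margin=0.01, confidence=0.99)
n_95 = sample_size_for_mean(sd=0.4, margin=0.01, confidence=0.95)
print(n_99, n_95)
```

    A higher confidence level or a tighter margin both push the required sample size up; halving the margin roughly quadruples it.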

  6. #6
    Newbie
    Joined
    Aug 2008
    Posts
    19
    Ok I'm getting there, I just have a couple more questions...

    a) I've had a look at the Wilcoxon test you suggested. As the groups are independent, would it be more appropriate to use the Mann-Whitney-Wilcoxon test?

    b) I'm still having a little difficulty defining to myself what is the sample and what is the population! What I have is a distribution purporting to be the population of a species over Europe (65088 pixels). However it is obviously not totally accurate due to the resolution of the data, and measurement errors (The error has been estimated so I can state this). Does this mean that it is instead actually still a sample of an unknown actual population?

    c) In terms of sample sizes, if I use a sample of 10,000 pixels, I can say that I am 99% confident that the sample I use will be representative of the population (i.e. that the mean precipitation of my sample will be the mean precipitation of the actual population) at a confidence interval of 1.29. Is this a percentage (i.e. 1.29% above and below the mean precipitation)?
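
    On point c), the 1.29 figure can be reproduced with the standard margin of error for a proportion at the worst case p = 0.5, which would make it 1.29 percentage points on a yes/no percentage rather than 1.29% of mean precipitation. That is an assumption about how the linked calculator works, not something stated in the thread. A minimal sketch, with an illustrative function name and an optional finite population correction:

```python
from math import sqrt
from statistics import NormalDist


def proportion_margin(n, confidence=0.99, p=0.5, population=None):
    """Margin of error for a sample proportion (normal approximation).

    If `population` is given, the finite population correction is applied.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p * (1 - p) / n)
    if population is not None:
        margin *= sqrt((population - n) / (population - 1))
    return margin


# 10,000 pixels at 99% confidence, no population size supplied.
plain = proportion_margin(10000)
# Same sample drawn from the 65088-pixel "present" area.
corrected = proportion_margin(10000, population=65088)
print(f"{100 * plain:.2f} and {100 * corrected:.2f} percentage points")
```

    The corrected value is smaller because 10,000 pixels is a sizeable fraction of 65088. Neither number says anything directly about how close a sample mean of precipitation is to the population mean; that depends on the standard deviation of the precipitation values.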


    Just trying to get it clear in my head!


    Thank you for your help

  7. #7
    Newbie
    Joined
    Aug 2008
    Posts
    19

    Above comments

    Would anyone be able to consider the above comments?

    Thanks in advance