
Math Help - Chi Square test, or some other way?

  1. #1
    Junior Member
    Joined
    Nov 2006
    Posts
    42

    Chi Square test, or some other way?

    Hi, here's a problem my girlfriend is having with some genetics she's working on.

    As you can see in the attached spreadsheet, each sample tests positive or negative for whatever it is she's testing. She also measures the concentration of some other substance in that sample.

    She wants to know whether there is a relationship between a negative (or positive) result and a low (or high) concentration.

    She's talking about P-values and Chi square tests, but we are not sure how to apply them to this problem, or whether we should. What would you suggest?

    Thanks very much for having a look

  2. #2
    Senior Member
    Joined
    Apr 2006
    Posts
    399
    Awards
    1
    Quote Originally Posted by bruxism View Post
    Hi, here's a problem my girlfriend is having with some genetics she's working on.

    As you can see in the attached spreadsheet, each sample tests positive or negative for whatever it is she's testing. She also measures the concentration of some other substance in that sample.

    She wants to know whether there is a relationship between a negative (or positive) result and a low (or high) concentration.

    She's talking about P-values and Chi square tests, but we are not sure how to apply them to this problem, or whether we should. What would you suggest?

    Thanks very much for having a look
    Hi. A Chi square test could and should be done. However, the test requires at least 5 observations in each cell, so the concentrations have to be grouped into bands. That grouping may obscure the relationship.

    Here's how I grouped the data for the Chi square test to keep at least 5 observations in each cell:
    Code:
    RNA         Positive  Negative
                 #  Pct     #  Pct
    <3           7  .58     5  .42
    >3 <9        8  .27    22  .73
    >9           6  .43     8  .57
    Total       21 .375    35 .625
    In this grouping, the low and high RNA bands show a higher proportion of positive tests (.58 and .43) than the middle band (.27). Whether this is statistically significant should be tested with a Chi square test.

    To do this test you calculate an expected frequency for each cell using the total percentages. Example: for the cell "Positive <3" the expected is .375(7+5) = 4.5. Then the Chi square statistic is the sum over the cells of

    \[ \frac{(\text{actual frequency} - \text{expected frequency})^2}{\text{expected frequency}}. \]

    For cell "Positive <3" this is (7 - 4.5)^2 / 4.5 = 1.39. The sum over all six cells gives a Chi square statistic of 3.9. The degrees of freedom are 6 - 1 - 3 = (3 - 1)(2 - 1) = 2, because 3 parameters (2 for the rows and 1 for the columns) are estimated from the data. The P-value for 3.9 with 2 DF is .14, which is not significant at the conventional .05 level. (EDIT: I corrected the DF from 4 to 2 and the P-value from .42 to .14 per CaptainBlack's post below.)
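
    The calculation above can be sketched in a few lines of Python. This is only an illustration using the grouped counts from the table; the function names are made up for this example, and the P-value shortcut is a closed form that holds only for an even number of degrees of freedom (which covers the 2 DF case here).

```python
import math

# Observed (positive, negative) counts per RNA band, copied from the table above.
observed = {"<3": (7, 5), ">3 <9": (8, 22), ">9": (6, 8)}


def chi_square(table):
    """Return (statistic, degrees of freedom) for a rows-by-columns count table."""
    rows = list(table.values())
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(col_totals)
    stat = 0.0
    for row in rows:
        row_total = sum(row)
        for obs, col_total in zip(row, col_totals):
            # Expected count under independence, e.g. 12 * 21 / 56 = 4.5
            expected = row_total * col_total / grand_total
            stat += (obs - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, dof


def p_value_even_dof(stat, dof):
    """Upper-tail chi-square probability; this closed form needs an even dof."""
    if dof <= 0 or dof % 2:
        raise ValueError("this shortcut only handles positive even dof")
    half = stat / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(dof // 2))


stat, dof = chi_square(observed)
p = p_value_even_dof(stat, dof)
print(f"chi-square = {stat:.1f}, DF = {dof}, P = {p:.2f}")
# prints: chi-square = 3.9, DF = 2, P = 0.14
```

    For an odd number of degrees of freedom you would need a statistics library or a table instead of this shortcut.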

    But I think the relationship is stronger at the high end than the grouped data show. You can see this if you sort the data by RNA concentration. (I haven't shown this here; it is better in color in the spreadsheet.)

    To show this relationship, I suggest a logit or probit analysis. With these, you test whether there is a linear or U-shaped relationship between the probability of a positive test and RNA concentration. For either of these analyses, you don't have the grouping restrictions of the Chi square test. But you need software such as SAS or SPSS for this.
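
    To give a feel for what a logit analysis does, here is a rough sketch in plain Python. Everything in it is an assumption for illustration: the data are made-up placeholder values (not the spreadsheet data, which is not reproduced in this thread), fit_logit is a hypothetical helper rather than a function from any package, and only a linear term is fitted. Testing for a U shape would mean adding a squared concentration term.

```python
import math

# Placeholder data for illustration only: NOT the values from the spreadsheet.
concentration = [1, 2, 3, 4, 5, 6, 7, 8]
positive = [0, 0, 1, 0, 1, 0, 1, 1]


def fit_logit(xs, ys, iterations=25):
    """Fit P(positive) = 1 / (1 + exp(-(b0 + b1*x))) by Newton's method."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p            # gradient of the log-likelihood
            g1 += (y - p) * x
            h00 += w               # entries of the information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system by hand.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


b0, b1 = fit_logit(concentration, positive)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
```

    A positive slope means the fitted probability of a positive test rises with concentration. Dedicated statistics software also reports standard errors and P-values for the coefficients, which this sketch does not.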
    Last edited by JakeD; May 28th 2007 at 12:57 PM. Reason: Corrected degrees of freedom per CaptainBlack

  3. #3
    Grand Panjandrum
    Joined
    Nov 2005
    From
    someplace
    Posts
    14,972
    Thanks
    4
    Quote Originally Posted by JakeD View Post
    For cell "Positive <3" this is (7 - 4.5)^2 / 4.5 = 1.39. The sum over all the cells gives the Chi square statistic of 3.9. It has degrees of freedom 6 - 2 = 4. The P-value of 3.9 with 4 DF is .42, not at all significant.
    Now your calculation of the number of degrees of freedom for this left me
    uneasy, but I don't do cross tabular analysis every day so I look this up when
    I need it. It appears that DF=(rows-1)*(columns -1) = 2*1 = 2.

    A Chi-Square of 3.9 is still not significant with this number of degrees of
    freedom.

    RonL

  4. #4
    Junior Member
    Joined
    Nov 2006
    Posts
    42
    Thanks for the replies. That's made things a little clearer for us.

    A few questions though:

    1. For a chi square test on something like this, the groups seem to be chosen somewhat arbitrarily. Choosing <3, >3 <9, >9 seems OK, but you could have chosen anything, couldn't you? How do you decide? It seems you could skew the stats this way to make it look like something is occurring.

    2. You said "The P-value of 3.9 with 4 DF is .42, not at all significant". How do you calculate whether it is significant or not?

    3. The probit and logit analysis sounds like a good idea. Is there a way to perform these without buying additional software, or do you just have to bite the bullet and spend?

    Thanks for your time so far, it's been very helpful.

  5. #5
    Senior Member
    Joined
    Apr 2006
    Posts
    399
    Awards
    1
    Quote Originally Posted by CaptainBlack View Post
    Now your calculation of the number of degrees of freedom for this left me
    uneasy, but I don't do cross tabular analysis every day so I look this up when
    I need it. It appears that DF=(rows-1)*(columns -1) = 2*1 = 2.

    A Chi-Square of 3.9 is still not significant with this number of degrees of
    freedom.

    RonL
    I don't do these every day either and I should have looked it up too. I corrected the post. Thank you.

  6. #6
    Grand Panjandrum
    Joined
    Nov 2005
    From
    someplace
    Posts
    14,972
    Thanks
    4
    Quote Originally Posted by bruxism View Post
    2. You said "The P-value of 3.9 with 4 DF is .42, not at all significant". How do you calculate whether it is significant or not?
    Either a cumulative chi-squared distribution calculator, or a set of
    tables of critical values for the chi-squared distribution.

    Below is the help text and an example calculation from the system that
    I use most frequently, when I'm not using the book of tables next to
    my desk. Note that chidis returns the cumulative probability, so the
    P-value is 1 minus the value shown.

    Code:
    >
    >help chidis
    chidis is a builtin function.
     
    normaldis(x) : returns the probability that a normally
    distributed (mean 0, st.dev. 1) is less than x.
    invnormaldis(p) : is the inverse.
    chidis(x,n) : chi-distribution with n degrees of freedom.
    tdis(x,n) : Student's t-distribution with n degrees of freedom.
    invtdis(p,n) : the inverse.
    fdis(x,n,m) : f-distribution with n and m degrees of freedom.
    >
    >chidis(3.9,2)
         0.857726 
    >chidis(3.9,4)
         0.580291 
    >
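
    The same lookup can be sketched in Python without any special software. This is an illustration only: chi2_cdf_even_dof is a name made up for this example, and it relies on a closed form that holds only for an even number of degrees of freedom, which happens to cover both the 2 DF and 4 DF cases discussed in this thread.

```python
import math


def chi2_cdf_even_dof(x, dof):
    """Cumulative chi-square probability; this closed form needs an even dof."""
    if dof <= 0 or dof % 2:
        raise ValueError("this shortcut only handles positive even dof")
    half = x / 2
    upper_tail = math.exp(-half) * sum(
        half ** k / math.factorial(k) for k in range(dof // 2)
    )
    return 1.0 - upper_tail


for dof in (2, 4):
    cdf = chi2_cdf_even_dof(3.9, dof)
    # The P-value is the upper tail, i.e. 1 minus the cumulative probability.
    print(dof, round(cdf, 6), round(1.0 - cdf, 2))
# prints: 2 0.857726 0.14
# prints: 4 0.580291 0.42

# Critical value for 2 DF at the .05 level, as a table would list it.
critical_2dof = -2.0 * math.log(0.05)
print(round(critical_2dof, 2))
# prints: 5.99
```

    Since 3.9 is below the 2 DF critical value of 5.99, the table lookup and the P-value of .14 agree: not significant at the .05 level.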

  7. #7
    Senior Member
    Joined
    Apr 2006
    Posts
    399
    Awards
    1
    Quote Originally Posted by bruxism View Post
    Thanks for the replies. That's made things a little clearer for us.

    A few questions though:

    1. For a chi square test on something like this, the groups seem to be chosen somewhat arbitrarily. Choosing <3, >3 <9, >9 seems OK, but you could have chosen anything, couldn't you? How do you decide? It seems you could skew the stats this way to make it look like something is occurring.
    I chose integer cutoffs at the high and low end to keep at least 5 observations in each cell. Looking back at the data, I see I didn't look at it too closely. I could have chosen a cutoff of 12 instead of 9 to get exactly 5 observations in both high categories. Doing that would have increased the significance, so it is true you may be able to cook the results somewhat by carefully selecting the categories. However, the Chi square test requires putting the data into categories; you have to do it somehow. Just don't be too cute when selecting the categories.

    2. You said "The P-value of 3.9 with 4 DF is .42, not at all significant". How do you calculate whether it is significant or not?
    Statistical convention says the P-value should be .05 or less to be statistically significant. I calculated the P-value in the spreadsheet using the function CHIDIST(3.9;2), which returns the upper-tail probability directly. (Note the DF is actually 2 per CaptainBlack and the P-value is .14.)

    3. The probit and logit analysis sounds like a good idea. Is there a way to perform these without buying additional software, or do you just have to bite the bullet and spend?
    I googled logit analysis free software. The first hit was software called EasyReg. I've never used it, but the price is right!

    Thanks for your time so far, it's been very helpful.
    My pleasure.
    Last edited by JakeD; May 28th 2007 at 02:31 AM.
