Math Help - Filtering sample data -- catastrophic failure?

  1. #1
    Newbie
    Joined
    Dec 2012
    From
    United States
    Posts
    2

    Filtering sample data -- catastrophic failure?

    Is it legitimate to filter results AFTER a test, but BEFORE calculating statistics? Here are two examples, each using a sample size of 100:




    1. MEDICAL TEST


    Testing variations (several dozen) of a medical treatment, with the subjects in each test evenly split between the sexes.


    Without treatment, all subjects die. When treated, all males die. However, in females, variations of the treatment result in a 25% to 90% cure rate.




    QUESTION: When comparing treatment variations, should the data for males be removed before calculating basic stats? All tests are done on 100 subjects, equally divided into 50 males and 50 females, but in each test all males die, so the median would be calculated on females only.
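    To make the question concrete, here is a minimal sketch with made-up numbers. The variation names and cure counts are hypothetical; only the 50/50 split and the "all males die" rule come from the description above. It compares cure rates and medians with and without the males. Under these assumptions the overall cure rate is exactly half the female-only rate, so the ranking of variations does not change, while the median of a cured/not-cured outcome carries very little information either way.

```python
from statistics import median

def outcomes(n_male, n_female, female_cured):
    """Build (sex, cured) records; every male dies, as in the description."""
    males = [("M", 0)] * n_male
    females = [("F", 1)] * female_cured + [("F", 0)] * (n_female - female_cured)
    return males + females

def cure_rate(records):
    """Fraction of records with cured == 1."""
    return sum(cured for _, cured in records) / len(records)

# Hypothetical variations: number of females cured out of 50.
variations = {"A": 13, "B": 30, "C": 45}

summary = {}
for name, cured in variations.items():
    recs = outcomes(50, 50, cured)
    females = [r for r in recs if r[0] == "F"]
    summary[name] = {
        "rate_all": cure_rate(recs),
        "rate_females": cure_rate(females),
        "median_all": median([c for _, c in recs]),
        "median_females": median([c for _, c in females]),
    }

# Ranking of variations by cure rate, with and without the males.
rank_all = sorted(summary, key=lambda k: summary[k]["rate_all"])
rank_females = sorted(summary, key=lambda k: summary[k]["rate_females"])

for name, row in summary.items():
    print(name, row)
print("same ranking:", rank_all == rank_females)
```

    With all 100 subjects the median is 0 for every variation, because fewer than half of the subjects can ever be cured; with females only it jumps from 0 to 1 once the cure rate passes 50%. The cure rate itself seems to be the more useful statistic to compare.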






    2. COMMERCIAL TEST


    Testing variations of a "spray on" treatment for car tires, which is expected to improve tread wear (that is, to make the tread last longer); after application, each car must travel 5,000 miles. Tread will be measured before and after the trip.


    Upon application to all four tires, between 10% and 80% of the tires melt off the rim (on any given car, either all four melt or none do); the affected cars do not complete (or even start) the road test. Vehicles whose tires survive treatment do complete the road test; their results vary from -20% to +150% change in tread wear versus untreated tires (each test uses the same cars, with four tires of the same brand on each).




    QUESTION: Is it valid to remove from each test's data set those tires which melted, before comparing successful variations? This would mean that the median would be calculated on varying numbers of non-melted tires; i.e., one data set might have 30 of 100 non-melted, another may have 60 of 100, yet another 90 of 100, etc.
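    As a rough sketch of what "removing the melted tires" would look like in practice, the snippet below uses invented numbers (the variation names, counts, and percent changes are all hypothetical). It computes the median on survivors only, but keeps the melt rate and the number of survivors next to it, so the varying denominators stay visible when variations are compared.

```python
from statistics import median

# Hypothetical results per variation: cars tested, cars whose tires melted,
# and percent change in tread wear for each car that finished the road test.
results = {
    "V1": {"tested": 10, "melted": 7,
           "change": [120.0, 150.0, 90.0]},
    "V2": {"tested": 10, "melted": 1,
           "change": [10.0, -20.0, 30.0, 25.0, 15.0, 40.0, 5.0, 20.0, 35.0]},
}

def summarize(r):
    """Report the survivor-only median together with the failure information."""
    survivors = r["tested"] - r["melted"]
    # Sanity check: one measurement per car that completed the road test.
    assert survivors == len(r["change"])
    return {
        "melt_rate": r["melted"] / r["tested"],
        "n_survivors": survivors,
        "median_change": median(r["change"]),
    }

report = {name: summarize(r) for name, r in results.items()}
for name, row in report.items():
    print(name, row)
```

    In this made-up data V1 has the higher median among survivors but rests on only 3 cars and loses 70% of them, which is exactly the kind of situation the question is about.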






    NOTE: In both of the above tests, the sample size is 100, but for actual tests, this number could vary from 15 to 1500.


    The main question is whether it is valid to remove data before comparing statistical measures such as the median, variance, etc., in the event of an unanticipated catastrophic failure.


    Please feel free to elaborate; any & all replies will be thought upon. Thank you for your time,




    --Johnathan

  2. #2
    MHF Contributor
    Joined
    Sep 2012
    From
    Australia
    Posts
    4,163
    Thanks
    761

    Re: Filtering sample data -- catastrophic failure?

    Hey JohnathanStein.

    The key question you should ask is: "Will filtering your data in some particular way introduce bias into your results in a way that confounds the analysis and gives you the wrong inference for your question?"

    The answer will involve the question you are trying to answer, the nature of the data and the process, how data is collected and sampled, and the expert knowledge within your domain, amongst other things.
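    As a rough illustration of how filtering can introduce bias, here is a small sketch with invented numbers; it is not taken from either of your examples. The list `latent` is a hypothetical set of outcomes each unit would have shown had it survived. If failures are unrelated to the outcome, the survivors' median stays close to the full median; if the weakest units are the ones that fail, the survivors' median is pushed upward.

```python
from statistics import median

# Hypothetical outcomes each of 10 units would show if it survived.
latent = list(range(1, 11))  # 1, 2, ..., 10

# Failure unrelated to the outcome: every second unit is lost.
survivors_unrelated = [x for i, x in enumerate(latent) if i % 2 == 0]

# Failure related to the outcome: the five weakest units are lost.
survivors_related = [x for x in latent if x > 5]

full_median = median(latent)                      # 5.5
unrelated_median = median(survivors_unrelated)    # 5
related_median = median(survivors_related)        # 8

print(full_median, unrelated_median, related_median)
```

    Both filtered sets drop half the data, but only the second one distorts the comparison, which is why the mechanism behind the failures matters more than the act of removing them.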

  3. #3
    Newbie
    Joined
    Dec 2012
    From
    United States
    Posts
    2

    Re: Filtering sample data -- catastrophic failure?

    Quote Originally Posted by chiro View Post
    The key question you should ask is: "Will filtering your data in some particular way introduce bias into your results in a way that confounds the analysis and gives you the wrong inference for your question?"
    Could you possibly give an answer in the cases of the two examples listed? Details and/or pitfalls would be appreciated. For example, in the first example (MEDICAL TEST), would it be accurate to compare medians of cure rates by drug formula for females only? If not, why not?


    Quote Originally Posted by chiro View Post
    The answer will involve the question you are trying to answer, the nature of the data and the process, how data is collected and sampled, and the expert knowledge within your domain amongst other things.
    Do you have any references, web or books, that might have practical examples? "Methods of Comparisons" or some such would probably suffice.

    Thanks,


    --Johnathan

  4. #4
    MHF Contributor
    Joined
    Sep 2012
    From
    Australia
    Posts
    4,163
    Thanks
    761

    Re: Filtering sample data -- catastrophic failure?

    I'll take a look at the first question later on.

    But for the second one, this is why statisticians are professional advisors: it just takes a lot of experience to know what to look for and what to ask when speaking with a client.

    Unfortunately, it is hard to quantify this clearly in a book, or even to qualify some of the concepts; if I knew of a book that went through these things, I wouldn't hesitate to recommend it to you.

    The mathematics is only a small part of what a statistician does: the real value of a statistician comes in the advice they give, and the better the advice, the better the statistician.