Math Help - Adjusting Probabilities To Match Empirical CDF

  1. #1
    Newbie
    Joined
    Oct 2010
    Posts
    1

    Adjusting Probabilities To Match Empirical CDF

    I have two data sets and can compute the empirical CDF (ecdf) of each in MATLAB. The two ecdfs may or may not be similar to each other. My end goal is to force one data set to match the ecdf of the other. Neither ecdf necessarily fits a known distribution (exponential, Weibull, etc.) very well, so I cannot rely on inverse CDFs either. Some previous work has done that, but I don't quite believe their assumption that the ecdf matched the chosen distribution well enough. I also cannot reduce the number of instances of a given data value in the set I am modifying, only increase it. This means I cannot simply make the two sets the same size and change the values in set 1 to match those in set 2. As a result, my modified set 1 will generally be larger than set 2, but the individual probabilities should still line up.

    I'm not sure whether an existing method already does this for me, but it sounds very similar to tuning a guitar with a floating tremolo. Tightening or loosening one string to bring its note into tune changes the tension on all the other strings. In the end, they all need to be balanced...

    Suppose we ignore the two ecdfs and just try to make the probability of each value match between the two data sets, modifying data set 1 to match data set 2. Say f1(1) = 10% and f2(1) = 15%. Then we need to add more 1 values to data set 1 until it reaches 15%. Unfortunately, adding points increases the size of data set 1, so all the other probabilities change as well. Next, say f1(2) = 8% and f2(2) = 11%, so we increase the number of 2's in data set 1. However, this again affects all other values, and the tuned value for the 1's has just been "untuned".
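
    A minimal sketch of that single tuning step, written in Python because the thread has no code of its own. The set size of 1000 and the helper name copies_needed are made up for illustration. It solves (count + k) / (total + k) = target for k and then shows what happens to the share of 2's.

```python
from fractions import Fraction


def copies_needed(count, total, target):
    """Copies of one value to add so that its share of the set becomes `target`.

    Solves (count + k) / (total + k) = target for k. The result is exact and
    may be fractional; a real data set would need it rounded to an integer.
    """
    target = Fraction(target)
    return (target * total - count) / (1 - target)


# Hypothetical data set 1: 1000 points, 100 of them 1's (10%), 80 of them 2's (8%).
total, ones, twos = 1000, 100, 80

k = copies_needed(ones, total, Fraction(15, 100))  # raise the 1's to 15%
new_total = total + k

share_of_ones = (ones + k) / new_total  # now exactly 15%
share_of_twos = twos / new_total        # no longer 8%: the denominator grew

print(float(k), float(share_of_ones), float(share_of_twos))
```

    With these numbers k = 1000/17 (about 59 extra 1's), and the 2's drop from 8% to 17/225 (about 7.6%), so they are now further from the 11% target than before.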

    In the end, I need to do this for tens to hundreds of possible data values, with every one of them ending up at the proper probability. If I could get the probability of each value to match between the two data sets, the CDFs would naturally match as well.
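
    One possible way to avoid the back-and-forth tuning is to choose the final counts for all values at once. The sketch below is an assumption about how that could be done, not an existing MATLAB routine: it takes set 2's counts reduced by their greatest common divisor, finds the smallest integer multiple of them that is at least as large as set 1's count for every value, and adds the difference. The function name additions_to_match is hypothetical, and the approach assumes the data are discrete and that every value in set 1 also occurs in set 2.

```python
from collections import Counter
from fractions import Fraction
from functools import reduce
from math import ceil, gcd


def additions_to_match(set1, set2):
    """Return {value: copies to add to set1} so that set1's relative
    frequencies equal set2's exactly, using additions only."""
    n1, n2 = Counter(set1), Counter(set2)
    if not n2:
        raise ValueError("set2 is empty")
    if any(v not in n2 for v in n1):
        # A value with target probability 0 can never be matched by adding.
        raise ValueError("set1 contains a value that set2 lacks")

    # Smallest integer counts that have exactly set2's probabilities.
    g = reduce(gcd, n2.values())
    base = {v: c // g for v, c in n2.items()}

    # Smallest multiple m with m * base[v] >= n1[v] for every value v.
    m = max(1, max(ceil(Fraction(n1[v], base[v])) for v in base))
    return {v: m * base[v] - n1[v] for v in base}


# Set 1 has counts {1: 2, 2: 1, 3: 1}; set 2 has counts {1: 3, 2: 2, 3: 1}.
print(additions_to_match([1, 1, 2, 3], [1, 1, 1, 2, 2, 3]))
```

    The value that is most over-represented in set 1 relative to set 2 fixes the multiple m and receives no extra copies. Every other value is topped up to its target count. If an approximate match is acceptable, a non-integer scale factor with rounding would give a smaller modified set.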

    Is there already an algorithm or method, in MATLAB or elsewhere, that would make my life easier in solving this? Otherwise I'm about to start doing some major coding. Bleh.
    Last edited by superman859; October 28th 2010 at 09:13 AM.

  2. #2
    Grand Panjandrum
    Joined
    Nov 2005
    From
    someplace
    Posts
    14,972
    Thanks
    4
    I would consider Kernel Density Estimation and then see what I could do with the resulting density.
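
    For readers unfamiliar with the term, here is a minimal sketch of a kernel density estimate in plain Python. The Gaussian kernel, the rule-of-thumb bandwidth 1.06 * sigma * n^(-1/5), and the sample values are all assumptions made for illustration; none of them come from the thread. In MATLAB, the Statistics Toolbox function ksdensity serves the same purpose.

```python
import math
import statistics


def gaussian_kde(data, bandwidth=None):
    """Return a function x -> estimated density, using Gaussian kernels.

    If no bandwidth is given, use the normal-reference rule of thumb
    h = 1.06 * sample_std * n ** (-1/5).
    """
    data = list(data)
    n = len(data)
    if bandwidth is None:
        bandwidth = 1.06 * statistics.stdev(data) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Average of Gaussian bumps centred on each data point.
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density


# Hypothetical sample standing in for one of the two data sets.
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5]
density = gaussian_kde(sample)
print(density(3.0))
```

    The result is a smooth density that does not assume an exponential, Weibull, or other parametric form, which is what makes it a candidate for comparing or resampling the two data sets.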

    CB