Results 1 to 9 of 9

Math Help - Probability with randomly generated numbers

  1. #1
    Newbie
    Joined
    Jun 2010
    Posts
    2

    Probability with randomly generated numbers

    This question has me completely stumped. I've run a little program I wrote to find out that the answer averages to 520, but I have absolutely no idea how to figure it out.

    A hat has 100 pieces of paper numbered 1 to 100. If you randomly pick pieces of paper out of the hat, write down the number, then return the paper to the hat, how many picks (assuming the world has perfect probability) would it take until you have written down every number from 1 to 100?

    I'm being driven nuts.
    Follow Math Help Forum on Facebook and Google+

  2. #2
    MHF Contributor
    Joined
    Aug 2007
    From
    USA
    Posts
    3,111
    Thanks
    2
    It is not a good question. Theoretically, you could draw the same value each time and you certainly will not ever get them ALL. You'll have to rephrase.
    Follow Math Help Forum on Facebook and Google+

  3. #3
    MHF Contributor undefined's Avatar
    Joined
    Mar 2010
    From
    Chicago
    Posts
    2,340
    Awards
    1
    Quote Originally Posted by TKHunny View Post
    It is not a good question. Theoretically, you could draw the same value each time and you certainly will not ever get them ALL. You'll have to rephrase.
    I think the OP is asking what the expected number of draws is before reaching the stated goal, which is a perfectly reasonable question.

    I was initially perplexed as to how to deal with the 2^{100} states in this problem, until I realised that the problem can be viewed as only having 101 states.

    So, there are two approaches to this problem that I know of. One is to use Markov chains, and an exact answer can be obtained using matrix inversion. I have limited experience with this. If you are interested, there is a PDF available from Dartmouth that I've read part of and found good.

    The other approach I have plenty of experience with, and is as follows: We are after an expected value. Let X be the random variable: the number of draws before all numbers are written. Then

    E(X) = \sum_{i=1}^\infty i\cdot P(X=i) = 1\cdot P(X=1) + 2\cdot P(X=2) + \cdots

    This is an infinite sum, but eventually the terms will get small, allowing us to get several decimal places of precision if we use floating point arithmetic.

    The procedure is to use an array of size 101 indicating the probability that n numbers have been written, n in {0,1,...,100} for some number of steps, starting with 0 steps and then going to maybe 5000 to get several decimal places of precision. I started typing out an explanation but it's probably easier for me just to type out a program and show you.

    Give me a couple hours and I'll post again.
    Follow Math Help Forum on Facebook and Google+

  4. #4
    MHF Contributor undefined's Avatar
    Joined
    Mar 2010
    From
    Chicago
    Posts
    2,340
    Awards
    1
    Here's the program:

    Code:
    public class NumberHat100ExpVal {
        
        public static final int L=4000;
        
        public static void main(String[] args) {
            long time=System.currentTimeMillis();
            solve(100);
            System.out.println("Elapsed: "+(System.currentTimeMillis()-time)/1000.0);
        }
        
        public static void solve(int n) {
            int i, j;
            double a[]=new double[n+1], b[], c[]=new double[n+1], d[]=new double[n+1], e=0;
            a[0]=1;
            for(i=0; i<=n; i++) {
                c[i]=i/100.0;
                d[i]=1-c[i];
            }
            for(i=0; i<L; i++) {
                b=new double[n+1];
                for(j=0; j<n; j++)
                    if(a[j]!=0) {
                        b[j]+=c[j]*a[j];
                        b[j+1]+=d[j]*a[j];
                    }
                a=b;
                if(i>=n) e+=i*a[n];
            }
            System.out.println(e);
        }
    }
    And here is the output:

    Code:
    517.7377517639593
    Elapsed: 0.038
    This matches your estimate, so I'm fairly confident that it's bug-free. I can explain how it works, but I figure there's a nonzero probability you will be able to read and understand my code, so I'll just wait for you to ask.
    Follow Math Help Forum on Facebook and Google+

  5. #5
    Newbie
    Joined
    Jun 2010
    Posts
    2
    I'm not sure what language your code is in, but I did a very similar thing with a couple lines of Visual Basic - that's where I got that "magic number" 520 from. I'll look into that equation you spoke of.
    Follow Math Help Forum on Facebook and Google+

  6. #6
    MHF Contributor undefined's Avatar
    Joined
    Mar 2010
    From
    Chicago
    Posts
    2,340
    Awards
    1
    Quote Originally Posted by nickf77 View Post
    I'm not sure what language your code is in, but I did a very similar thing with a couple lines of Visual Basic - that's where I got that "magic number" 520 from. I'll look into that equation you spoke of.
    It's Java, sorry I forgot to mention that.

    Are you sure you did something similar? Because what I did is significantly different from Monte Carlo. I do not use a random number generator in my code. I forgot to put in units of time, but the output "Elapsed: 0.038" means that the code completes in 38 milliseconds.

    Again, let me know if you'd like explanation.

    EDIT: I think I uncovered a bug. It seems lines 28 and 29 were switched from what they should be. Here is the new code.

    Code:
    public class NumberHat100ExpVal {
        
        public static final int L=4000;
        
        public static void main(String[] args) {
            long time=System.currentTimeMillis();
            solve(100);
            System.out.println("Elapsed: "+(System.currentTimeMillis()-time)/1000.0+" seconds");
        }
        
        public static void solve(int n) {
            int i, j;
            double a[]=new double[n+1], b[], c[]=new double[n+1], d[]=new double[n+1], e=0;
            a[0]=1;
            for(i=0; i<=n; i++) {
                c[i]=i/100.0;
                d[i]=1-c[i];
            }
            for(i=0; i<L; i++) {
                b=new double[n+1];
                for(j=0; j<n; j++)
                    if(a[j]!=0) {
                        b[j]+=c[j]*a[j];
                        b[j+1]+=d[j]*a[j];
                    }
                if(i>=n) e+=i*a[n];
                a=b;
            }
            System.out.println(e);
        }
    }
    Output:
    Code:
    518.7377517639586
    Elapsed: 0.036 seconds
    I guess I'll code a Monte Carlo to try to ensure no other bugs.
    Last edited by undefined; June 9th 2010 at 03:03 PM.
    Follow Math Help Forum on Facebook and Google+

  7. #7
    Super Member
    Joined
    Mar 2008
    Posts
    934
    Thanks
    33
    Awards
    1
    This is the "Coupon Collector's Problem" with n=100. See

    Coupon collector's problem - Wikipedia, the free encyclopedia

    Using their approximate formula for the expected number of trials T required,

    E(T) \approx n \ln n + \gamma n + 1/2
    where \gamma \approx 0.5772 (the "Euler gamma constant")
    we find
    E(T) \approx 518.7
    Follow Math Help Forum on Facebook and Google+

  8. #8
    MHF Contributor undefined's Avatar
    Joined
    Mar 2010
    From
    Chicago
    Posts
    2,340
    Awards
    1
    Quote Originally Posted by awkward View Post
    This is the "Coupon Collector's Problem" with n=100. See

    Coupon collector's problem - Wikipedia, the free encyclopedia

    Using their approximate formula for the expected number of trials T required,

    E(T) \approx n \ln n + \gamma n + 1/2
    where \gamma \approx 0.5772 (the "Euler gamma constant")
    we find
    E(T) \approx 518.7
    Thanks, that gives me confidence that there are no other bugs in my most recent code and saves me from coding a Monte Carlo.
    Follow Math Help Forum on Facebook and Google+

  9. #9
    MHF Contributor
    Joined
    May 2010
    Posts
    1,030
    Thanks
    28
    Do the problem one step at a time:

    Find the expected tries to get your first ball
    Then the number of expected extratries to get the second ball
    Then the number of expected extratries to get the third ball

    You can do this for the system because it has the markov property.


    A complete proceedure for finding the expected number of tries is as follows:

    Step 1:
    Find f(k), the expected number of tries between matching the (k-1)th ball and the kth ball.

    Step 2:
    The expected number of steps in total is then

    f(1)+f(2)+f(3)+f(4)+f(5)...+f(100)


    I'll get you started on finding f(k)
    f(1) = 1 (obviously)

    for k>1, we will have to do an infinite sum to find our expectation. You should be able to see that the chance of drawing a new ball is:
    P(new ball) = (100-(k-1))/100

    Define this as a

    P(1 step) = a
    P(2 steps) = (1-a)(a)
    P(3 steps) = (1-a)(1-a)(a)
    P(d steps) = (1-a)^{d-1}a

    Using the definition of expectation, the expected number of steps is:
    f(k)=E(steps) = 1a + 2(1-a)a + 3(1-a)^2a + 4....
    f(k)=  a\left(1+ 2(1-a) +3(1-a)^2 + 4.... \right)

    This is a hypergeometric progression whch you can sum using the standard methods:

    f(k)=  a\left(1+ 2(1-a) +3(1-a)^2 + 4.... \right)

    (1-a)f(k) = a \left((1-a) + 2(1-a)^2 +3(1-a)^3 .... \right)

    (take difference between the above 2 equations)

    f(k)(1-(1-a)) = a \left(1 + (1-a) + (1-a)^2 + (1-a)^3...... \right)
    a f(k) = \frac{a}{1-(1-a)}
    f(k)=\frac{1}{a} (this should not be surprising!!)



    f(k) = \frac{100}{(100-(k-1))}

    edit:

    Now, its up to you to find the value of f(1) + f(2) +f(3)+...f(100). I dont remember if there is a series expansion for that, but there are only 100 terms so you can always put it in a spreadsheet if you get stuck.

    Spoiler:

    Exact answer 518.7377518


    editArg it was on wikipedia all along :<
    Last edited by SpringFan25; June 9th 2010 at 03:40 PM.
    Follow Math Help Forum on Facebook and Google+

Similar Math Help Forum Discussions

  1. plotting a randomly generated weibull distribution values
    Posted in the Advanced Statistics Forum
    Replies: 1
    Last Post: April 1st 2011, 04:34 AM
  2. Replies: 3
    Last Post: December 1st 2010, 04:38 AM
  3. Replies: 7
    Last Post: July 25th 2010, 08:26 PM
  4. Probability of 6 randomly selected
    Posted in the Statistics Forum
    Replies: 4
    Last Post: July 19th 2010, 12:57 PM
  5. probability that a randomly chosen member
    Posted in the Statistics Forum
    Replies: 1
    Last Post: November 2nd 2009, 03:09 PM

Search Tags


/mathhelpforum @mathhelpforum