
Calculating average
Hi,
I am confused and hoping someone can clarify the following situation. Let's say I have a die with 800 sides and I throw it 50 times. How many different outcomes should I expect? (Outcome: the side that lands face down; I am not sure an 800-sided die has a face up.) Since each outcome has the same chance of appearing, I would expect 50 different sides to appear once each when the die is thrown 50 times. I am probably wrong here, because when I run the simulation
Code:
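use strict;
use warnings;

my %dice_outcome;       # number of distinct sides in a trial => number of trials
my %dice_outcome_freq;  # throws per distinct side in a trial => number of trials

for (1 .. 500000) {
    # One trial: throw the 800-sided die 50 times and count each side.
    my %shash;
    for (my $i = 0; $i < 50; $i++) {
        $shash{ int(rand(800) + 1) }++;
    }
    my $types_cnt = keys %shash;    # distinct sides seen in this trial
    my $freq_cnt  = 0;              # total throws in this trial (always 50)
    foreach my $key (keys %shash) {
        $freq_cnt += $shash{$key};
    }
    $dice_outcome_freq{ $freq_cnt / $types_cnt }++;
    $dice_outcome{$types_cnt}++;
}

# Note: the last column divides by 50000 although 500000 trials are run,
# so it is 10 times the fraction of trials.
foreach my $key (sort { $a <=> $b } keys %dice_outcome) {
    print "$key\t$dice_outcome{$key}\t" . ($dice_outcome{$key} / 50000) . "\n";
}
print "\n\n";
foreach my $key (sort { $a <=> $b } keys %dice_outcome_freq) {
    print "$key\t$dice_outcome_freq{$key}\t" . ($dice_outcome_freq{$key} / 50000) . "\n";
}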
Result:
Distribution of different outcomes:
40 1 2e-05
41 3 6e-05
42 18 0.00036
43 184 0.00368
44 1178 0.02356
45 5715 0.1143
46 22052 0.44104
47 64227 1.28454
48 131513 2.63026
49 170835 3.4167
50 104274 2.08548
Distribution of frequencies of different outcomes:
1 104274 2.08548
1.02040816326531 170835 3.4167
1.04166666666667 131513 2.63026
1.06382978723404 64227 1.28454
1.08695652173913 22052 0.44104
1.11111111111111 5715 0.1143
1.13636363636364 1178 0.02356
1.16279069767442 184 0.00368
1.19047619047619 18 0.00036
1.21951219512195 3 6e-05
1.25 1 2e-05
I do not get a uniform distribution. Is this because of the central limit theorem? And how would I calculate the average number of outcomes without the simulation, given only the input values:
Number of possible outcomes: 800
Number of throws: 50

What is the average number of different outcomes?
What is the average frequency of the average number of different outcomes?
Thank you.

Re: Calculating average

Re: Calculating average
Hm, yes, that is true, but I am looking for the number of different outcome types. Let me scale this problem down to a 6-sided die. Let's say I want to know the average number of different outcomes given that I throw the die 4 times. So I have different possibilities:
I can get:
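1,2,3,4 -> [tex] \binom{6}{4} \times 4! = 360 [/tex] -> the number of ways I can get 4 different outcomes, counting permutations
1,1,2,3 -> [tex] \binom{6}{3} \times 3 \times \binom{4}{2} \times 2! = 720 [/tex] (the factor 3 picks which of the three faces is doubled)
1,1,2,2 -> [tex] \binom{6}{2} \times \binom{4}{2} \times 1! = 90 [/tex]
1,1,1,2 -> [tex] \binom{6}{2} \times 2 \times \binom{4}{3} \times 1! = 120 [/tex] (the factor 2 picks which face is tripled)
1,1,1,1 -> [tex] \binom{6}{1} = 6 [/tex]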
So actually what I am looking for is a general expression for the above series. Notice that when you allow larger alphabets and larger tuples, there is a higher number of subtuples that need to be considered.
(or maybe you are giving me the solution but i am not seeing it)
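For a case this small, counts like these can be checked by brute force. The following is only a sketch under the assumptions stated in the post (a fair 6-sided die, 4 throws, ordered outcomes); the variable names and the pattern labels such as "2,1,1" (meaning one face twice, two faces once, as in 1,1,2,3) are made up for illustration.

```perl
use strict;
use warnings;

# Enumerate all 6^4 ordered outcomes of 4 throws of a 6-sided die and
# tally them by multiplicity pattern and by number of distinct faces.
my ($sides, $rolls) = (6, 4);
my (%by_pattern, %by_distinct);

for my $n (0 .. $sides**$rolls - 1) {
    my %seen;
    my $code = $n;
    for (1 .. $rolls) {
        $seen{ $code % $sides }++;      # next base-6 digit = one throw
        $code = int($code / $sides);
    }
    # Pattern label: face multiplicities in decreasing order.
    my $pattern = join ',', sort { $b <=> $a } values %seen;
    $by_pattern{$pattern}++;
    $by_distinct{ scalar keys %seen }++;
}

print "$_\t$by_pattern{$_}\n" for sort keys %by_pattern;

# Average number of distinct faces over all equally likely outcomes.
my $total = $sides**$rolls;
my $mean  = 0;
$mean += $_ * $by_distinct{$_} / $total for keys %by_distinct;
printf "average number of distinct faces: %.4f\n", $mean;
```

The pattern counts should add up to 6^4 = 1296, and the average printed at the end is the quantity being asked about for this scaled-down case. Enumeration only works for tiny inputs; for 800 sides and 50 throws a formula is needed.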

Re: Calculating average
You have an alphabet of N symbols each with a probability of 1/N of occurring.
You want to make K independent selections where in general K < N, and usually K << N
I guess you want to find the expectation of the number of unique symbols, say M, that appear in this selection.
Let's take a look.
for M=1, i.e. you chose the same symbol K times
$$Pr[K]=\left(\frac{1}{N}\right)^K$$
and there are N ways of doing this so the overall probability is
$$N\left(\frac{1}{N}\right)^K=\left(\frac{1}{N} \right)^{K-1}$$
for M=2, i.e. symbol1 appears L times, and symbol2 appears K-L times
$$Pr[L,K-L]=\left(\frac{1}{N} \right)^L\left(\frac{1}{N} \right)^{K-L}$$
and this can happen N(N-1) different ways
and so forth, at M=Q you have to account for the probability of all the various ways you can choose Q unique ways from K selections and then multiply that by the number of ways you can choose Q distinct symbols from N.
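For reference, this bookkeeping has a standard closed form. It is quoted here as a known combinatorial result, not something derived in this thread, and the symbol S(K,Q) (a Stirling number of the second kind, the number of ways to split K selections into Q non-empty groups) is notation introduced only for this note:

$$Pr[M=Q]=\binom{N}{Q}\,Q!\,S(K,Q)\left(\frac{1}{N}\right)^K$$

As a check on the small case N=6, K=4, using S(4,1)=1, S(4,2)=7, S(4,3)=6, S(4,4)=1:

$$6\cdot 1\cdot 1=6,\quad 15\cdot 2\cdot 7=210,\quad 20\cdot 6\cdot 6=720,\quad 15\cdot 24\cdot 1=360$$

and these add up to 6^4 = 1296, as they should.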
I'm pretty sure this is all just the multinomial distribution like I gave you a link for.
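If only the expectation of M is needed, there is a shortcut that avoids the full distribution. This is a different technique from the case-by-case counting above, namely linearity of expectation: a given symbol is missed by all K selections with probability (1-1/N)^K, so it appears at least once with probability 1-(1-1/N)^K, and summing over the N symbols gives

$$E[M]=N\left(1-\left(1-\frac{1}{N}\right)^K\right)$$

A minimal sketch in Perl, assuming independent uniform selections as stated above (the function name expected_distinct is made up for this example):

```perl
use strict;
use warnings;

# Expected number of distinct symbols in K uniform draws from N symbols:
# E[M] = N * (1 - (1 - 1/N)^K), by linearity of expectation.
sub expected_distinct {
    my ($n, $k) = @_;
    return $n * (1 - (1 - 1/$n)**$k);
}

printf "N=%d K=%d  E[M] = %.3f\n", 800, 50, expected_distinct(800, 50);
printf "N=%d K=%d  E[M] = %.3f\n", 6,   4,  expected_distinct(6, 4);
```

For N=800 and K=50 this gives roughly 48.5, which is in line with the simulated table in the first post: its counts average out to about 48.5 distinct sides per trial.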