Just to confirm before deciding which distribution fits your situation: when you say they flush "once and only once", is it impossible for someone to flush 0 times in an hour?
Hi! I'm a newbie to this forum, hope you can help!
I'm looking for the probability function of the number of flushes during a fixed hour at night (this would be the total for a district).
Let's say I measure in 10-minute intervals; that would make 6 values.
Let's say that I know there are 10 people up during this hour, and each could flush in any of those timeslots. Then:
Y1=X1+X2..+X10
Y2=X1+X2..+X10
...
Y6=X1+X2..+X10
Y1 is the sum of flushes for timeslot 1,
all X are identically distributed with p=1/6, and independent.
Then Y1 would be Bin(n,p)=Bin(10;1/6), right?
E[Y1]=np = 10*1/6 = 1,6666.. Var[Y1] =np(1-p) = 10*1/6*5/6 = 1,3888..
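The arithmetic for this independent case can be checked exactly in a few lines of Python. This is only a sketch of the Bin(10, 1/6) formulas above; the variable names are mine and not part of the original question.

```python
from fractions import Fraction

n = 10               # people awake during the hour
p = Fraction(1, 6)   # chance that one person flushes in a given 10-minute slot

mean_y1 = n * p            # E[Y1] = np
var_y1 = n * p * (1 - p)   # Var[Y1] = np(1-p)

print(mean_y1, float(mean_y1))  # 5/3, about 1.6667
print(var_y1, float(var_y1))    # 25/18, about 1.3889
```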
If I'm looking at a 1-hour series,
each Y could have the value 0 if no one flushes, or 10 for that matter.
One example outcome for the first two timeslots:
Y1=1 + 1 +...+1
Y2=1 + 1 +...+1
Now to the twist:
If each person flushes once and only once, how do I go about that?
Y1=1+ 0+.. +0
Y2=0+ 0+..+1 (if X1 has been 1 in one timeslot, it must be 0 in the others)
All X are independent of each other within the same timeslot (different persons),
but all Y are now dependent, since each person can and must flush only once.
Not knowing what to do, I've been fooling around with the hypergeometric distribution Hyp(N,n,m)
and sums of hypergeometrics, treating a flush as 1 white ball and the other 5 as black.
But I'm on very thin ice here, I don't know what to do.
How do I get Y2? How do I get Var[Y1]?
E[Y1] would still be 10*1/6 = 10/6, I guess.
Hope you can help, I read some stochastics a long long (long) time ago..
Thanks for your reply!
No, each Xn flushes exactly once and is zero in all other timeslots.
One realisation could be (with Y=x1+x2+x3+x4 to make it shorter):
1 =1 0 0 0
1 =0 1 0 0
0 =0 0 0 0
2 =0 0 1 1
0 =0 0 0 0
0 =0 0 0 0
The sum y1+y2+..+y6 would be completely deterministic, 4 every time,
and E[y1] would be 4/6 (y1 itself is not deterministic).
In my case with X1+X2+..+X10 the same principle gives 10 and 10/6.
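The two claims for the 4-person example can be checked by brute force. The sketch below assumes (my assumption, not stated in the thread) that every way of giving each person exactly one of the 6 timeslots is equally likely; `slot_counts` is just a helper name made up for this illustration.

```python
from fractions import Fraction
from itertools import product

people, slots = 4, 6

# Each assignment gives every person exactly one slot to flush in.
# Assumption (mine): all slots**people assignments are equally likely.
assignments = list(product(range(slots), repeat=people))

def slot_counts(assignment):
    """Return [y1, ..., y6]: how many people flush in each slot."""
    return [assignment.count(s) for s in range(slots)]

# Every realisation has the same total number of flushes.
totals = {sum(slot_counts(a)) for a in assignments}

# Average of y1 over all assignments.
mean_y1 = Fraction(sum(slot_counts(a)[0] for a in assignments), len(assignments))

print(totals)    # {4}
print(mean_y1)   # 2/3
```

Setting `people = 10` gives the 10 and 10/6 case, although the enumeration then runs over 6^10 assignments and takes noticeably longer.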
Suppose you are trying to find the distribution of Y_{k}, and suppose j people have not flushed before the k^{th} interval.
There are 10Cj combinations of j people from 10, where 10Cj is the combination function.
j people must have not flushed k-1 times. The probability of that is
10-j people must have flushed already.
For one person
Then the probability that 10-j people have flushed is
Therefore the probability that j out of 10 people have not flushed k-1 times is
Given that j have not flushed, the probability that i people will flush in the k^{th} interval is
j can be anywhere between i and 10. So the probability that i people will flush in the k^{th} interval is
This simplifies a bit to
And that is the expression for the probability that Y_{k} =i
Hi!
Thanks!
I saw your answer for the first time today!
What is your conclusion compared to the binomial distribution?
I was so confused a few days ago that I wrote down every combination by hand,
taking 4 people and 3 timeslots (3 x 20 min),
that is 3^4 = 81 combinations! Here are the first 6
4 1111 3 1110 ......
0 0000 1 0001
0 0000 0 0000
...............
And I arrived at
sum      0   1   2   3   4
freq    16  32  24   8   1
PascTri  1   4   6   4   1
I also made the numbers in Excel
16,00
32,00
24,00
8,00
1,00
I believe this is nothing but (n over k), here (4 over k) with k=0,1,..,4
(the Pascal triangle numbers above), each multiplied by 2^(4-k).
And I checked with the Binom(4,1/3) and got exactly the same numbers.
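The hand count and the Binom(4,1/3) check can be reproduced with a short script. This is only a sketch: it tallies the 81 combinations with equal weight, exactly as the hand count does, and the variable names are mine.

```python
from fractions import Fraction
from itertools import product
from math import comb

people, slots = 4, 3

# Tally, over all 3**4 = 81 assignments, how many people land in timeslot 1.
freq = [0] * (people + 1)
for assignment in product(range(slots), repeat=people):
    freq[assignment.count(0)] += 1

# Binom(4, 1/3) probabilities, scaled by 81 so they compare with the counts.
p = Fraction(1, slots)
binom_scaled = [
    comb(people, k) * p**k * (1 - p) ** (people - k) * slots**people
    for k in range(people + 1)
]

print(freq)                            # [16, 32, 24, 8, 1]
print([int(x) for x in binom_scaled])  # [16, 32, 24, 8, 1]
```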
I came to realise that I took a long ride to arrive at Binom nevertheless.
There is one difference though.
In the "each person must flush once and only once" case I have a constant sum = 4 over every 3 trials.
Hence it's more "stable" than Binom(4,1/3), which can have a sum over 3 trials of 0=0+0+0 or 12=4+4+4.
I believe in the long run they generate the same "population", but Binom takes a longer time doing so.
Have you come to the same conclusion?
Thanks again for your interest, Shakarri!
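The "more stable" observation can be put into numbers. The sketch below again assumes that the 81 combinations are equally likely (the same assumption as the hand count, not something established in the thread) and compares the variance of the 3-slot total with what three independent Binom(4,1/3) draws would give. The helper `mean` and the variable names are mine.

```python
from fractions import Fraction
from itertools import product

people, slots = 4, 3
assignments = list(product(range(slots), repeat=people))
n_outcomes = len(assignments)  # 81, assumed equally likely

def mean(values):
    """Exact average over the 81 assignments."""
    return Fraction(sum(values), n_outcomes)

y1 = [a.count(0) for a in assignments]
y2 = [a.count(1) for a in assignments]
y3 = [a.count(2) for a in assignments]
total = [a + b + c for a, b, c in zip(y1, y2, y3)]

var_y1 = mean([v * v for v in y1]) - mean(y1) ** 2
cov_y1_y2 = mean([u * v for u, v in zip(y1, y2)]) - mean(y1) * mean(y2)

# Variance of the 3-slot total when everyone flushes exactly once,
# versus three independent Binom(4, 1/3) slots (variances simply add).
var_total_once_only = mean([t * t for t in total]) - mean(total) ** 2
var_total_independent = slots * var_y1

print(var_y1)                 # 8/9
print(cov_y1_y2)              # -4/9
print(var_total_once_only)    # 0
print(var_total_independent)  # 8/3
```

Under this assumption each single timeslot has the same variance as Binom(4,1/3), but the timeslots are negatively correlated, which is what pins the total at 4.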
I realized what is going on. If you assume that each of the 81 combinations has the same probability then it should be a binomial distribution, but they are not all equally common so the expression is more complicated.
When you listed every combination, you assumed that each of the 81 outcomes had the same probability, 1/81. There were 16 outcomes in which 0 people flushed and 1 outcome in which 4 people flushed, so you concluded that the probability that 0 flush is 16/81 and the probability that 4 flush is 1/81. But this is not true: the probability of each of the 81 outcomes is not 1/81, so the probability that 0 flush is not 16/81. The probability of each outcome depends on whether someone has flushed previously, so it is not simply 1/81.
The other thing is that does not simplify to