Greetings,
I have a problem similar to the following:
Suppose at some call center, there are a lot of people available to take calls. They are grouped according to their average success rate: there are $n$ groups of call takers, with $m_i$ people in group $i$, and each group has a different average success rate $p_i$. If each person takes a call at exactly the same time, what is the probability that they will succeed with exactly $k$ callers (across the entire call center)?
My approach is to sum over all possible solutions of the Diophantine equation $x_1 + x_2 + \cdots + x_n = k$ with $0 \le x_i \le m_i$. Hence I have:

$$P(k) = \sum_{x_1 + \cdots + x_n = k} \; \prod_{i=1}^{n} \binom{m_i}{x_i} \, p_i^{x_i} (1 - p_i)^{m_i - x_i}$$
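To make the approach concrete, here is a minimal brute-force sketch of that sum-over-Diophantine-solutions idea; the group sizes and per-group rates below are made-up illustration values:

```python
from itertools import product
from math import comb

# Hypothetical example: 3 groups, sizes m and per-person success rates p.
m = [2, 3, 2]          # people per group (assumed for illustration)
p = [0.9, 0.7, 0.5]    # success rate of each group (assumed)

def prob_exactly(k):
    """P(exactly k successes) by summing over all solutions of
    x1 + ... + xn = k with 0 <= xi <= mi (the Diophantine approach)."""
    total = 0.0
    for xs in product(*(range(mi + 1) for mi in m)):
        if sum(xs) != k:
            continue
        # Each group contributes a binomial term for its xi successes.
        term = 1.0
        for xi, mi, pi in zip(xs, m, p):
            term *= comb(mi, xi) * pi**xi * (1 - pi)**(mi - xi)
        total += term
    return total
```

This enumerates every tuple, so it is exponential in the number of groups, which is exactly the scaling problem being asked about.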
Any ideas for another approach that might be solvable in polynomial time? Or possibly a book I might take a look at that might offer some clue?
Edit: I am not a statistician. I have some experience with combinatorics, and I would prefer a combinatorial approach, but in this instance, it seems like statistics may be a more appropriate field to estimate the results. I really don't know enough about statistics to know where to begin looking, though.
Hey SlipEternal.
Just before I continue with suggestions, are you familiar with the Poisson Distribution?
Poisson distribution - Wikipedia, the free encyclopedia
That does, indeed, look a great deal like what I am trying to do. The Wikipedia article offers several references in its Notes section. I will begin reading through them today, but if you have any recommendations about which ones cover the basics first, that would be extremely helpful. (I am starting with #2: "Statistics | The Poisson Distribution". Umass.edu. 2007-08-24. Retrieved 2012-04-05.)
Edit: Hmm, would a multinomial approach be a more appropriate setting for this? I was attempting to calculate probabilities using a binomial distribution, but it seems more like I am working with a multinomial one. Of course, I don't know enough about multinomial distributions, either.
Actually, I think what you may want to do if you are only looking at successes and failures is to use a Bernoulli distribution:
Bernoulli distribution - Wikipedia, the free encyclopedia
Basically this distribution models one trial of getting a success or a failure.
Now, if you have different groups with different success rates, you can use some probability theory to find the distribution of the sum of all the outcomes. A Bernoulli trial gives a 1 if successful and a 0 if not, so if you want to check whether all the callers got a success, you are testing whether the sum of all outcomes equals the number of variables. There is a way to get a distribution (and hence a probability) for that sum, and with discrete variables it is based on the probability generating function:
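As a quick sketch of the all-callers-succeed case (the per-caller rates here are made-up values): each caller is a Bernoulli trial, and the probability that the sum of outcomes equals the number of callers is just the product of the individual success rates, which a simulation can confirm:

```python
from math import prod
import random

# Each caller i is a Bernoulli trial: 1 with probability p[i], else 0.
# Hypothetical per-caller success rates:
p = [0.9, 0.8, 0.8, 0.6]

# P(all succeed) = P(X1 + ... + XN = N) = product of the p[i].
p_all = prod(p)

# Sanity check by Monte Carlo simulation:
random.seed(0)
trials = 200_000
hits = sum(all(random.random() < pi for pi in p) for _ in range(trials))
```

Here `hits / trials` should land close to `p_all`.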
Probability-generating function - Wikipedia, the free encyclopedia
What you will do is get the PGF for each of the Bernoulli trials (which all have the same form) and then calculate the PGF corresponding to the random variable of the sum (i.e., Y = X1 + X2 + ... + XN); since the trials are independent, this is just the product of the individual PGFs. Once you have this, you extract the coefficient of the Nth power of the dummy variable to get the probability of Y = N, and then you're done.
I should have read your question more carefully, and I apologize, but you were definitely on the right track. The only catch is that if the trials all have completely different rates, you get something complicated. If, however, all the rates are the same, the sum is what is known as a binomial distribution, which is basically a sum of individual Bernoulli trials with the same probability of success.
The PGF allows you to get the distribution when the probabilities of success for each trial are not all the same.
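The equal-rates special case mentioned above reduces to the binomial formula, which can be checked directly (n and p here are assumed values):

```python
from math import comb

# When every trial has the same success rate p, the sum is Binomial(n, p):
# P(Y = k) = C(n, k) * p^k * (1-p)^(n-k).
n, p = 5, 0.7   # assumed values for illustration
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
```

The endpoints are easy sanity checks: `pmf[n]` is p^n (all succeed) and `pmf[0]` is (1-p)^n (all fail).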
Thank you very much. I had not considered a generating function. My experience with them is for single variable sequences. I really don't have a feel for multivariate ones. The only reference on the second Wikipedia entry you posted is for Univariate Discrete Distributions, which is also a reference for Bernoulli distributions. I have some understanding of the single variable generating functions. Would you recommend a resource that could help me understand multivariate ones?
The actual probabilities are not as important as what I would like to do with them. I am trying to look at optimization problems in several contexts. One of them is optimizing staffing, where new hires expect certain pay but make a lot of mistakes, while more experienced staff make fewer mistakes but expect more pay. I was interested in seeing if there was a way to optimize job success rates while minimizing salaries. In a different context, I was interested in replacing the call takers with coins, where two gamblers are playing a game. They both start with a certain amount of money. They purchase coins from the house with known probabilities of flipping heads. Then both gamblers flip all of their coins, and for each head that comes up, their opponent must choose one of their coins to return to the house. Between these two contexts, and possibly others, I would like to see if some general theory emerges relating to this type of optimization.
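A toy brute-force sketch of the staffing version of the problem, with entirely made-up tiers, salaries, rates, and target, just to show the shape of the optimization (minimize total salary subject to an expected-success constraint):

```python
from itertools import product

# Hypothetical staffing problem: choose how many people to hire from each
# experience tier so the expected number of successful calls meets a
# target, at minimum total salary. All numbers are invented.
tiers = [               # (success rate, salary)
    (0.5, 40_000),      # new hires: cheap, error-prone
    (0.75, 60_000),     # mid-level
    (0.875, 90_000),    # senior: expensive, reliable
]
target = 8.0            # required expected number of successes
max_per_tier = 12

best_cost, best_counts = None, None
for counts in product(range(max_per_tier + 1), repeat=len(tiers)):
    expected = sum(c * rate for c, (rate, _) in zip(counts, tiers))
    if expected < target:
        continue        # infeasible mix: too few expected successes
    cost = sum(c * salary for c, (_, salary) in zip(counts, tiers))
    if best_cost is None or cost < best_cost:
        best_cost, best_counts = cost, counts
```

This only optimizes the expectation; the distributional machinery discussed above would be needed to constrain, say, the probability of falling below the target.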
Edit:
Looking over multivariable generating functions a bit more, it is possible that my solution in my first post can be plugged directly into a generating function:

$$f(x) = \prod_{i=1}^{n} \left( (1 - p_i) + p_i x \right)^{m_i}$$
Is this what you mean by a probability generating function? This one is obviously single-variable, but I could probably make it multivariable. Using that, should it be possible for me to obtain the statistical data I am interested in? Again, a reference for how I can use the generating function would be extremely helpful. I really don't have enough intuition when it comes to statistics to know what I am looking at to narrow down my searches.
Ok, thinking about this some more, let's look at the simplest case of a single caller. Then this is simply a Bernoulli trial, and the probability of success is just that caller's rate. With more callers, the expansion I get looks very much like a multinomial expansion, but with a few extra terms. I will play around with this until I can make more sense of that generating function.
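For what it's worth, the two-caller expansion can be checked numerically (the rates are made-up values): expanding $(q_1 + p_1 s)(q_2 + p_2 s)$ with $q_i = 1 - p_i$ gives the probabilities of 0, 1, and 2 successes as the coefficients:

```python
# For two callers with rates p1, p2 (q_i = 1 - p_i), expanding
# (q1 + p1*s)(q2 + p2*s) = q1*q2 + (p1*q2 + q1*p2)*s + p1*p2*s^2
# gives P(0), P(1), P(2 successes) as the coefficients of s^0, s^1, s^2.
p1, p2 = 0.7, 0.4   # assumed rates
q1, q2 = 1 - p1, 1 - p2
dist = [q1 * q2, p1 * q2 + q1 * p2, p1 * p2]
```

The middle coefficient is where the "extra terms" relative to a plain multinomial expansion show up: each way of choosing which caller succeeded contributes its own product.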
WolframAlpha figured it out for me. I have a multivariate hypergeometric distribution. Does anyone know a good book I might use to read up more about them? I was told the following are excellent books in general on Probability:
An Introduction to Stochastic Processes, by Edward P.C. Kao, Duxbury Press, 1997
Introduction to Probability Models, by Sheldon M. Ross, Academic Press (eighth edition), 2003
But, of course, a book that specifically covers hypergeometric distributions would be better.
Sheldon Ross's book is really good (I have it myself), but that version is really old.
What I'd suggest you do is go to Amazon and look at the Table Of Contents (TOC) for the book and see how much of this it includes.
That aside, the book covers quite a lot of material and is a good resource if you will be doing this kind of thing frequently.