I am a linguist and I came across a linguistic problem that involves statistics. I have already solved it by "brute force", with a small program that computes the results empirically, but a mathematical approximation would be useful.
The problem can be expressed as follows:
imagine you have 2 boxes.
In the first box you have the following elements: A, A, B, B, B, C
In the second one you have: A, A, A, B, C, C
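Note: the elements are not equiprobable, and their probabilities differ between the two boxes. In box 1, A, B and C have probabilities 2/6, 3/6 and 1/6; in box 2 they have probabilities 3/6, 1/6 and 2/6.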
Now assume you draw 2 elements from box 1 with replacement (the first element is put back, so the box still contains all the original elements when the second one is drawn), keeping track of the order of extraction, and in the same way 2 elements from box 2.
Imagine doing that 10 times for box 1 and 5 times for box 2.
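To make the setup concrete, here is a minimal sketch of the exact probability of each ordered pair for one box, assuming the two draws of a pair are independent (which follows from drawing with replacement). The names `BOX1`, `BOX2` and `pair_distribution` are only illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

BOX1 = ["A", "A", "B", "B", "B", "C"]
BOX2 = ["A", "A", "A", "B", "C", "C"]


def pair_distribution(box):
    """Exact probability of each ordered pair drawn with replacement."""
    counts = Counter(box)
    total = len(box)
    # Independent draws: P(x, y) = P(x) * P(y).
    return {
        (x, y): Fraction(counts[x] * counts[y], total * total)
        for x, y in product(sorted(counts), repeat=2)
    }


dist1 = pair_distribution(BOX1)
dist2 = pair_distribution(BOX2)
```

Each box gives 9 possible ordered pairs, and the probabilities in each dictionary sum to 1.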
My question is: what is the probability of having 1, 2, 3, ..., n matches between the 10 pairs of elements drawn from box 1 and the 5 pairs drawn from box 2?
I can approximate it experimentally, but I would like to find a formula to determine it. It seems to me that the result could be approximated with a Poisson distribution, but I don't understand how to calculate lambda.
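For reference, here is a sketch of the kind of empirical check meant above, next to one candidate for lambda. It rests on an assumption about what a "match" is: every combination of one box 1 pair and one box 2 pair (10 x 5 = 50 comparisons) counts as a match when the two ordered pairs are identical. Under that reading the expected number of matches is 50 times the probability that a single comparison matches, and that expectation is used as lambda. The 50 comparisons share pairs and are therefore not independent, so the Poisson fit may be rough. All function names are illustrative.

```python
import math
import random
from collections import Counter
from fractions import Fraction

BOX1 = ["A", "A", "B", "B", "B", "C"]
BOX2 = ["A", "A", "A", "B", "C", "C"]
N1, N2 = 10, 5  # number of ordered pairs drawn from box 1 and from box 2


def element_match_probability(box_a, box_b):
    """P(one draw from box_a equals one draw from box_b)."""
    ca, cb = Counter(box_a), Counter(box_b)
    return Fraction(sum(ca[x] * cb[x] for x in ca), len(box_a) * len(box_b))


# A pair consists of two independent draws, so both positions must match.
p_pair = element_match_probability(BOX1, BOX2) ** 2
# Assumed definition of lambda: expected number of matching comparisons.
lam = N1 * N2 * p_pair


def poisson_pmf(k, rate):
    """Poisson probability of exactly k events with mean `rate`."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def simulate_match_count(rng):
    """Draw all pairs once and count identical (box 1, box 2) pair combinations."""
    pairs1 = [(rng.choice(BOX1), rng.choice(BOX1)) for _ in range(N1)]
    pairs2 = [(rng.choice(BOX2), rng.choice(BOX2)) for _ in range(N2)]
    return sum(p == q for p in pairs1 for q in pairs2)


def empirical_distribution(trials=20_000, seed=0):
    """Relative frequency of each match count over repeated experiments."""
    rng = random.Random(seed)
    counts = Counter(simulate_match_count(rng) for _ in range(trials))
    return {k: counts[k] / trials for k in sorted(counts)}


if __name__ == "__main__":
    empirical = empirical_distribution()
    print(f"lambda = {float(lam):.4f}")
    print("k  empirical  poisson")
    for k in range(11):
        print(k, round(empirical.get(k, 0.0), 4), round(poisson_pmf(k, float(lam)), 4))
```

With these boxes a single element matches with probability 11/36, a single pair comparison with probability 121/1296, and lambda comes out at 3025/648, roughly 4.67. If a "match" is meant differently (for example, each box 2 pair counted at most once), both the simulation and lambda would have to change accordingly.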