Inclusion-Exclusion - Maximizing Probabilities

  1. #1 andre.vignatti (Newbie, Curitiba, Brazil)

    Inclusion-Exclusion - Maximizing Probabilities

    Let E_1, E_2, E_3 be three events with Pr[E_1]=p_1, Pr[E_2]=p_2, Pr[E_3]=p_3, where p_1 + p_2 + p_3 = 1. Assuming the events are independent (so that Pr[E_i \cap E_j] = p_ip_j, and so on), the inclusion-exclusion principle gives Pr[E_1 \cup E_2 \cup E_3] = p_1 + p_2 + p_3 - p_1p_2 - p_1p_3 - p_2p_3 + p_1p_2p_3 = f(p_1, p_2, p_3).


    Thus, I would like to prove that \max f(p_1,p_2,p_3) subject to p_1 + p_2 + p_3 = 1 has solution p_1 = p_2 = p_3 = 1/3. To do this I used Lagrange multipliers, and after a few manipulations of the equations I came up with this answer.


    THE QUESTION IS: how can I generalize this to n events? More specifically, I want to prove that \max Pr[E_1 \cup \ldots \cup E_n] subject to p_1 + \ldots + p_n = 1 has solution p_1 = \ldots = p_n = 1/n.


    PS: Some directions for solving this problem would also be useful (or a reference where it has already been solved).
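
    As a sanity check on the formula above, here is a minimal Python sketch (illustrative only; the helper name union_prob is mine) that evaluates the inclusion-exclusion polynomial for independent events and compares it against the complement identity 1 - (1-p_1)(1-p_2)...(1-p_n):

    Code:
    from itertools import combinations
    from math import prod

    def union_prob(ps):
        """P(E_1 u ... u E_n) for independent events, via inclusion-exclusion."""
        n = len(ps)
        return sum((-1) ** (r + 1) * sum(prod(c) for c in combinations(ps, r))
                   for r in range(1, n + 1))

    ps = [1/3, 1/3, 1/3]
    print(union_prob(ps))               # 0.7037...
    print(1 - prod(1 - p for p in ps))  # same value, via complements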

  2. #2 chiro (MHF Contributor, Australia)

    Re: Inclusion-Exclusion - Maximizing Probabilities

    Hey andre.vignatti.

    Is there a way to get a factorization of this function with respect to a particular variable?

    What I mean is not a single global factorization into a product of terms, but a rule that generates a factorization with respect to a chosen variable (factoring with respect to p1, for example).

    I suggest this because I suspect that factoring with respect to any particular variable always gives the same form, regardless of which variable you choose.

    As an example, with three probabilities, p1 + p2 + p3 - p1p2 - p1p3 - p2p3 + p1p2p3 factors with respect to each of the three variables as:

    p1(1 - p2 - p3 + p2p3) + p2 + p3 - p2p3
    p2(1 - p1 - p3 + p1p3) + p1 + p3 - p1p3
    p3(1 - p1 - p2 + p1p2) + p1 + p2 - p1p2

    This helps immensely in the Lagrange-multiplier process when it comes to taking derivatives in n variables. If you can show that the form is the same for every such factorization, then when you apply the Lagrange technique you should get a collection of terms that are all alike.
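
    A short sympy sketch of this idea (assuming sympy is available; it just mechanizes the per-variable factorization rather than solving anything):

    Code:
    import sympy as sp

    p1, p2, p3 = sp.symbols('p1 p2 p3')
    f = p1 + p2 + p3 - p1*p2 - p1*p3 - p2*p3 + p1*p2*p3

    # Collect on each variable in turn; every result has the same form.
    for v in (p1, p2, p3):
        print(v, ':', sp.collect(sp.expand(f), v))
    # e.g. p1 : p1*(p2*p3 - p2 - p3 + 1) - p2*p3 + p2 + p3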

  3. #3 andre.vignatti (Newbie, Curitiba, Brazil)

    Re: Inclusion-Exclusion - Maximizing Probabilities

    Hello Chiro! Thanks for the answer. In fact, I already tried some factorizations like the one you suggest. I wonder whether the factorization you proposed really helps in solving the three-probabilities case: I could not see how, using it, I would obtain p_1 = p_2 = p_3.

    So, let me explain how I obtained the three-probabilities answer, and then I will ask an easier question that may actually solve my problem.

    ====================================

    We want to maximize

    f(p_1,p_2,p_3) = p_1 + p_2 + p_3 - p_1p_2 - p_1p_3 - p_2p_3 + p_1p_2p_3

    subject to

    p_1+p_2+p_3=1 \iff p_1+p_2+p_3-1=0, so define g(p_1,p_2,p_3)=p_1+p_2+p_3-1

    Using the Lagrange multipliers, we get

    \nabla f(p_1,p_2,p_3)=\lambda \nabla g(p_1,p_2,p_3)

    So we get the following system (*):

    \lambda = 1 - p_2 -p_3+p_2p_3 \\ \lambda = 1 - p_1 -p_3+p_1p_3 \\ \lambda = 1 - p_1 -p_2+p_1p_2 \\ 1 = p_1+p_2+p_3

    Using the fact that 1 = p_1+p_2+p_3 in the top three rows, we get:

    \lambda = p_1 + p_2p_3 \\ \lambda = p_2 + p_1p_3 \\ \lambda = p_3 + p_1p_2 \\ 1 = p_1+p_2+p_3

    Multiplying the i-th row by p_i:

    \lambda p_1 = p_1^2 + p_1p_2p_3 \\ \lambda p_2 = p_2^2 + p_1p_2p_3 \\ \lambda p_3 = p_3^2 + p_1p_2p_3 \\ 1 = p_1+p_2+p_3

    Isolating p_1p_2p_3, we conclude that (**) \lambda p_1 - p_1^2 = \lambda p_2 - p_2^2 = \lambda p_3 - p_3^2. But

    \lambda x - x^2 = \lambda y - y^2 \iff \lambda (x - y) = (x+y)(x-y), so either x = y or \lambda = x+y.

    So, for each pair of the equalities (**), either the two probabilities are already equal, or

    \lambda = p_1 + p_2 \\ \lambda = p_1 + p_3 \\ \lambda = p_2 + p_3

    Combining these rows forces p_1 = p_2 = p_3 (in the degenerate case x = y the pair is equal directly, and one checks that the mixed cases only admit solutions with some p_i = 1, which the constraint rules out for nonnegative probabilities). Since 1 = p_1 + p_2 + p_3, we conclude that p_1 = p_2 = p_3 = 1/3.
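
    A symbolic spot-check of this system (a sketch assuming sympy; the variable names are mine):

    Code:
    import sympy as sp

    p1, p2, p3, lam = sp.symbols('p1 p2 p3 lam')
    eqs = [
        sp.Eq(lam, 1 - p2 - p3 + p2*p3),  # df/dp1 = lambda
        sp.Eq(lam, 1 - p1 - p3 + p1*p3),  # df/dp2 = lambda
        sp.Eq(lam, 1 - p1 - p2 + p1*p2),  # df/dp3 = lambda
        sp.Eq(p1 + p2 + p3, 1),           # the constraint
    ]
    print(sp.solve(eqs, [p1, p2, p3, lam]))
    # Returns (1/3, 1/3, 1/3, 4/9) plus roots such as (1, 1, -1, 0),
    # which contain a negative entry and so are not probability vectors.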

    ====================================

    In the first system obtained from the Lagrange multipliers (see *), one possible solution is p_1 = p_2 = p_3. This clearly solves the system, since it makes each of the first three rows identical. So the only reason to carry the calculation to the end is to show that the system has a unique solution.

    So, in my attempt to generalize the three-probabilities case to n probabilities, I think I could easily argue that p_1 = p_2 = \ldots = p_n is a feasible solution of the system, but I cannot easily argue that it is the ONLY solution.

    So finally, MY QUESTION IS: if I find one (possibly non-unique) solution of the Lagrange-multiplier system, can I assume that it maximizes the original objective function?

  4. #4 chiro (MHF Contributor, Australia)

    Re: Inclusion-Exclusion - Maximizing Probabilities

    To try to answer your question mathematically, one attempt is this: suppose your function can be written as

    p1*g(p2,...,pn) + h(p2,...,pn)
    p2*g(p1,p3,...,pn) + h(p1,p3,...,pn)

    and so on. Then differentiating with respect to the chosen variable gives, say for d/dp1,

    df/dp1 = g(p2,...,pn),

    which the Lagrange condition sets equal to lambda. Now, if these functions all have the same form, then you can collect, simplify, and see what happens.

    I don't want to say a definite yes to your question, even though it seems likely that it's true.

  5. #5 Super Member

    Re: Inclusion-Exclusion - Maximizing Probabilities

    Quote Originally Posted by andre.vignatti View Post
    Thus, I would like to prove that \max f(p_1,p_2,p_3) subject to p_1 + p_2 + p_3 = 1 has solution p_1 = p_2 = p_3 = 1/3. [...]
    Ahem. Are you sure that p_1 = p_2 = p_3 = 1/3 maximizes f? What about p_1 = 1, p_2 = p_3 = 0? Doesn't that give you a bigger f?
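
    A two-line numeric check of this hint (illustrative sketch):

    Code:
    def f3(p1, p2, p3):
        return p1 + p2 + p3 - p1*p2 - p1*p3 - p2*p3 + p1*p2*p3

    print(f3(1/3, 1/3, 1/3))  # 0.7037...
    print(f3(1, 0, 0))        # 1.0 -- strictly larger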

  6. #6 chiro (MHF Contributor, Australia)

    Re: Inclusion-Exclusion - Maximizing Probabilities

    I'm assuming that, for all practical purposes, the probabilities are nonzero (even though zero is a valid probability).

    So you would add the constraints p1 > 0, p2 > 0, ..., pn > 0.

    In most real situations we don't consider zero probabilities, especially if we have a population distribution (there's no point in doing so).

  7. #7 johnsomeone (Super Member, Washington DC USA)

    Re: Inclusion-Exclusion - Maximizing Probabilities

    This is a case where it might initially be easier to understand written in English than with definitions and sigma-sum notation:

    For generic n, f(p1, p2, ..., pn) = (sum p's 1 at a time) - (sum p's 2 at a time) + (sum p's 3 at a time) - ... + (-1)^(n+1)(sum p's n at a time).

    Note that (sum p's n at a time) is just the product of all the p's.
    Let's declare the domain of f to be: all the p's are in [0, 1).

    Now fix an integer k, 1 <= k <= n.
    (sum p's 1 at a time) = pk + (sum p's other than pk, 1 at a time)
    (sum p's 2 at a time) = pk(sum p's other than pk, 1 at a time) + (sum p's other than pk, 2 at a time)
    (sum p's 3 at a time) = pk(sum p's other than pk, 2 at a time) + (sum p's other than pk, 3 at a time)
    (sum p's 4 at a time) = pk(sum p's other than pk, 3 at a time) + (sum p's other than pk, 4 at a time)
    ...
    (sum p's (n-1) at a time) = pk(sum p's other than pk, (n-2) at a time) + (sum p's other than pk, (n-1) at a time)
    (sum p's n at a time) = pk(sum p's other than pk, (n-1) at a time) + (0)

    That 0 is because there aren't n p's left once you exclude pk, so (sum p's other than pk, n at a time) is an empty sum. Look back at f and you'll see this term must be zero.

    With that in mind, let s_k(j)(p_1, p_2, ... p_n) = (sum p's other than pk, taken j at a time) for 0<=j<=n.

    Note that s_k(n) = 0. Also, we'll define s_k(0) = 1. (This is just a matter of convenience for later sums.)

    Note that the *value* of s_k(1) is 1 - p_k once the p's are evaluated, but that is not how the function itself is defined; it is merely the arithmetic consequence of the constraint that the p's sum to 1. Keeping this straight is conceptually important, since we'll be taking derivatives.

    I want to make clear that, by definition, s_k(1) = p_1 + p_2 + ... + \hat{p_k} + ... + p_n.

    Note that, by construction (and with the s_k(1) case hopefully clarified), we have: \frac{\partial s_k(j)}{\partial p_k} = 0 \ \forall j.

    Now, from lines like "(sum p's 3 at a time) = pk(sum p's other than pk, 2 at a time) + (sum p's other than pk, 3 at a time)", we get:

    (sum p's j at a time) = p_k s_k(j-1) + s_k(j), 1 \le j \le n.

    Check boundary special cases:

    (sum p's 1 at a time) = p_k s_k(0) + s_k(1) = p_k (1) + s_k(1) = p_k + s_k(1). Check.

    (sum p's n at a time) = p_k s_k(n-1) + s_k(n) = p_k s_k(n-1) + (0) = p_k s_k(n-1). Check.
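
    A small numeric sketch of the split (sum p's j at a time) = p_k s_k(j-1) + s_k(j) (illustrative; e(ps, j) below plays the role of "sum of the p's taken j at a time"):

    Code:
    from itertools import combinations
    from math import prod

    def e(ps, j):
        """Sum of the p's taken j at a time (an empty sum is 0; e_0 = 1)."""
        return 1.0 if j == 0 else sum(prod(c) for c in combinations(ps, j))

    ps = [0.1, 0.2, 0.3, 0.4]
    k = 1
    others = ps[:k] + ps[k+1:]
    for j in range(1, len(ps) + 1):
        lhs = e(ps, j)
        rhs = ps[k] * e(others, j - 1) + e(others, j)
        print(j, abs(lhs - rhs) < 1e-12)  # True for every j; note e(others, n) = 0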

    All of that was to set up the following (in preparation for solving via Lagrange multipliers):

    f(p_1, p_2, ... p_n) = \sum_{i=1}^n (-1)^{i+1} (all the p's taken i at a time)

    f(p_1, p_2, ... p_n) = \sum_{i=1}^n (-1)^{i+1} (p_k s_k(i-1) + s_k(i))

    And thus: \frac{\partial f}{\partial p_k} = \sum_{i=1}^n (-1)^{i+1} s_k(i-1)

    So far, it's all been formalizing/defining/structuring the problem. No meat. Here comes the meat: we now manipulate algebraically so that \frac{\partial f}{\partial p_k} can be written in terms of f and p_k only:

    p_k\frac{\partial f}{\partial p_k} = \sum_{i=1}^n (-1)^{i+1} p_k s_k(i-1) = \sum_{i=1}^n (-1)^{i+1} ( p_k s_k(i-1) + s_k(i) - s_k(i) )

    = \sum_{i=1}^n (-1)^{i+1} ( p_k s_k(i-1) + s_k(i) ) - \sum_{i=1}^n (-1)^{i+1} s_k(i) = f - \sum_{i=1}^n (-1)^{i+1} s_k(i)

    (now substitute j = i + 1 in the summation) = f - \sum_{j=2}^{n+1} (-1)^{j} s_k(j-1) = f + \sum_{j=2}^{n+1} (-1)^{j+1} s_k(j-1)

    = f - (-1)^{1+1} s_k(1-1) + (-1)^{1+1} s_k(1-1) + \sum_{j=2}^{n+1} (-1)^{j+1} s_k(j-1)

    = f - (-1)^{1+1} s_k(1-1) + \sum_{j=1}^{n+1} (-1)^{j+1} s_k(j-1)

    = f - (-1)^{1+1} s_k(1-1) + (-1)^{(n+1)+1} s_k((n+1)-1) + \sum_{j=1}^{n} (-1)^{j+1} s_k(j-1)

    = f - s_k(0) + (-1)^n s_k(n) + \sum_{j=1}^{n} (-1)^{j+1} s_k(j-1)

    = f - (1) + (0) + \sum_{j=1}^{n} (-1)^{j+1} s_k(j-1)

    = f - 1 + \sum_{j=1}^{n} (-1)^{j+1} s_k(j-1) = f - 1 + \frac{\partial f}{\partial p_k}.

    Have just shown that: p_k\frac{\partial f}{\partial p_k} = f - 1 + \frac{\partial f}{\partial p_k}.

    (Here's why p_k = 1 had to be excluded.) Solving that for \frac{\partial f}{\partial p_k} gives:

    \frac{\partial f}{\partial p_k} = \frac{1-f}{1-p_k}.

    Now do Lagrange: Let g(p_1, p_2, ... , p_n) = p_1 + p_2 + ... + p_n.

    Then the constraint is g(p_1, p_2, ... , p_n) = 1, so find stationary points (\tilde{p}_1, \tilde{p}_2, ... , \tilde{p}_n) via Lagrange:

    \left. \nabla f\right|_{(\tilde{p}_1, \tilde{p}_2, ... , \tilde{p}_n)} = \lambda \left. \nabla (g - 1)\right|_{(\tilde{p}_1, \tilde{p}_2, ... , \tilde{p}_n)}, which gives:

    \left. \frac{\partial f}{\partial p_k}\right|_{(\tilde{p}_1, \tilde{p}_2, ... , \tilde{p}_n)} = \lambda \left. \frac{\partial g}{\partial p_k}\right|_{(\tilde{p}_1, \tilde{p}_2, ... , \tilde{p}_n)} = \lambda (1) = \lambda \ \forall k \in \{1, 2, ..., n\}.

    Thus

    \left. \frac{\partial f}{\partial p_k}\right|_{(\tilde{p}_1, \tilde{p}_2, ... , \tilde{p}_n)} = \frac{1-f(\tilde{p}_1, \tilde{p}_2, ... , \tilde{p}_n)}{1-\tilde{p}_k} = \lambda \ \forall k \in \{1, 2, ..., n\}.

    (To solve, this will require that \lambda \ne 0, so f(\tilde{p}_1, \tilde{p}_2, ... , \tilde{p}_n) \ne 1. Check this requirement at the end.)

    Have 1-\tilde{p}_k = \frac{1-f(\tilde{p}_1, \tilde{p}_2, ... , \tilde{p}_n)}{\lambda} \ \forall k \in \{1, 2, ..., n\} (\lambda \ne 0).

    Thus \tilde{p}_k = 1 - \frac{1-f(\tilde{p}_1, \tilde{p}_2, ... , \tilde{p}_n)}{\lambda} \ \forall k \in \{1, 2, ..., n\} (\lambda \ne 0).

    So at a stationary point, all the \tilde{p}_k equal the same constant \left(1 - \frac{1-f(\tilde{p}_1, \tilde{p}_2, ... , \tilde{p}_n)}{\lambda}\right),

    and so they're all equal to one another. When that's combined with the constraint \tilde{p}_1 + \tilde{p}_2 +  ... + \tilde{p}_n = 1, it follows that:

    If (\tilde{p}_1, \tilde{p}_2, ... , \tilde{p}_n) is a stationary point for f, then, provided f(1/n, 1/n, ... , 1/n) \ne 1,

    must have that \tilde{p}_1 = \tilde{p}_2 = ... = \tilde{p}_n = \frac{1}{n}.

    I'm not going to do the check that, for n > 1, f(1/n, 1/n, ... , 1/n) \ne 1, though it is technically necessary. (It follows quickly from the identity f = 1 - (1-p_1)(1-p_2) \cdots (1-p_n), which gives f(1/n, ... , 1/n) = 1 - (1 - 1/n)^n < 1.)

    --------------------------------------
    Putting it all together: If none of the p's are 1, then the only stationary point for f is at (1/n, 1/n, 1/n, ... 1/n).
    --------------------------------------

    Aside: To decide whether it's a local maximum, compute the bordered Hessian at the point (1/n, 1/n, 1/n, ...). You can get all the 2nd partials easily:

    \frac{\partial f}{\partial p_k} = \frac{1-f}{1-p_k} \Rightarrow \frac{\partial^2 f}{\partial p_l\partial p_k} = \frac{\partial}{\partial p_l} \left( \frac{1-f}{1-p_k} \right)

    Then \frac{\partial}{\partial p_l} \left(\frac{1-f}{1-p_k}\right) = - \frac{1-f}{(1-p_k)(1-p_l)} when l \ne k, and

    \frac{\partial}{\partial p_k} \left(\frac{1-f}{1-p_k}\right) = \frac{(1-p_k)(- \frac{1-f}{1-p_k}) - (1-f)(-1)}{(1-p_k)^2} = 0.

    Therefore \frac{\partial^2 f}{\partial p_l\partial p_k} = - \frac{1-f}{(1-p_k)(1-p_l)} if l \ne k, and 0 if l = k.
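
    A symbolic spot-check of the key identity \frac{\partial f}{\partial p_k} = \frac{1-f}{1-p_k} for n = 4 (a sketch assuming sympy; note the identity holds as an algebraic fact, before the constraint is imposed):

    Code:
    import sympy as sp
    from itertools import combinations

    ps = sp.symbols('p1:5')  # p1, p2, p3, p4
    n = len(ps)
    f = sum((-1)**(r + 1) * sum(sp.Mul(*c) for c in combinations(ps, r))
            for r in range(1, n + 1))

    for pk in ps:
        print(pk, sp.simplify(sp.diff(f, pk) - (1 - f)/(1 - pk)) == 0)  # True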
    Last edited by johnsomeone; September 28th 2012 at 03:00 AM.

  8. #8 Super Member

    Re: Inclusion-Exclusion - Maximizing Probabilities

    Quote Originally Posted by chiro View Post
    I'm assuming that, for all practical purposes, the probabilities are nonzero (even though zero is a valid probability). [...]
    OK, since my first hint appears to be too subtle:

    You have not maximized f by choosing all the probabilities equal. You have minimized f.
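
    A quick numeric illustration of this point (a sketch; it uses the identity f = 1 - \prod(1-p_i), which equals the inclusion-exclusion polynomial for independent events):

    Code:
    import random
    from math import prod

    def f(ps):
        return 1 - prod(1 - p for p in ps)

    n = 4
    f_equal = f([1/n] * n)  # 1 - (1 - 1/n)^n
    for _ in range(10_000):
        cuts = sorted(random.random() for _ in range(n - 1))
        ps = [b - a for a, b in zip([0.0] + cuts, cuts + [1.0])]  # uniform on the simplex
        assert f(ps) >= f_equal - 1e-12  # never fails: equal p's minimize f
    print('f(1/n,...,1/n) =', f_equal, 'is the smallest value sampled')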

