I don't understand this distribution.
Are you saying that...
then...
and
which doesn't sum to one.
I'm lost.
Here is the full explanation:
There are two risk factors that may cause an accident to happen. A risk factor is either present (x = 1) or absent (x = 0).
We have a function that specifies the probability p(x1, x2) of an accident, but it does not determine the outcome in every concrete case.
We assume that each risk factor, when present, reduces the probability of a good outcome (no accident) by some multiplicative factor. In other words, the unknown function p has a fixed parametric form with one such factor per risk factor, so we have to learn these factors as parameters.
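One way to write this assumption down, as a sketch only: the parameter names a_1 and a_2 are hypothetical (the original symbols are not visible here), and the sketch further assumes that with no risk factor present the probability of no accident is 1.

$$
1 - p(x_1, x_2) = a_1^{x_1}\, a_2^{x_2}, \qquad 0 < a_1, a_2 \le 1
$$

Under this reading, p(x_1, x_2) is the probability of an accident for one given configuration of risk factors. For each configuration, p and 1 - p sum to one, but the values of p across the four configurations do not need to sum to one, because p is not a distribution over (x_1, x_2).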
How would you compute the ML hypothesis from the data, under the given assumption on p?
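A minimal sketch of one possible approach, not a definitive answer. It assumes the form P(no accident | x1, x2) = a1^x1 * a2^x2, where the names a1 and a2 and the counts in `data` are made up for illustration. Each observed case is treated as an independent Bernoulli trial, and the log-likelihood is maximized by a plain grid search. A numerical optimizer would work as well.

```python
import math
from itertools import product

# Hypothetical data: (x1, x2) -> (number of cases, number of accidents).
data = {
    (0, 0): (100, 0),
    (1, 0): (100, 20),
    (0, 1): (100, 50),
    (1, 1): (100, 60),
}

def p_accident(x1, x2, a1, a2):
    """Accident probability under the assumed form
    P(no accident) = a1**x1 * a2**x2 (a1, a2 are hypothetical names)."""
    return 1.0 - (a1 ** x1) * (a2 ** x2)

def log_likelihood(a1, a2, data):
    """Bernoulli log-likelihood of the counts, summed over all configurations."""
    total = 0.0
    for (x1, x2), (n, k) in data.items():
        p = p_accident(x1, x2, a1, a2)
        # A zero count contributes nothing, so that term is skipped entirely.
        if k > 0:
            if p <= 0.0:
                return -math.inf  # accidents observed where the model forbids them
            total += k * math.log(p)
        if n - k > 0:
            if p >= 1.0:
                return -math.inf  # safe outcomes observed where the model forbids them
            total += (n - k) * math.log(1.0 - p)
    return total

def ml_estimate(data, steps=100):
    """Grid search over a1, a2 in {1/steps, 2/steps, ..., 1}."""
    grid = [i / steps for i in range(1, steps + 1)]
    return max(product(grid, grid),
               key=lambda a: log_likelihood(a[0], a[1], data))

a1_hat, a2_hat = ml_estimate(data)
print(a1_hat, a2_hat)
```

The (1, 1) configuration couples the two parameters through the product a1 * a2, so in general the estimates cannot be read off each configuration separately. That is why the sketch maximizes the joint likelihood instead.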