# Taking the partial derivative of a CDF?

• Jan 2nd 2011, 04:38 AM
daaaaave
Taking the partial derivative of a CDF?
I know the derivative of a CDF is the PDF, but what if you want to take a partial derivative? For example, I have a standard normal difference variable

$\displaystyle Y = \frac{X_i - X_j}{\sqrt{\operatorname{Var}(X_i) + \operatorname{Var}(X_j)}}$

I would like to take the derivative with respect to $\displaystyle X_i$. I am thinking this just invokes the chain rule and would give:

$\displaystyle \text{PDF}(Y) \cdot \frac{1}{\sqrt{\operatorname{Var}(X_i) + \operatorname{Var}(X_j)}}$

Is that correct?
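As a sanity check, the chain-rule claim can be verified numerically: differentiating $\Phi\big((x_i - x_j)/\sqrt{v_i + v_j}\big)$ in $x_i$ should give the standard normal PDF at the standardised difference times $1/\sqrt{v_i + v_j}$. This is a sketch with made-up parameter values, using only the standard library:

```python
import math

def Phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def phi(z):
    """Standard normal PDF."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

# Hypothetical example values (not from the thread).
xi, xj = 1.3, 0.4
vi, vj = 2.0, 0.5
s = math.sqrt(vi + vj)

def F(x):
    # CDF evaluated at the standardised difference (x - xj) / s
    return Phi((x - xj) / s)

# Claimed partial derivative via the chain rule: pdf(y) * (1 / s)
y = (xi - xj) / s
analytic = phi(y) / s

# Central finite difference in xi for comparison
h = 1e-6
numeric = (F(xi + h) - F(xi - h)) / (2.0 * h)

print(analytic, numeric)  # the two values should agree closely
```

If the two printed values agree, the chain-rule form of the partial derivative is consistent, at least for this choice of values.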
• Jan 2nd 2011, 06:36 AM
CaptainBlack
Quote:

Originally Posted by daaaaave
I know the derivative of a CDF is the PDF, but what if you want to take a partial derivative? For example, I have a standard normal difference variable

$\displaystyle Y = \frac{X_i - X_j}{\sqrt{\operatorname{Var}(X_i) + \operatorname{Var}(X_j)}}$

I would like to take the derivative with respect to $\displaystyle X_i$. I am thinking this just invokes the chain rule and would give:

$\displaystyle \text{PDF}(Y) \cdot \frac{1}{\sqrt{\operatorname{Var}(X_i) + \operatorname{Var}(X_j)}}$

Is that correct?

Perhaps if you posted the original question, that might help.

CB
• Jan 2nd 2011, 06:57 AM
daaaaave
It's not homework or anything; I'm deriving something for my own use. I'm just unsure whether this is the correct way to take the partial derivative of the CDF for a normal random variable.

I'm doing maximum likelihood estimation where my parameters are the X's and the variances, if that helps.
• Jan 2nd 2011, 09:19 AM
CaptainBlack
Quote:

Originally Posted by daaaaave
It's not homework or anything; I'm deriving something for my own use. I'm just unsure whether this is the correct way to take the partial derivative of the CDF for a normal random variable.

I'm doing maximum likelihood estimation where my parameters are the X's and the variances, if that helps.

The likelihood is a function of the data and the parameters of the problem, and maximisation is with respect to the parameters, treating the data as fixed.

I do not see that formulation here. You are not distinguishing clearly between the parameters and data.

CB
• Jan 2nd 2011, 10:02 AM
daaaaave
Ok, so let me try to explain fully what I'm attempting to do.

I am assuming there are N normal random variables with parameters X (mean) and Var (variance). Let's say we take one observation from two of these random variables at a time and are told which was greater, but not their values. Then, I believe the likelihood function is:

f(Q | X, Var) = the product over observations of P(A > B), where A is the variable from which the greater observation was drawn

The data (Q) just tells us whether we are using P(A > B) or P(B > A) in the likelihood function for that observation.

Then, I take the log and try to maximize with respect to the parameters (X, Var) of each random variable (A, B, ...). Since P(A > B) is just a standard normal CDF evaluated at the standardised difference, $\displaystyle \Phi\!\left(\frac{X_A - X_B}{\sqrt{\operatorname{Var}(A) + \operatorname{Var}(B)}}\right)$, I am taking the derivative of that.

Does that make sense?
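The likelihood described above can be sketched in code. The encoding of Q as a list of (winner, loser) index pairs and the example values are assumptions for illustration, not part of the original post:

```python
import math

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def log_likelihood(means, variances, comparisons):
    """log f(Q | X, Var): sum of log P(winner > loser) over the
    observed pairwise outcomes. `comparisons` is a hypothetical
    encoding of Q as (winner_index, loser_index) pairs."""
    ll = 0.0
    for a, b in comparisons:
        s = math.sqrt(variances[a] + variances[b])
        ll += math.log(Phi((means[a] - means[b]) / s))
    return ll

# Hypothetical data: RV 0 beat RV 1 twice, RV 1 beat RV 2 once.
means = [1.0, 0.5, 0.0]
variances = [1.0, 1.0, 1.0]
Q = [(0, 1), (0, 1), (1, 2)]

print(log_likelihood(means, variances, Q))
```

Maximising this over `means` and `variances` (e.g. by gradient ascent, using the chain-rule derivative from the first post for each factor) would be the estimation step.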
• Jan 2nd 2011, 11:37 AM
CaptainBlack
Quote:

Originally Posted by daaaaave
Ok, so let me try to explain fully what I'm attempting to do.

I am assuming there are N normal random variables with parameters X (mean) and Var (variance). Let's say we take one observation from two of these random variables at a time and are told which was greater, but not their values. Then, I believe the likelihood function is:

f(Q | X, Var) = the product over observations of P(A > B), where A is the variable from which the greater observation was drawn

The data (Q) just tells us whether we are using P(A > B) or P(B > A) in the likelihood function for that observation.

Then, I take the log and try to maximize with respect to the parameters (X, Var) of each random variable (A, B, ...). Since P(A > B) is just a standard normal CDF evaluated at the standardised difference, $\displaystyle \Phi\!\left(\frac{X_A - X_B}{\sqrt{\operatorname{Var}(A) + \operatorname{Var}(B)}}\right)$, I am taking the derivative of that.

Does that make sense?

No, not like that; it might, but it is still difficult to follow. It would be best to find some way of writing out the likelihood explicitly.

Also, there will be no unique maximum of the likelihood, since as far as I can see:

The likelihood for $\displaystyle (\bold{X},\bold{V})$ is the same as that for $\displaystyle (\bold{X}+c,\bold{V})$ and for $\displaystyle (\lambda \bold{X}, \lambda^2 \bold{V})$, where $\displaystyle \bold{X}$ and $\displaystyle \bold{V}$ are the means and variances of the $\displaystyle N$ presumably independent RVs, and $\displaystyle c$ and $\displaystyle \lambda > 0$ are scalars.
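This non-identifiability is easy to check numerically: each factor P(A > B) depends only on the standardised mean difference, so it is unchanged by a common shift of the means, or by scaling the means by λ while scaling the variances by λ² (so that standard deviations scale by λ). The parameter values below are made up for illustration:

```python
import math

def Phi(z):
    # standard normal CDF
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_greater(mean_a, var_a, mean_b, var_b):
    # P(A > B) for independent normals A and B
    return Phi((mean_a - mean_b) / math.sqrt(var_a + var_b))

# Hypothetical parameter values (not from the thread).
base = p_greater(1.0, 2.0, 0.5, 1.0)

# Shift both means by the same constant c: probability unchanged.
c = 7.0
shifted = p_greater(1.0 + c, 2.0, 0.5 + c, 1.0)

# Scale means by lam and variances by lam**2: probability unchanged.
lam = 3.0
scaled = p_greater(lam * 1.0, lam**2 * 2.0, lam * 0.5, lam**2 * 1.0)

print(base, shifted, scaled)  # all three values agree
```

Since every factor of the likelihood is invariant under these transformations, the whole likelihood is too, so the maximiser cannot be unique without fixing a location and a scale (e.g. pinning one mean to 0 and one variance to 1).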