1. ## Gradient of a Euclidean function

Folks - I know this is too basic but I really need your help. I haven't mastered the properties of Matrix Differentiation. Haven't had any Matrix Calculus in college. Please share your knowledge. Thanks!
Seems like there's a relationship between differentiation of a matrix and its transposition. Please see the attached image.

2. ## Re: Gradient of a Euclidean function

Do you think you could post the problem so we can see what it is.

Looking at the image you posted I have no idea what you're trying to do.

3. ## Re: Gradient of a Euclidean function

Sorry, I left something out. Please see the screenshot below for the exact cost function. I'm trying to get the gradient with respect to A and X. If you can work through A, I'll do X myself as soon as I get the idea. Thanks!

4. ## Re: Gradient of a Euclidean function

I think the issue is that you want

$\dfrac 1 2 \sum (Y-AX)^\dagger (Y-AX)$ rather than $\dfrac 1 2 \sum (Y-AX)^2$ but I'm really not sure.
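If that reading is right, and everything is real so that $\dagger$ is just the transpose, the cost is half the sum of the squared entries of $Y-AX$. Under that assumption (it is my guess at the screenshot, not something confirmed in the thread), the gradient with respect to A would be $-(Y-AX)X^T$ and the gradient with respect to X would be $-A^T(Y-AX)$. Here is a rough Python sketch that checks both against finite differences. The shapes, the numbers and all the helper names are made up for illustration.

```python
def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(M):
    return [list(col) for col in zip(*M)]

def residual(Y, A, X):
    """R = Y - AX."""
    AX = matmul(A, X)
    return [[y - ax for y, ax in zip(ry, rax)] for ry, rax in zip(Y, AX)]

def cost(Y, A, X):
    """Assumed cost: half the sum of squared entries of Y - AX."""
    return 0.5 * sum(r * r for row in residual(Y, A, X) for r in row)

def grad_A(Y, A, X):
    """Candidate gradient with respect to A: -(Y - AX) X^T."""
    return [[-g for g in row]
            for row in matmul(residual(Y, A, X), transpose(X))]

def grad_X(Y, A, X):
    """Candidate gradient with respect to X: -A^T (Y - AX)."""
    return [[-g for g in row]
            for row in matmul(transpose(A), residual(Y, A, X))]

def numeric_grad(f, M, h=1e-6):
    """Central-difference derivative of scalar f with respect to each entry of M."""
    G = [[0.0] * len(M[0]) for _ in M]
    for i in range(len(M)):
        for j in range(len(M[0])):
            plus = [row[:] for row in M]
            minus = [row[:] for row in M]
            plus[i][j] += h
            minus[i][j] -= h
            G[i][j] = (f(plus) - f(minus)) / (2 * h)
    return G

def max_abs_diff(P, Q):
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

# Made-up example data: Y is 2x3, A is 2x2, X is 2x3.
Y = [[1.0, 2.0, 0.5], [0.0, -1.0, 3.0]]
A = [[0.5, -1.0], [2.0, 0.3]]
X = [[1.0, 0.0, 2.0], [-0.5, 1.5, 1.0]]

err_A = max_abs_diff(grad_A(Y, A, X), numeric_grad(lambda M: cost(Y, M, X), A))
err_X = max_abs_diff(grad_X(Y, A, X), numeric_grad(lambda M: cost(Y, A, M), X))
print(err_A, err_X)  # both should be close to zero if the formulas are right
```

Note where the transposes land: the gradient with respect to A has the same shape as A, and the gradient with respect to X has the same shape as X. That is probably the relationship between differentiation and transposition you noticed.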

5. ## Re: Gradient of a Euclidean function

What's varying with what? What's a function of what? What variable are you taking the derivative with respect to?

##########################

The main thing to watch out for when it comes to derivatives of matrices is that matrix multiplication is non-commutative.

For example, if you have matrices A(t) and B(t) (meaning the components of the matrix are functions of t), then the product rule for matrices looks like this: (d/dt)[AB] = (dA/dt)B + A(dB/dt).

The proof is the same as for the ordinary product rule, but now you have to be careful to keep the order of multiplication correct:

A(t + h)B(t + h) - A(t)B(t) = A(t + h)B(t + h) - A(t)B(t + h) + A(t)B(t + h) - A(t)B(t) = [ A(t + h) - A(t) ] B(t + h) + A(t) [ B(t + h) - B(t) ]. Now divide both sides by h and take the limit as h goes to 0.
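If you want to convince yourself numerically, here is a small Python sketch that compares a finite-difference derivative of A(t)B(t) with the product rule, once with the factors in the right order and once with them swapped. The particular matrices A(t) and B(t) and the helper names are made up for illustration. Nothing here comes from your problem.

```python
import math

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def add(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

# Hypothetical example matrices and their entrywise derivatives.
def A(t):
    return [[t, t * t], [1.0, math.sin(t)]]

def dA(t):
    return [[1.0, 2 * t], [0.0, math.cos(t)]]

def B(t):
    return [[math.exp(t), 1.0], [t, t ** 3]]

def dB(t):
    return [[math.exp(t), 0.0], [1.0, 3 * t * t]]

def central_diff(F, t, h=1e-6):
    """Entrywise central difference (F(t + h) - F(t - h)) / (2h)."""
    Fp, Fm = F(t + h), F(t - h)
    return [[(p - m) / (2 * h) for p, m in zip(rp, rm)]
            for rp, rm in zip(Fp, Fm)]

def max_abs_diff(P, Q):
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

t0 = 0.7
numeric = central_diff(lambda t: matmul(A(t), B(t)), t0)

# Product rule with the order of multiplication kept: (dA/dt)B + A(dB/dt).
correct = add(matmul(dA(t0), B(t0)), matmul(A(t0), dB(t0)))
# Same terms, but with the first product written in the wrong order.
swapped = add(matmul(B(t0), dA(t0)), matmul(A(t0), dB(t0)))

err_correct = max_abs_diff(numeric, correct)
err_swapped = max_abs_diff(numeric, swapped)
print(err_correct, err_swapped)
```

The first error should be down at rounding level, while the second is not small at all, which is the non-commutativity showing up.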

In ordinary calculus you might derive the formula for the derivative of the reciprocal like this:

$\displaystyle \frac{f(x)}{f(x)} = 1$, so $\displaystyle \frac{d}{dx} \left( \frac{f(x)}{f(x)} \right) = 0$,

$\displaystyle \frac{d}{dx} \left[ \left( f(x) \right) \left( \frac{1}{f(x)} \right) \right] = 0$, so $\displaystyle \left[ \frac{d}{dx} f(x) \right] \left( \frac{1}{f(x)} \right) + f(x) \left[ \frac{d}{dx} \left( \frac{1}{f(x)} \right) \right] = 0$, so

$\displaystyle f(x) \frac{d}{dx} \left( \frac{1}{f(x)} \right) = - \frac{1}{f(x)} \frac{df}{dx}$, so

$\displaystyle \frac{d}{dx} \left( \frac{1}{f(x)} \right) = - \frac{\left( \frac{df}{dx} \right)}{f(x)^2}$.

From there, you could then apply the product rule to get the quotient rule.
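
To spell that step out, take a second differentiable function $g(x)$ (a name introduced here just for illustration), write the quotient as $g(x)$ times $\frac{1}{f(x)}$, and apply the product rule together with the reciprocal formula:

$\displaystyle \frac{d}{dx} \left( \frac{g(x)}{f(x)} \right) = \frac{dg}{dx} \left( \frac{1}{f(x)} \right) + g(x) \frac{d}{dx} \left( \frac{1}{f(x)} \right) = \frac{\frac{dg}{dx} f(x) - g(x) \frac{df}{dx}}{f(x)^2}$.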

With matrices, you do the same "thing", except now you have to keep an eye on the order of multiplication for everything.

So what's the derivative of the inverse of a matrix?

$\displaystyle A^{-1}(t)A(t) = I$, so $\displaystyle \frac{d}{dt} \left( A^{-1}(t)A(t) \right) = \frac{d}{dt}(I) = 0$ (that's the 0 matrix).

From the product rule already shown, you get:

$\displaystyle \frac{d}{dt} \left( A^{-1}(t)A(t) \right) = \left[ \frac{d}{dt} A^{-1}(t) \right] A(t) + A^{-1}(t) \frac{d}{dt}A(t)$, so

$\displaystyle \left[ \frac{d}{dt} A^{-1}(t) \right] A(t) + A^{-1}(t) \frac{d}{dt}A(t) = 0$, so

$\displaystyle \left[ \frac{d}{dt} A^{-1}(t) \right] A(t) = - A^{-1}(t) \frac{d}{dt}A(t)$, so

multiplying both sides on the right by $A^{-1}(t)$ gives $\displaystyle \frac{d}{dt} A^{-1}(t) = - A^{-1}(t) \left[ \frac{d}{dt}A(t) \right] A^{-1}(t)$.
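
As a sanity check, here is a short Python sketch that compares this formula with a finite-difference derivative of the inverse for one made-up 2x2 matrix A(t). The matrix, the step size and the helper names are all assumptions chosen for illustration. It also tries the formula you might guess by copying the scalar rule, $-\frac{dA}{dt} A^{-1} A^{-1}$, to show that the order matters.

```python
import math

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def inv2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def neg(M):
    return [[-m for m in row] for row in M]

# Hypothetical example matrix (invertible near t = 0.5) and its derivative.
def A(t):
    return [[2.0 + t, t * t], [math.sin(t), 3.0]]

def dA(t):
    return [[1.0, 2 * t], [math.cos(t), 0.0]]

def central_diff(F, t, h=1e-6):
    """Entrywise central difference (F(t + h) - F(t - h)) / (2h)."""
    Fp, Fm = F(t + h), F(t - h)
    return [[(p - m) / (2 * h) for p, m in zip(rp, rm)]
            for rp, rm in zip(Fp, Fm)]

def max_abs_diff(P, Q):
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

t0 = 0.5
Ainv = inv2(A(t0))
numeric = central_diff(lambda t: inv2(A(t)), t0)

# Formula from the derivation: -A^{-1} (dA/dt) A^{-1}.
formula = neg(matmul(matmul(Ainv, dA(t0)), Ainv))
# Scalar-style guess with the factors in the wrong order: -(dA/dt) A^{-1} A^{-1}.
guess = neg(matmul(matmul(dA(t0), Ainv), Ainv))

err_formula = max_abs_diff(numeric, formula)
err_guess = max_abs_diff(numeric, guess)
print(err_formula, err_guess)
```

The first error should be at rounding level. The second is noticeably larger because dA/dt and the inverse of A do not commute for this example.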