How do I differentiate a vector times vector?

• May 13th 2009, 11:17 AM
Twig
How do I differentiate a vector times vector?
Hi

This came up as I was reading about Non-linear least squares fitting.

We have data points $\displaystyle (t_{i},y_{i}),\ i=1,\dots,m$ .
We wish to find the vector $\displaystyle \vec{x}$ of parameters that gives the best fit in the least squares sense.

So we have our model function $\displaystyle f(t,\vec{x}) \mbox{ , }f:\mathbb{R}^{n+1}\longrightarrow \mathbb{R}$ .

Now we define the residual function $\displaystyle \vec{r}:\mathbb{R}^{n}\longrightarrow \mathbb{R}^{m}$ by $\displaystyle r_{i}(\vec{x})=y_{i}-f(t_{i},\vec{x}) \mbox{ , }i=1,\dots,m$ .

Then we wish to minimize the function $\displaystyle \phi(\vec{x})=\frac{1}{2}\vec{r}(\vec{x})^{T}\vec{r}(\vec{x})$ .

Here is my question: how would I get the gradient of this function?

The book says $\displaystyle \nabla \phi(\vec{x}) = J^{T}(\vec{x})\vec{r}(\vec{x})$ , where $J$ denotes the Jacobian of $\vec{r}$.

I don't really follow...

thanks!
• May 13th 2009, 10:12 PM
NonCommAlg
Quote:

Originally Posted by Twig
Here is my question, how would I get the gradient of this function?

The book says $\displaystyle \nabla \phi(\vec{x}) = J^{T}(\vec{x})\vec{r}(\vec{x})$ , where J denotes the Jacobian.

$\displaystyle \phi(x)=\frac{1}{2}\sum_{i=1}^m (r_i(x))^2.$ thus for any $\displaystyle 1 \leq j \leq n$ we have $\displaystyle \frac{\partial \phi}{\partial x_j}=\sum_{i=1}^m r_i(x) \frac{\partial r_i}{\partial x_j}(x).$ now recall that the Jacobian of $\displaystyle r$ has entries $\displaystyle J_{ij}(x)=\frac{\partial r_i}{\partial x_j}(x),$ so the sum above is just the $\displaystyle j$-th component of $\displaystyle J^T(x)r(x),$ which is exactly $\displaystyle \nabla \phi(x) = J^T(x) r(x).$
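You can also check the identity numerically. Here is a small sketch (the model $\displaystyle f(t,x)=x_1 e^{x_2 t}$ and all data values are made up for illustration): it builds the Jacobian of $\displaystyle r$ by finite differences, then compares $\displaystyle J^T r$ against a direct finite-difference gradient of $\displaystyle \phi$.

```python
import numpy as np

# Hypothetical model f(t, x) = x[0] * exp(x[1] * t), so r_i(x) = y_i - f(t_i, x).
def residual(x, t, y):
    return y - x[0] * np.exp(x[1] * t)

def phi(x, t, y):
    r = residual(x, t, y)
    return 0.5 * r @ r          # phi = (1/2) r^T r

def jacobian_fd(x, t, y, h=1e-7):
    """Jacobian of r at x via forward differences: J[i, j] = dr_i/dx_j."""
    r0 = residual(x, t, y)
    J = np.empty((len(t), len(x)))
    for j in range(len(x)):
        xp = x.copy()
        xp[j] += h
        J[:, j] = (residual(xp, t, y) - r0) / h
    return J

def grad_fd(x, t, y, h=1e-7):
    """Gradient of phi at x via forward differences, component by component."""
    g = np.empty(len(x))
    p0 = phi(x, t, y)
    for j in range(len(x)):
        xp = x.copy()
        xp[j] += h
        g[j] = (phi(xp, t, y) - p0) / h
    return g

# Made-up data and parameter vector, just to exercise the formula.
t = np.linspace(0.0, 1.0, 5)
y = np.array([1.0, 1.3, 1.7, 2.1, 2.8])
x = np.array([1.0, 0.9])

g_formula = jacobian_fd(x, t, y).T @ residual(x, t, y)   # J^T r
g_numeric = grad_fd(x, t, y)                             # direct gradient of phi
print(g_formula)
print(g_numeric)
```

The two printed vectors agree up to finite-difference error, which is the point of the identity: once you have the Jacobian of the residual, the gradient of $\displaystyle \phi$ comes for free.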