# Thread: Multidimensional Newton's method optimization

1. ## Multidimensional Newton's method optimization

The followings. I know this is Hessian matrix. But i still could not understand how it work. Could you help me expand these 2 formula. And tell me how this process work.
----------------------------------
http://en.wikipedia.org/wiki/Newton%...n_optimization

This iterative scheme can be generalized to several dimensions by replacing the derivative with the gradient, , and the reciprocal of the second derivative with the inverse of the Hessian matrix, . One obtains the iterative scheme

Usually Newton's method is modified to include a small step size γ > 0 instead of γ = 1

2. Originally Posted by jasonHuang
The followings. I know this is Hessian matrix. But i still could not understand how it work. Could you help me expand these 2 formula. And tell me how this process work.
----------------------------------
Newton's method in optimization - Wikipedia, the free encyclopedia

This iterative scheme can be generalized to several dimensions by replacing the derivative with the gradient, , and the reciprocal of the second derivative with the inverse of the Hessian matrix, . One obtains the iterative scheme

Usually Newton's method is modified to include a small step size γ > 0 instead of γ = 1
We assume that the (scalar) function that we seek the minimum of is locally quadratic, and take the two term Taylor expansion for the gradient of the function (we are going to use N-R to find a zero of the gradient).

This gives:

$\displaystyle \nabla f(\bold{x})=\nabla f(\bold{a}) + \bold{H}f(\bold{a})(\bold{x}-\bold{a})$

Thus if $\displaystyle \bold x$ is the minimum we seek then $\displaystyle \nabla f(\bold x)=0$, and so:

$\displaystyle \bold Hf(\bold a)(\bold x-\bold a)=-\nabla f(\bold a)$

or:

$\displaystyle \bold x=\bold a-[\bold Hf(\bold a)]^{-1}\nabla f(\bold a)$

Now we use this as an itteration formula (because the function is not really a quadratic surface):

$\displaystyle \bold x_{n+1}=\bold x_n-[\bold Hf(\bold x_n)]^{-1}\nabla f(\bold x_n)$

CB