Multivariable Calculus Descent Direction Proof

• Jan 31st 2014, 05:30 AM
fobos3
Multivariable Calculus Descent Direction Proof
I am reading the proof of the following lemma:
Let $\displaystyle f: \mathbb{R}^n \rightarrow \mathbb{R}$ be differentiable at $\displaystyle \bar{\mathbf{x}}$. If $\displaystyle \nabla f(\bar{\mathbf{x}})^{\mathrm{T}} \mathbf{d} < 0$, then $\displaystyle \mathbf{d}$ is a descent direction for $\displaystyle f$ at $\displaystyle \bar{\mathbf{x}}$.

I don't understand the first line of the proof, which states: because $\displaystyle f$ is differentiable at $\displaystyle \bar{\mathbf{x}}$,

$\displaystyle f(\bar{\mathbf{x}} + \lambda \mathbf{d}) = f(\bar{\mathbf{x}}) + \lambda\nabla f(\bar{\mathbf{x}})^{\mathrm{T}} \mathbf{d} + \lambda ||\mathbf{d}||\alpha (\lambda \mathbf{d})$

where
$\displaystyle \lim_{\lambda \rightarrow 0}\alpha (\lambda \mathbf{d}) = 0$

They define a descent direction as:
$\displaystyle \exists \delta > 0: f(\bar{\mathbf{x}} + \lambda \mathbf{d}) < f(\bar{\mathbf{x}}) \; \forall \lambda \in (0, \delta)$

I don't understand what $\displaystyle \alpha$ is exactly. Is it just the Fréchet derivative?
• Jan 31st 2014, 11:07 PM
chiro
Re: Multivariable Calculus Descent Direction Proof
Hey fobos3.

You should look at order functions in mathematics - namely big-O and little-o.

Big O notation - Wikipedia, the free encyclopedia

Linearization - Wikipedia, the free encyclopedia
• Feb 2nd 2014, 05:53 AM
hollywood
Re: Multivariable Calculus Descent Direction Proof
Chiro: I don't see what that has to do with the question.

- Hollywood
• Feb 2nd 2014, 02:40 PM
chiro
Re: Multivariable Calculus Descent Direction Proof
To me this looks like a multivariate Taylor series expansion truncated at the linear term, i.e. a linearization.

It's like how we use a Taylor series to expand f(x+a), where one of x and a is known and the other is considered small. In the above case, we are doing the same thing with a multivariable function, using a vector instead of a scalar.

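Order notation tells us how the error behaves in terms of its parameters (namely lambda and the vector d). Here the remainder term $\displaystyle \lambda ||\mathbf{d}||\alpha (\lambda \mathbf{d})$ is little-o of lambda: it goes to zero faster than lambda does as lambda approaches zero, and that is what gives the result discussed above.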
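
To make the remainder concrete, here is a sketch of the standard argument (not a quote from the book's proof). Note that $\displaystyle \mathbf{d} \neq \mathbf{0}$, since $\displaystyle \nabla f(\bar{\mathbf{x}})^{\mathrm{T}} \mathbf{d} < 0$. For $\displaystyle \lambda > 0$, define

$\displaystyle \alpha(\lambda \mathbf{d}) = \frac{f(\bar{\mathbf{x}} + \lambda \mathbf{d}) - f(\bar{\mathbf{x}}) - \lambda \nabla f(\bar{\mathbf{x}})^{\mathrm{T}} \mathbf{d}}{\lambda ||\mathbf{d}||}$

Differentiability of $\displaystyle f$ at $\displaystyle \bar{\mathbf{x}}$ says exactly that this quotient tends to 0 as $\displaystyle \lambda \rightarrow 0$. So $\displaystyle \alpha$ is not the Fréchet derivative itself. The derivative is the linear map $\displaystyle \mathbf{h} \mapsto \nabla f(\bar{\mathbf{x}})^{\mathrm{T}} \mathbf{h}$, and $\displaystyle \alpha$ is the error of that linear approximation divided by the step length. Rearranging the definition and dividing by $\displaystyle \lambda$ gives

$\displaystyle \frac{f(\bar{\mathbf{x}} + \lambda \mathbf{d}) - f(\bar{\mathbf{x}})}{\lambda} = \nabla f(\bar{\mathbf{x}})^{\mathrm{T}} \mathbf{d} + ||\mathbf{d}||\alpha (\lambda \mathbf{d})$

The first term on the right is a fixed negative number and the second tends to 0, so there is a $\displaystyle \delta > 0$ such that the right-hand side is negative for all $\displaystyle \lambda \in (0, \delta)$. For those $\displaystyle \lambda$ we get $\displaystyle f(\bar{\mathbf{x}} + \lambda \mathbf{d}) < f(\bar{\mathbf{x}})$, which is the definition of a descent direction.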

Big-O notation is used a lot in mathematics to look at how residuals or errors behave under certain conditions.
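
If it helps to see the remainder numerically, here is a minimal Python sketch. Everything in it is an assumption made for illustration and is not from the book or the thread: the test function `f`, its gradient `grad_f`, the point `x_bar`, and the direction `d`. It computes the normalized remainder $\displaystyle \alpha(\lambda \mathbf{d})$ for shrinking $\displaystyle \lambda$ and checks that $\displaystyle f$ decreases along $\displaystyle \mathbf{d}$.

```python
import numpy as np

def f(x):
    # Hypothetical smooth test function (an assumption, not from the thread).
    return x[0] ** 2 + 3.0 * x[1] ** 2

def grad_f(x):
    # Gradient of the test function above, computed by hand.
    return np.array([2.0 * x[0], 6.0 * x[1]])

def alpha(x_bar, d, lam):
    # Normalized remainder of the linear approximation:
    # (f(x_bar + lam*d) - f(x_bar) - lam * grad^T d) / (lam * ||d||)
    linear_part = f(x_bar) + lam * (grad_f(x_bar) @ d)
    return (f(x_bar + lam * d) - linear_part) / (lam * np.linalg.norm(d))

x_bar = np.array([1.0, 1.0])   # illustrative point
d = np.array([-1.0, -1.0])     # illustrative direction
slope = grad_f(x_bar) @ d      # directional derivative, negative here

lams = [1e-1, 1e-2, 1e-3, 1e-4]
alphas = [alpha(x_bar, d, lam) for lam in lams]
decreases = [f(x_bar + lam * d) < f(x_bar) for lam in lams]

print("grad^T d =", slope)
for lam, a, dec in zip(lams, alphas, decreases):
    print(f"lambda={lam:g}  alpha={a:.3e}  f decreased: {dec}")
```

For this particular quadratic the remainder works out by hand to $\displaystyle 4\lambda^2$, so $\displaystyle \alpha(\lambda \mathbf{d}) = 2\sqrt{2}\,\lambda$, which shrinks linearly in $\displaystyle \lambda$. Other functions will give a different rate, but differentiability guarantees the limit is still 0.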