1. [SOLVED] Advanced Multivariable Calculus: Constrained Extrema Test

Hello,

I'm reading C.H. Edwards' "Advanced Calculus of Several Variables". I don't completely understand his proof of Theorem II.8.9, concerning a "second derivative test" for constrained (Lagrange multiplier) maximum-minimum problems. Specifically, I don't understand the last paragraph of the proof, which explains why the condition on delta can be satisfied. I'd appreciate a more detailed and more formal treatment of this point.

This theorem, in its general form as presented in Edwards, is hard to find in other books. So I emailed Prof. Edwards, asking him to clarify the point above and offering in exchange a list of typos and other errors I'd found in the book. But it's been several days now, and I haven't received an answer from him.

2. The relevant pages

Attached are the relevant pages from C. H. Edwards, Jr.'s "Advanced Calculus of Several Variables", Dover 1994 (an unabridged, corrected republication of the work first published by Academic Press, New York, 1973). The book can be purchased online from Amazon and Barnes & Noble, for instance.

3. My book review on Amazon

If anyone's interested, I've just posted a book review of this title on Amazon.

4. Originally Posted by itai
Attached are the relevant pages from C. H. Edwards, Jr.'s "Advanced Calculus of Several Variables", Dover 1994 (an unabridged, corrected republication of the work first published by Academic Press, New York, 1973). The book can be purchased online from Amazon and Barnes & Noble, for instance.
Here's a (very detailed) explanation of the annotated paragraph (which the author should indeed have developed further, I think):

Recall $\displaystyle m$ is the minimum value of $\displaystyle q$ on the unit sphere of $\displaystyle T_a$. Notice this is also the minimum of $\displaystyle q\left(\frac{h}{|h|}\right)$ over the non-zero $\displaystyle h$ in $\displaystyle T_a$. Since $\displaystyle q$ is positive definite on $\displaystyle T_a$ and the unit sphere of $\displaystyle T_a$ is compact, we have $\displaystyle m>0$.

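We can also define $\displaystyle M$ as the maximum of $\displaystyle |q|$ on the unit sphere of $\displaystyle \mathbb{R}^n$ (this constant $\displaystyle M$ should not be confused with the constraint set, also denoted $\displaystyle M$, in the conditions $\displaystyle a+h\in M$ below). Then for every $\displaystyle h\in\mathbb{R}^n$, $\displaystyle |q(h)|\leq M|h|^2$.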
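To make the bound $\displaystyle |q(h)|\leq M|h|^2$ concrete, here is a minimal Python sketch in dimension 2. The matrix `A` and the helper names are my own assumptions for illustration, not taken from Edwards. It relies on the standard fact that, for a symmetric matrix, the maximum of $\displaystyle |q|$ on the unit sphere is the largest absolute value of an eigenvalue.

```python
import math
import random

def q(A, h):
    """Quadratic form q(h) = h^T A h for a symmetric 2x2 matrix A."""
    (a, b), (_, c) = A
    return a * h[0] ** 2 + 2 * b * h[0] * h[1] + c * h[1] ** 2

def max_abs_on_unit_circle(A):
    """Max of |q| on the unit circle, i.e. the largest |eigenvalue| of A."""
    (a, b), (_, c) = A
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    return max(abs(mean + radius), abs(mean - radius))

# Hypothetical indefinite example: the eigenvalues are 3 and -1.
A = ((1.0, 2.0), (2.0, 1.0))
M_const = max_abs_on_unit_circle(A)

# Check |q(h)| <= M |h|^2 on random vectors (small tolerance for rounding).
random.seed(0)
for _ in range(1000):
    h = (random.uniform(-5, 5), random.uniform(-5, 5))
    norm_sq = h[0] ** 2 + h[1] ** 2
    assert abs(q(A, h)) <= M_const * norm_sq + 1e-9

print("M =", M_const)
```

The bound is attained along the eigenvector $\displaystyle (1,1)$, where $\displaystyle q(h)=6=3|h|^2$.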

Let $\displaystyle \varepsilon_0>0$ (we'll choose its value later).

The "tough" part is to show that if $\displaystyle h$ is in a neighborhood of zero and $\displaystyle a+h\in M$, then $\displaystyle \frac{h}{|h|}$ is uniformly close to the tangent space $\displaystyle T_a$. Then, we need to manipulate the quadratic form with care to estimate $\displaystyle q(h)$.

The tangent space $\displaystyle T_a$ is the orthogonal complement of the subspace $\displaystyle (T_a)^\perp$ spanned by the vectors $\displaystyle \nabla G_1(a),\ldots,\nabla G_m(a)$.
As a consequence, there exist real numbers $\displaystyle \lambda_{ij}$ ($\displaystyle 1\leq i,j\leq m$) such that the orthogonal projection on $\displaystyle (T_a)^\perp$ can be written, for all $\displaystyle h\in \mathbb{R}^n$, as $\displaystyle p_{(T_a)^\perp}(h)=\sum_{i=1}^m\sum_{j=1}^m \lambda_{ij} (\nabla G_j(a)\cdot h)\nabla G_i(a)$. (When the gradients are not mutually orthogonal, the cross terms $\displaystyle i\neq j$ are needed in general.)

On the other hand, for every $\displaystyle i$, since $\displaystyle G_i(a)=0$, we have $\displaystyle G_i(a+h)=\nabla G_i(a) \cdot h + o(|h|)$ ($\displaystyle G_i$ is $\displaystyle \mathcal{C}^2$, but differentiability at $\displaystyle a$ is all we need here). Explicitly: for every $\displaystyle \varepsilon>0$, there exists $\displaystyle \delta>0$ (the same for all $\displaystyle i$, since there are finitely many constraints) such that, if $\displaystyle 0<|h|<\delta$, $\displaystyle |G_i(a+h)-\nabla G_i(a)\cdot h|< \varepsilon |h|$. Hence, if in addition $\displaystyle a+h\in M$, then $\displaystyle G_i(a+h)=0$ and therefore $\displaystyle |\nabla G_i(a)\cdot h|<\varepsilon |h|$.
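As a sanity check of this estimate, here is a small Python sketch on a hypothetical constraint of my own choosing (not from Edwards): the unit circle $\displaystyle G(x,y)=x^2+y^2-1$ at $\displaystyle a=(1,0)$. For $\displaystyle a+h$ on the circle, the ratio $\displaystyle |\nabla G(a)\cdot h|/|h|$ should tend to 0 with $\displaystyle |h|$.

```python
import math

def grad_G_at_a():
    """Gradient of the assumed constraint G(x, y) = x^2 + y^2 - 1 at a = (1, 0)."""
    return (2.0, 0.0)

def ratio_and_norm(t):
    """Return (|grad G(a) . h| / |h|, |h|) for h = (cos t - 1, sin t).

    With this choice, a + h = (cos t, sin t) lies on the unit circle.
    """
    h = (math.cos(t) - 1.0, math.sin(t))
    g = grad_G_at_a()
    norm_h = math.hypot(h[0], h[1])
    dot = g[0] * h[0] + g[1] * h[1]
    return abs(dot) / norm_h, norm_h

for t in (0.5, 0.1, 0.01, 0.001):
    ratio, norm_h = ratio_and_norm(t)
    print(f"|h| = {norm_h:.6f}   |grad G(a).h| / |h| = {ratio:.6f}")
```

In this particular example the ratio works out to exactly $\displaystyle |h|$, so for a given $\displaystyle \varepsilon$ one can simply take $\displaystyle \delta=\varepsilon$.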

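As a consequence of the last two paragraphs (apply the estimate with $\displaystyle \varepsilon$ equal to $\displaystyle \varepsilon_0$ divided by a constant depending only on the vectors $\displaystyle \nabla G_i(a)$), there exists $\displaystyle \delta>0$ such that, for every $\displaystyle h\in\mathbb{R}^n$, if $\displaystyle 0<|h|<\delta$ and $\displaystyle a+h\in M$, then $\displaystyle |p_{(T_a)^\perp}(h)|< \varepsilon_0 |h|$. Since $\displaystyle |q(k)|\leq M|k|^2$ for every $\displaystyle k$, this gives $\displaystyle |q(p_{(T_a)^\perp}(h))|<M\varepsilon_0^2 |h|^2$.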

We also deduce (assuming $\displaystyle \varepsilon_0<1$) that if $\displaystyle 0<|h|<\delta$ and $\displaystyle a+h\in M$, then $\displaystyle |p_{T_a}(h)|=|h-p_{(T_a)^\perp}(h)|\geq |h|-|p_{(T_a)^\perp}(h)| > (1- \varepsilon_0)|h|>0$, so that, since $\displaystyle p_{T_a}(h)\in T_a$, $\displaystyle q(p_{T_a}(h))=|p_{T_a}(h)|^2 q\left(\frac{p_{T_a}(h)}{|p_{T_a}(h)|}\right) \geq m|p_{T_a}(h)|^2 > (1-\varepsilon_0)^2 |h|^2 m$.

Now, $\displaystyle q(h)=q\left(p_{T_a}(h)+p_{(T_a)^\perp}(h)\right)$. We would like to use the two previous bounds. There is no triangle inequality for $\displaystyle q$ on $\displaystyle \mathbb{R}^n$, but we can write, for every $\displaystyle h,k\in\mathbb{R}^n$, $\displaystyle q(h+k)=q(h)+q(k)+2\varphi(h,k)\geq q(h)-|q(k)|-C |h||k|$ where $\displaystyle \varphi$ is the symmetric bilinear form associated with $\displaystyle q$ (so that $\displaystyle q(h)=\varphi(h,h)$) and $\displaystyle C$ is a positive constant (twice the maximum of $\displaystyle |\varphi(h,k)|$ for $\displaystyle h,k$ on the unit sphere of $\displaystyle \mathbb{R}^n$). We deduce, in our situation: if $\displaystyle 0<|h|<\delta$ and $\displaystyle a+h\in M$, then
$\displaystyle q(h)\geq q(p_{T_a}(h))-|q(p_{(T_a)^\perp}(h))| - C |p_{T_a}(h)||p_{(T_a)^\perp}(h)|$ $\displaystyle \geq (1-\varepsilon_0)^2 m|h|^2 - M \varepsilon_0^2 |h|^2 - C\varepsilon_0 |h|^2$
(I used the fact that $\displaystyle |p_{T_a}(h)|\leq |h|$ (by Pythagoras' theorem), so the cross term is at most $\displaystyle C\varepsilon_0|h|^2$: only one of its two factors is small), hence:
$\displaystyle q\left(\frac{h}{|h|}\right)\geq (1-\varepsilon_0)^2 m - M \varepsilon_0^2 - C\varepsilon_0.$
If $\displaystyle \varepsilon_0$ is chosen small enough, the latter quantity can be made arbitrarily close to $\displaystyle m$ (from below).

As a conclusion: choose $\displaystyle \varepsilon_0\in(0,1)$ such that $\displaystyle (1-\varepsilon_0)^2 > \frac{3}{4}$ and $\displaystyle M\varepsilon_0^2 + C\varepsilon_0<\frac{m}{4}$ (these conditions depend only on $\displaystyle q$, not on $\displaystyle \delta$ of course), and then take the corresponding $\displaystyle \delta$. Then, if $\displaystyle 0<|h|<\delta$ and $\displaystyle a+h\in M$,
$\displaystyle q\left(\frac{h}{|h|}\right)\geq \frac{3}{4}m - \frac{1}{4}m = \frac{m}{2}$. This is what we wanted.
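Here is a small numerical illustration of this conclusion, on an example that is my own assumption and not from Edwards: $\displaystyle f(x,y)=y^2-x^2$ on the unit circle $\displaystyle G(x,y)=x^2+y^2-1=0$, at the constrained minimum $\displaystyle a=(1,0)$. Assuming the convention $\displaystyle H=f-\lambda G$, we get $\displaystyle \lambda=-1$ and $\displaystyle H(x,y)=2y^2-1$, so $\displaystyle q(h)=2h_2^2$. This $\displaystyle q$ vanishes in the direction $\displaystyle (1,0)$, so it is not positive definite on $\displaystyle \mathbb{R}^2$, but it is positive definite on $\displaystyle T_a=\{h_1=0\}$, with $\displaystyle m=2$. The sketch checks $\displaystyle q(h/|h|)\geq m/2$ for $\displaystyle a+h$ on the circle.

```python
import math

def q(h):
    """Second-order Taylor term of the assumed H(x, y) = 2*y**2 - 1 at a = (1, 0)."""
    return 2.0 * h[1] ** 2

m = 2.0  # minimum of q on the unit sphere of T_a = {h : h[0] == 0}

def q_normalized(t):
    """q(h / |h|) for h = (cos t - 1, sin t), so that a + h lies on the circle."""
    h = (math.cos(t) - 1.0, math.sin(t))
    norm = math.hypot(h[0], h[1])
    return q((h[0] / norm, h[1] / norm))

for t in (1.0, 0.5, 0.1, -0.1, -0.5):
    value = q_normalized(t)
    print(f"t = {t:+.1f}   q(h/|h|) = {value:.6f}   >= m/2: {value >= m / 2}")
```

In this example $\displaystyle q(h/|h|)=1+\cos t$, so the bound $\displaystyle m/2=1$ holds as long as $\displaystyle |t|\leq \pi/2$, while it fails for directions far from $\displaystyle T_a$. This is why the restriction $\displaystyle a+h\in M$ with $\displaystyle |h|$ small is essential.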

(and the first condition on $\displaystyle \delta$ is just a consequence of the Taylor expansion)

5. Thank you very much!

Hi Laurent,

Thank you very much. That's marvellous!

There's just one proposition in your proof I fail to understand, namely:
"there exist real numbers $\displaystyle \lambda_1,\ldots,\lambda_m$ such that the orthogonal projection on $\displaystyle (T_a)^\perp$ writes, for all $\displaystyle h\in \mathbb{R}^n$, [...]."

The left side of the equation is an n-dimensional vector (in an m-dimensional vector subspace), while the right side is a real number.

6. Originally Posted by itai
There's just one proposition in your proof I fail to understand, namely:
"there exist real numbers $\displaystyle \lambda_1,\ldots,\lambda_m$ such that the orthogonal projection on $\displaystyle (T_a)^\perp$ writes, for all $\displaystyle h\in \mathbb{R}^n$, [...]."

The left side of the equation is an n-dimensional vector (in an m-dimensional vector subspace), while the right side is a real number.
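Thank you, I've fixed it now. The right-hand side must of course be a vector, and in general it should read: $\displaystyle p_{(T_a)^\perp}(h)=\sum_{i=1}^m\sum_{j=1}^m \lambda_{ij} (\nabla G_j(a)\cdot h)\nabla G_i(a)$ for suitable real numbers $\displaystyle \lambda_{ij}$.

If the $\displaystyle \nabla G_i(a)$ are orthonormal, the projection on $\displaystyle (T_a)^\perp$ is just the sum of the projections $\displaystyle h\mapsto (\nabla G_i(a)\cdot h)\nabla G_i(a)$ (that is, $\displaystyle \lambda_{ij}=1$ if $\displaystyle i=j$ and $\displaystyle 0$ otherwise). Otherwise, we can apply a Gram-Schmidt orthonormalization, and we get $\displaystyle p_{(T_a)^\perp}(h)$ as a linear combination of the vectors $\displaystyle \nabla G_i(a)$ whose coefficients are linear combinations of the dot products $\displaystyle \nabla G_j(a)\cdot h$. All the proof uses is that $\displaystyle |p_{(T_a)^\perp}(h)|$ is bounded by a constant times the largest $\displaystyle |\nabla G_j(a)\cdot h|$.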
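A minimal Python sketch of this remark, where the vectors `grads` are hypothetical stand-ins for the gradients $\displaystyle \nabla G_i(a)$ (they are not from Edwards): orthonormalize them with Gram-Schmidt, then sum the projections on the resulting orthonormal vectors.

```python
import math

def dot(u, v):
    """Euclidean dot product."""
    return sum(x * y for x, y in zip(u, v))

def gram_schmidt(vectors):
    """Return an orthonormal basis of the span of linearly independent vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            c = dot(w, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis

def project(vectors, h):
    """Orthogonal projection of h on span(vectors)."""
    p = [0.0] * len(h)
    for e in gram_schmidt(vectors):
        c = dot(h, e)
        p = [pi + c * ei for pi, ei in zip(p, e)]
    return p

# Hypothetical, non-orthogonal "gradients" in R^3 (they span the xy-plane).
grads = [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
h = (3.0, -2.0, 5.0)
p = project(grads, h)
residual = [hi - pi for hi, pi in zip(h, p)]

print("projection on the span:", p)
print("residual (tangent part):", residual)
```

The residual $\displaystyle h-p$ is orthogonal to both vectors, so it plays the role of $\displaystyle p_{T_a}(h)$.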