Greetings,

I am in an introductory real analysis course. Our professor claims that one of the problems in our text likely cannot be solved, so he assigned us a much simpler problem (provable directly from the standard multivariable analogue which assumes that the function is continuously differentiable). I began working on the original problem to try to figure out why it can't be solved as asked.

Note: I am not terribly proficient with LaTeX, so please forgive any abuses of notation. Likely, I am abusing notation due to my troubles in generating the proper formulas.

The problem states:

Let $\displaystyle f : U \to {\mathbb{R}}^{m}$ be differentiable, $\displaystyle [p,q] \subset U \subset {\mathbb{R}}^n$. Assume that the set of derivatives

$\displaystyle S=\left\{ (Df)_x \in \mathcal{L}(\mathbb{R}^n,\mathbb{R}^m) : x \in \left[ p,q \right] \right\}$

is convex. Prove that there exists $\displaystyle \theta \in \left[ p,q \right]$ which satisfies

$\displaystyle f(q)-f(p)=(Df)_\theta (q-p)$.
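
As a sanity check that some hypothesis on $\displaystyle S$ is really needed, here is a standard example (not taken from the text) where the conclusion fails. Take $\displaystyle n=1$, $\displaystyle m=2$, $\displaystyle U=\mathbb{R}$, $\displaystyle [p,q]=[0,2\pi]$ and

$\displaystyle f(t)=(\cos t, \sin t), \qquad f(q)-f(p)=(0,0), \qquad (Df)_\theta (q-p)=2\pi(-\sin\theta,\cos\theta)$.

The right-hand vector has norm $\displaystyle 2\pi$ for every $\displaystyle \theta$, so no $\displaystyle \theta$ works. Here $\displaystyle S$ is the unit circle, which is not convex, so this does not contradict the problem; it only shows that the convexity assumption is doing real work.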

My professor says that he doesn't see where the author was going with this, and he doesn't believe convexity to be enough. He believes a stronger condition, such as continuity of the derivative, is required.

So, I set off to work. Here is what I came up with so far:

If $\displaystyle p=q$ then the conclusion holds trivially (any $\displaystyle \theta$ works), so assume $\displaystyle p\neq q$.

Differentiability of $\displaystyle f$ implies that for every $\displaystyle x \in \left[ p,q \right]$

$\displaystyle f(x+v)=f(x)+(Df)_x (v)+R(v), \text{ where } \lim_{|v| \to 0} \frac{R(v)}{|v|}=0$
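
To make the notation concrete, here is a toy example of my own with $\displaystyle n=m=1$ and $\displaystyle f(x)=x^2$:

$\displaystyle f(x+v)=x^2+2xv+v^2, \qquad (Df)_x(v)=2xv, \qquad R(v)=v^2, \qquad \frac{|R(v)|}{|v|}=|v| \to 0 \text{ as } |v| \to 0$.

In general the remainder depends on the base point $\displaystyle x$ as well as on $\displaystyle v$, which is why I write $\displaystyle R_{x_i}$ further down.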

Claim:

$\displaystyle \forall \epsilon >0 \; \forall x \in [p,q] \; \exists \delta >0 \text{ such that } |t(q-p)|<\delta \Rightarrow \left|f(x+t(q-p))-f(x)-(Df)_x(t(q-p))\right| \le \frac{|t(q-p)|}{|q-p|}\epsilon$

Proof:

$\displaystyle f(x+t(q-p))-f(x)-(Df)_x(t(q-p))=R(t(q-p))$

So, assume that for some $\displaystyle \epsilon>0, x \in \left[ p,q \right]$ no such delta exists. Then there are arbitrarily small values of $\displaystyle |t(q-p)|>0$ for which

$\displaystyle \frac{|R(t(q-p))|}{|t(q-p)|} > \frac{\frac{|t(q-p)|\epsilon}{|q-p|}}{|t(q-p)|} = \frac{\epsilon}{|q-p|}>0$,

so $\displaystyle \frac{R(v)}{|v|}$ cannot tend to $\displaystyle 0$ as $\displaystyle |v| \to 0$, a contradiction.
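
I think the same claim can also be read off directly from the definition of the limit, without arguing by contradiction. A sketch, writing $\displaystyle R_x$ for the remainder at $\displaystyle x$ (the subscript is my addition here): fix $\displaystyle \epsilon>0$ and $\displaystyle x \in [p,q]$. Since $\displaystyle R_x(v)/|v| \to 0$, there is a $\displaystyle \delta>0$ with

$\displaystyle 0<|v|<\delta \Rightarrow |R_x(v)| \le \frac{\epsilon}{|q-p|}|v|$.

Substituting $\displaystyle v=t(q-p)$ gives, whenever $\displaystyle |t(q-p)|<\delta$,

$\displaystyle |R_x(t(q-p))| \le \frac{|t(q-p)|}{|q-p|}\epsilon = |t|\epsilon$.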

Because $\displaystyle [p,q]$ is an interval, with no loss of generality, assume $\displaystyle n=1$.

Given $\displaystyle \epsilon>0$, choose $\displaystyle \delta_x$ for each $\displaystyle x \in [p,q]$ as shown to exist in the claim above.

The collection $\displaystyle \mathcal{C}=\left\{U_{\delta_x}(x): x \in [p,q]\right\}$ is an open cover for $\displaystyle [p,q]$. By covering compactness, there exists a finite subcover.

Choose one such subcover $\displaystyle \mathcal{C}^*=\left\{ U_{\delta_{x_i}}(x_i)\right\} $ indexed so that $\displaystyle 1 \le i < j \le k$ implies $\displaystyle |x_i-p| \le |x_j-p|$.

Next, if any ball $\displaystyle U_{\delta_{x_i}}(x_i) \subset U_{\delta_{x_j}}(x_j), i \neq j$, then remove $\displaystyle U_{\delta_{x_i}}(x_i)$ from $\displaystyle \mathcal{C}^*$.

(Here is a good example of abuse of notation, and I apologize. I am not sure what other symbols are available to use in place of this to show the refinement of the finite open cover.)

What should be left is an open cover with the following properties:

$\displaystyle U_{\delta_{x_i}}(x_i) \cap U_{\delta_{x_{i+1}}}(x_{i+1}) \neq \emptyset, p \in U_{\delta_{x_1}}(x_1), q \in U_{\delta_{x_k}}(x_k)$

Let $\displaystyle x_0=p, x_{k+1}=q$.

Let $\displaystyle t_{2i-1}=\min\left\{\delta_{x_i}, \frac{|x_i-x_{i-1}|}{|q-p|}\right\}$

Assume $\displaystyle t_{2k+1}=0$

Let $\displaystyle t_{2i}=\frac{|x_{i+1}-x_i|}{|q-p|}-t_{2i+1}$

Essentially, I am determining midpoints between each point in my partition of $\displaystyle [p,q]$. By construction,

$\displaystyle \sum_{i=1}^k{\left(t_{2i-1}+t_{2i}\right)}=1$ and each $\displaystyle 0\le t_\alpha \le 1$

So, by repeated applications of the Taylor Approximation Theorem, we have:

$\displaystyle f(q)-f(p)=\sum_{i=1}^k{\left((Df)_{x_i}(t_{2i-1}+t_{2i})(q-p)+R_{x_i}(t_{2i-1}(q-p))+R_{x_i}(t_{2i}(q-p))\right)}$

Because $\displaystyle |R_{x_i}(t(q-p))|\le t\epsilon$ we have

$\displaystyle \left|f(q)-f(p)-\sum_{i=1}^k{\left((Df)_{x_i}(t_{2i-1}+t_{2i})(q-p)\right)}\right|=\left|\sum_{i=1}^k{\left(R_{x_i} (t_{2i-1}(q-p))+R_{x_i}(t_{2i}(q-p))\right)}\right|$

$\displaystyle \le \sum_{i=1}^k{\left(t_{2i-1}+t_{2i}\right)\epsilon}=\epsilon$
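
For completeness, the intermediate step here is just the triangle inequality followed by the bound $\displaystyle |R_{x_i}(t(q-p))|\le t\epsilon$ applied to each term (assuming each $\displaystyle t_\alpha \ge 0$ is small enough that the bound at $\displaystyle x_i$ applies):

$\displaystyle \left|\sum_{i=1}^k{\left(R_{x_i} (t_{2i-1}(q-p))+R_{x_i}(t_{2i}(q-p))\right)}\right| \le \sum_{i=1}^k{\left(\left|R_{x_i} (t_{2i-1}(q-p))\right|+\left|R_{x_i}(t_{2i}(q-p))\right|\right)} \le \sum_{i=1}^k{\left(t_{2i-1}\epsilon+t_{2i}\epsilon\right)}$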

Finally, we have a convex combination of derivative matrices which, when applied to $\displaystyle (q-p)$, lies at a distance of no more than epsilon from $\displaystyle f(q)-f(p)$. So, we can find derivative matrices that are close to the average derivative.
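
Spelled out in notation of my own (the symbols $\displaystyle \lambda_i$, $\displaystyle T_\epsilon$ and $\displaystyle \theta_\epsilon$ are not in the problem statement), and assuming the estimates above hold as written:

$\displaystyle \lambda_i = t_{2i-1}+t_{2i} \ge 0, \qquad \sum_{i=1}^k{\lambda_i}=1, \qquad T_\epsilon=\sum_{i=1}^k{\lambda_i (Df)_{x_i}}$

Since $\displaystyle S$ is convex, $\displaystyle T_\epsilon \in S$, so $\displaystyle T_\epsilon=(Df)_{\theta_\epsilon}$ for some $\displaystyle \theta_\epsilon \in [p,q]$, and therefore

$\displaystyle \left|f(q)-f(p)-(Df)_{\theta_\epsilon}(q-p)\right| \le \epsilon$.

So for every $\displaystyle \epsilon>0$ there is a point of $\displaystyle [p,q]$ at which the desired equation holds up to an error of $\displaystyle \epsilon$.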

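It is clear that, as $\displaystyle \epsilon \to 0$, these convex combinations applied to $\displaystyle (q-p)$ converge to $\displaystyle f(q)-f(p)$ (with $\displaystyle n=1$, the combinations themselves converge to $\displaystyle \frac{f(q)-f(p)}{q-p}$). However, because our set of derivatives $\displaystyle S$ is not necessarily closed, this limit is not necessarily contained in $\displaystyle S$.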
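
A one-dimensional picture of the obstacle I am worried about (just an illustration of a convex set that fails to contain a limit of its points, not a claim that $\displaystyle S$ actually looks like this):

$\displaystyle C=(0,1) \text{ is convex}, \qquad \frac{1}{n} \in C \text{ for all } n \ge 2, \qquad \lim_{n \to \infty} \frac{1}{n} = 0 \notin C$.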

This is the point I was able to get to on my own. Now, I want to try to understand under what circumstances such a limit would be contained in the set. It is possible that it requires continuity of the derivative, although I am hoping that I will see something else. Personally, my intuition is telling me that the author was correct, and convexity is enough, but I am not quite grasping the final argument.

Here is what I gather (described in a rather informal manner, as I have not yet figured out the mathematics required to prove my claim nor even begin to describe it):

When $\displaystyle \epsilon$ is small and $\displaystyle \delta_x$ is large, the derivative matrix at $\displaystyle x$ adequately approximates the change in the function over a large distance. Therefore, locally, the larger $\displaystyle \delta_x$ is, the more linear the function appears. Linearity implies constancy of the derivative, and a constant derivative is locally continuous. Therefore, as $\displaystyle \epsilon$ approaches zero, the convex combination tends to weight derivatives more heavily at points in neighborhoods that are mostly linear, and less heavily at points in neighborhoods that are extremely non-linear. So, I would like to say that the convexity of the set of derivatives somehow implies Riemann integrability, allowing me to use the Fundamental Theorem of Calculus to discover the average derivative.

Now, I also know that I can use approximations of the derivative to obtain a sequence of continuous functions that converges pointwise to my derivative, and then I could further use Baire's theorem to show that my derivative has a dense set of continuity points. However, I am not sure how to proceed. We have not yet gotten to Lebesgue measure, so I don't yet feel comfortable using that.

Would anyone have any suggestions of how I might proceed from here? Either proving that convexity of the set of derivatives is a sufficient condition for the general analogue to the Mean Value Theorem, or if that is not possible, providing insight into why it is not?