I am reading the proof of the following lemma:
Let be differentiable at . If , then is a decent direction for at .
I don't understand the first line of the proof, which states: because is differentiable at then
They define a decent direction as:
I don't understand what is exactly. Is it just the Fréchet derivative?
To me this looks like a multi-variate taylor series expansion in which one is approximating it through a linearization.
Its like how we use taylor series to expand f(x+a) for some arbitrary x or a with one of them known and the other is considered to be small. In the above case, we are doing it with a multivariable function using a vector instead of a scalar.
The Big-O notation gives us how the error behaves in regard to its parameters (namely lambda and the d vector). When these approach zero in magnitude and/or length then you get the result discussed above.
Big-O notation is used a lot in mathematics to look at how residuals or errors behave under certain conditions.