In several engineering publications I've come across a variational version of the classical Lagrange multiplier method, which goes like this. Suppose you want to maximize a functional
$$J[u] = \int_X F(x, u(x)) \,\mathrm{d}\mu(x)$$
with respect to the function $u$, where $\mu$ is some given measure, under an inequality constraint of the form
$$\int_X G(x, u(x)) \,\mathrm{d}\mu(x) \le c.$$
You'd define a Lagrangian
$$\Lambda[u; \lambda] = \int_X F(x, u(x)) \,\mathrm{d}\mu(x) - \lambda \left( \int_X G(x, u(x)) \,\mathrm{d}\mu(x) - c \right).$$
What the authors then do is to set up the stationarity condition while apparently "removing" the integral: for ($\mu$-almost) every $x$,
$$\frac{\partial}{\partial u}\Bigl[F(x, u) - \lambda\, G(x, u)\Bigr] = 0.$$
This equation is then solved in $u$, so you obtain a solution $u^*_\lambda(x)$ which is a function of $x$ and $\lambda$. Finally, the Lagrange multiplier $\lambda \ge 0$ is chosen so as to fulfill the inequality constraint with equality,
$$\int_X G(x, u^*_\lambda(x)) \,\mathrm{d}\mu(x) = c,$$
which in most problems determines $\lambda$ uniquely, e.g. thanks to benign monotonicity properties.
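To make the recipe concrete, here is a minimal numerical sketch of it. The instance (a discretized water-filling problem with $F(x,u) = \log(1 + h(x)\,u)$ and $G(x,u) = u$) and all names in the code are my own illustration, not taken from the publications I have in mind:

```python
import numpy as np
from scipy.optimize import brentq

# Toy instance: maximize  sum_x log(1 + h[x] * u[x])   (discretized integral)
# subject to    sum_x u[x] <= P   and   u[x] >= 0.
# The pointwise condition d/du [log(1 + h*u) - lam*u] = 0 yields the
# classical water-filling solution  u*_lam(x) = max(0, 1/lam - 1/h(x)).

h = np.array([0.5, 1.0, 2.0, 4.0])   # arbitrary positive weights
P = 2.0                              # constraint level c

def u_star(lam):
    """Pointwise maximizer of F(x, u) - lam * G(x, u), for each x separately."""
    return np.maximum(0.0, 1.0 / lam - 1.0 / h)

def constraint_gap(lam):
    """Resource used by u*_lam minus the budget; strictly decreasing in lam."""
    return u_star(lam).sum() - P

# Second step of the recipe: pick lam > 0 so the constraint holds with equality.
lam = brentq(constraint_gap, 1e-9, 1e9)
u = u_star(lam)
print(lam, u, u.sum())   # u.sum() equals P up to solver tolerance
```

The monotonicity of `constraint_gap` in `lam` is exactly the kind of "benign monotonicity" that makes the multiplier unique in this example.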
This approach feels very sloppy to me, but seems to yield correct results. I've never understood why. Can you explain it to me?
The only explanation I could think of is that the integral is removed "tentatively", to see whether one finds a solution to the Karush-Kuhn-Tucker conditions. That's legitimate. But then, to ensure we have the global solution, one would need a proof of the problem's convexity. Is it as mundane as this, or is there a deeper explanation? Under which (sufficient) conditions is the above approach correct?
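For what it's worth, the kind of one-line estimate I would expect to be behind the recipe (my own sketch, assuming $\lambda \ge 0$, that $u^*_\lambda$ maximizes the pointwise Lagrangian for $\mu$-a.e. $x$, and that the constraint is active at $u^*_\lambda$) is: for every feasible $u$,
$$J[u] \;\le\; \int_X \bigl[F(x, u(x)) - \lambda\, G(x, u(x))\bigr] \,\mathrm{d}\mu(x) + \lambda c \;\le\; \int_X \bigl[F(x, u^*_\lambda(x)) - \lambda\, G(x, u^*_\lambda(x))\bigr] \,\mathrm{d}\mu(x) + \lambda c \;=\; J[u^*_\lambda],$$
where the first inequality uses feasibility of $u$ and $\lambda \ge 0$, the second uses the pointwise maximality under the integral, and the final equality uses that the constraint is active. But I'm not sure whether this is really what the authors rely on (the recipe only imposes pointwise stationarity, not pointwise global maximality), or whether it covers the generality in which the method is applied.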