1. ## Advanced Methods in Applied Math

I am excited to start my second tutorial, this time on another subject. My first one was intended for people less knowledgeable in math; this one is intended for more serious learners of mathematics. Since there are more people who use math for applied reasons, and because applied mathematics is easier to learn than pure mathematics, I decided to go with the greater good and write one closer to applied math. It is assumed that the reader is well-versed in Multivariable Calculus and has a good understanding of multiple integration and partial differentiation.

This first lecture will be on Change of Variables. I chose this topic because it is usually not studied in a standard Calculus sequence and because I happen to think it is very useful at times. In single variable calculus, if you have an integral of the form $\int_a^b f(g(x)) \cdot g'(x) dx$, then by defining $u=g(x)$ this integral reduces to $\int_{g(a)}^{g(b)} f(u) du$; this is the popular and useful Substitution Rule. In multivariable calculus we have a similar situation with changing variables. However, things get inevitably more complicated because we are no longer integrating on a line; we are integrating in a plane (or in space). One thing we need to be able to do is figure out how the region of integration is transformed under a specific Change of Variables. We will adopt the following notation: $R_{xy}$ will stand for the region of integration in the $xy$-plane, and $R_{uv}$ will stand for the region of integration in the $uv$-plane. Note, we are first going to discuss double integrals, and once we have mastered those we can generalize the concept. The idea (which will be explained in much detail later) is that given a two variable function $f(x,y)$ we define two new variables $u=u(x,y)$ and $v=v(x,y)$, thereby transforming $f(x,y)$ into $f(u,v)$, which will hopefully be easier to deal with. There is just one thing we need to watch out for: we need to be sure that $u(x,y) \mbox{ and }v(x,y)$ define an invertible transformation on $R_{xy}$. The reason for this will become apparent; we will need to solve for $x,y$ in terms of $u,v$, which means we need to be able to invert the transformation.
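Before moving to two variables, here is a quick numerical sanity check of the single-variable Substitution Rule (a Python sketch of my own; the particular choices $f(u)=\sin u$ and $g(x)=x^2$ are just for illustration):

```python
import math

# Check numerically that the integral of f(g(x)) * g'(x) over [a, b]
# equals the integral of f(u) over [g(a), g(b)].
# Here f(u) = sin(u), g(x) = x^2, and [a, b] = [0, sqrt(pi)],
# so both sides should equal the integral of sin(u) over [0, pi], i.e. 2.

def riemann(func, a, b, n=100_000):
    """Midpoint Riemann sum of func over [a, b]."""
    h = (b - a) / n
    return sum(func(a + (i + 0.5) * h) for i in range(n)) * h

f = math.sin
g = lambda x: x * x
g_prime = lambda x: 2 * x

a, b = 0.0, math.sqrt(math.pi)
lhs = riemann(lambda x: f(g(x)) * g_prime(x), a, b)  # integral of f(g(x)) g'(x) dx
rhs = riemann(f, g(a), g(b))                         # integral of f(u) du

print(lhs, rhs)  # both close to 2.0
```

The two sums agree, which is exactly what the Substitution Rule promises.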

The following theorem is due to Carl Jacobi; its proof is way beyond the scope of this text (in fact I do not know it), but it can be found in most textbooks on Multivariable Analysis. Like all analysis theorems it needs well-behaved conditions to work, but to keep things simple for you and for myself we will not trouble ourselves with these details.

Change of Variables: Let $f$ be a continuous function on a region $R_{xy}$, and let a transformation be defined by $x=g(u,v) \mbox{ and }y=h(u,v)$ which is one-to-one (invertible). Let $R_{uv}$ be the region that this transformation maps onto $R_{xy}$, where $g,h$ are continuously differentiable on $R_{uv}$. Then,
$\iint_{R_{xy}} f(x,y) dA = \iint_{R_{uv}} f[g(u,v),h(u,v)] \cdot \left| \frac{\partial (x,y)}{\partial (u,v)} \right| dA$
Where, $\frac{\partial (x,y)}{\partial (u,v)}$ is called the "Jacobian" and is defined as,
$\left| \begin{array}{cc} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{array} \right| = \frac{\partial x}{\partial u}\cdot \frac{\partial y}{\partial v} - \frac{\partial y}{\partial u} \cdot \frac{\partial x}{\partial v}$.

Wow! That looks dangerous, and hard to use. Indeed, it is not so simple and many steps are required. But with practice this becomes useful, and I hope you will start using it from time to time. The best way to demonstrate the power of this theorem is through an example. Do not forget the absolute value of the Jacobian!

Example 1: Compute $\iint_{R_{xy}} \left( \frac{x-y}{x+y} \right)^3 dA$ where $R_{xy}$ is the triangular region in the first quadrant bounded by $x+y=1$. This is a really complicated integral without the use of the Jacobian. But the form of the integrand suggests that we define $u=x-y$ and $v=x+y$. (Note this transformation is invertible, so we can solve for the old variables.) Hence $x = \frac{v+u}{2}$ and $y=\frac{ v-u}{2}$. The next step is to compute the Jacobian: $\left| \begin{array}{cc} x_u & x_v \\ y_u & y_v \end{array} \right| = \left| \begin{array}{cc} 1/2 & 1/2 \\ -1/2 & 1/2 \end{array} \right| = \frac{1}{2}$. Now the trickiest part is to find the image of $R_{xy}$ under this transformation. Since these are linear functions we would expect the new region to look similar to the old one, i.e. a triangle. Books tend to look at where the vertices get mapped and connect them with lines. I favor a more rigorous approach, because if the functions are not so nice the new figure will be severely deformed. Look at the $R_{xy}$ below; note we can think of this region as the following system of inequalities: $x\geq 0 \mbox{ and }y\geq 0 \mbox{ and }x+y \leq 1$. Substitute our newly defined variables into this system to get $\frac{v+u}{2} \geq 0 \mbox{ and } \frac{v-u}{2} \geq 0 \mbox{ and } v \leq 1$. Thus, $v \geq - u \mbox{ and }v\geq u \mbox{ and }v\leq 1$. The resulting new region $R_{uv}$ is shown below. Hence the integral becomes $\frac{1}{2} \iint_{R_{uv}} \frac{u^3}{v^3} dA = \frac{1}{2} \int_0^1 \int_{-v}^v u^3v^{-3} du\, dv$. Now this is easily computable.
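As a quick numerical sanity check of Example 1 (a Python sketch of my own): the transformed integral evaluates to $0$, since $u^3$ is odd and is integrated over the symmetric interval $-v \leq u \leq v$. A grid sum of the original integrand over the triangle should therefore be close to $0$ as well:

```python
# Midpoint grid sum of ((x - y)/(x + y))^3 over the triangle
# x >= 0, y >= 0, x + y <= 1. By the change of variables in Example 1
# the exact value is 0, so the sum should be close to 0.

n = 500
h = 1.0 / n
total = 0.0
for i in range(n):
    for j in range(n):
        x = (i + 0.5) * h
        y = (j + 0.5) * h
        if x + y <= 1.0:
            total += ((x - y) / (x + y)) ** 3 * h * h

print(total)  # close to 0
```

Note the integrand is bounded on the triangle (the ratio $(x-y)/(x+y)$ stays in $[-1,1]$ there), so the grid sum is well behaved even near the origin.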

That is basically it. I just want to caution that if you get something like $x=v+u^2$ and $y=v^2+u$ then the transformation will deform the shape, because it is not linear. Hence a nice rectangle might end up as a region bounded by parabolas. So the main difficulty is determining the new region $R_{uv}$. If you use the approach above, writing the old region as a system of inequalities, then it should work well.

Example 2: Let $R_{xy}$ be described by $0\leq x \leq 1$ and $0\leq y\leq 1$, i.e. a square. Then the transformation $x=u+v$ and $y=u^2+v^2$ yields $0\leq u+v\leq 1 \mbox{ and } 0\leq u^2+v^2 \leq 1$. This strange transformation is shown below.

The generalized Change of Variables formula is similar. Instead of $f(x,y)$ we have $f(x,y,z)$, with a transformation $u=u(x,y,z)$, $v=v(x,y,z)$ and $w=w(x,y,z)$. We then need to find $R_{uvw}$, that is, the transformed region in space, which is much more difficult; furthermore the Jacobian still follows the same pattern, but it becomes a $3\times 3$ determinant.

Another Look at Polar Coordinates
The Jacobian provides a rigorous explanation of changing coordinates to polar form. Say we are integrating some function over the unit circle. In polar form we have $0\leq \theta \leq 2\pi \mbox{ and }0\leq r\leq 1$. But when we substitute into the integral we change $dx \, dy$ to $r\, dr\, d\theta$. Note a factor of $r$ appears in the expression, and Calculus students are warned not to forget it. But where does it come from? The answer is the Jacobian of the polar transformation. If we let $x = r\cos \theta$ and $y=r\sin \theta$ then $R_{\theta r} = \{(\theta,r)\mid 0\leq \theta \leq 2\pi \mbox{ and }0\leq r \leq 1\}$.
That is, we get a rectangle in the $\theta r$-plane. Now let us compute the Jacobian: $\left| \begin{array}{cc} -r\sin \theta & \cos \theta \\ r\cos \theta & \sin \theta \end{array} \right| = -r\sin^2 \theta - r\cos^2 \theta = -r$. But remember we take the absolute value of the Jacobian; thus, $|-r|=r$. And that is where this factor comes from.
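Here is a small numerical illustration of that factor of $r$ (my own sketch): computing the area of the unit disk by a Cartesian grid sum, and comparing with the polar integral $\int_0^{2\pi}\int_0^1 r\, dr\, d\theta = \pi$.

```python
import math

# Area of the unit disk two ways: a Cartesian midpoint grid sum of dx dy,
# and the polar integral of r dr dtheta = 2*pi * (1/2) = pi.

n = 1000
h = 2.0 / n
cartesian = 0.0
for i in range(n):
    for j in range(n):
        x = -1.0 + (i + 0.5) * h
        y = -1.0 + (j + 0.5) * h
        if x * x + y * y <= 1.0:
            cartesian += h * h

polar = 2 * math.pi * 0.5  # exact value of the polar integral

print(cartesian, polar)  # both close to pi
```

Without the factor of $r$ the polar integral would give $2\pi$, not $\pi$, so the Jacobian really is doing work here.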

Similarly, in Spherical Coordinates the Change of Variables is: $x = \rho\cos\theta \sin \phi \mbox{ and }y=\rho \sin \theta \sin \phi\mbox{ and } z=\rho \cos \phi$. Its absolute Jacobian is $\left| \frac{\partial (x,y,z)}{\partial (\rho,\theta,\phi)} \right| = \rho^2 \sin \phi$, but the messy determinant details are omitted. This shows where that factor comes from after a conversion to Spherical Coordinates.

Integrating Over an Ellipse
Integrating over a circle centered at the origin is at times ideal if converted to polar form. But what can we do if the region of integration is an ellipse $\frac{x^2}{a^2}+\frac{y^2}{b^2} \leq 1 \mbox{ with }a,b>0$? Can we nicely write this in polar form? The answer is yes (otherwise I would not mention it). Before, we used the Jacobian to simplify the integrand. But we can use a different tactic: instead of simplifying the integrand we simplify the region of integration. Define $u=\frac{x}{a} \mbox{ and }v=\frac{y}{b}$. That means $x=au \mbox{ and }y=bv$; then the new region shall be $u^2+v^2\leq 1$, a unit circle at the origin. Excellent! And what about the Jacobian? $\frac{\partial(x,y)}{\partial(u,v)} = \left| \begin{array}{cc} a & 0 \\ 0 & b\end{array} \right| = ab$.

Example 3: Consider $\iint_{R_{xy}} x^2+y^2 \, dA$ where $R_{xy}$ is the ellipse $\frac{x^2}{2^2}+\frac{y^2}{1^2} \leq 1$. Here $a=2$ and $b=1$, so the substitution $x = 2u \mbox{ and } y=v$ transforms the integral into $\iint_{R_{uv}} [4u^2+v^2 ]\cdot (2\cdot 1) dA$, which is easier to integrate, because we can easily express $R_{uv}$ in polar form, unlike the original region.
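We can check this numerically (my own sketch). With $x=2u$, $y=v$ and $|$Jacobian$|=2$, the transformed integral in polar form is $2\int_0^{2\pi}\int_0^1 (4r^2\cos^2\theta + r^2\sin^2\theta)\, r\, dr\, d\theta = \frac{1}{2}(4\pi + \pi) = \frac{5\pi}{2}$, and a direct grid sum over the ellipse should agree:

```python
import math

# Direct midpoint grid sum of x^2 + y^2 over the ellipse x^2/4 + y^2 <= 1,
# compared with the transformed polar integral, whose exact value is 5*pi/2.

n = 1000
hx, hy = 4.0 / n, 2.0 / n
direct = 0.0
for i in range(n):
    for j in range(n):
        x = -2.0 + (i + 0.5) * hx
        y = -1.0 + (j + 0.5) * hy
        if x * x / 4.0 + y * y <= 1.0:
            direct += (x * x + y * y) * hx * hy

transformed = 5 * math.pi / 2
print(direct, transformed)  # both close to 7.854
```

The agreement confirms both the substitution and the Jacobian factor of $2$.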

Integrating Over a Shifted Circle
Same idea, except now we have a disk $(x-x_0)^2+(y-y_0)^2 \leq r^2 \mbox{ with }r>0$ as $R_{xy}$. If we define $u=x-x_0 \mbox{ and }v=y-y_0$ then this transforms the region of integration to $u^2+v^2\leq r^2$, a disk centered at the origin, i.e. much more pleasant to integrate over. Thus, the Change of Variables was $x = u+x_0 \mbox{ and }y=v+y_0$. Note the Jacobian is $\frac{\partial (x,y)}{\partial (u,v)} = \left| \begin{array}{cc} 1 & 0 \\ 0& 1 \end{array} \right| = 1$. Hence we can simply perform a parallel coordinate shift without changing anything in the integral, as expected.

Example 4: Consider $\iint_{R_{xy}} f(x,y) dA$ where $R_{xy}$ is the disk $(x-1)^2+(y+1)^2 \leq 4$. The Change of Variables $x = u+1 \mbox{ and }y=v-1$ transforms the integral into $\iint_{R_{uv}} f(u+1,v-1)dA$ where $R_{uv}$ is the disk $u^2+v^2 \leq 4$.
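A small numerical illustration of Example 4 (my own sketch; the sample integrand $f(x,y)=x+y^2$ is my choice, not from the example): the grid sum of $f$ over the shifted disk should match the grid sum of $f(u+1,v-1)$ over the centered disk.

```python
import math

# Grid sum of f(x, y) = x + y^2 over the shifted disk (x-1)^2 + (y+1)^2 <= 4,
# versus the grid sum of f(u+1, v-1) over the centered disk u^2 + v^2 <= 4.
# Both equal 12*pi exactly (the shift has Jacobian 1).

def f(x, y):
    return x + y * y

n = 600
h = 4.0 / n
shifted = 0.0
centered = 0.0
for i in range(n):
    for j in range(n):
        # grid over the shifted disk's bounding box [-1, 3] x [-3, 1]
        x = -1.0 + (i + 0.5) * h
        y = -3.0 + (j + 0.5) * h
        if (x - 1.0) ** 2 + (y + 1.0) ** 2 <= 4.0:
            shifted += f(x, y) * h * h
        # grid over the centered disk's bounding box [-2, 2] x [-2, 2]
        u = -2.0 + (i + 0.5) * h
        v = -2.0 + (j + 0.5) * h
        if u * u + v * v <= 4.0:
            centered += f(u + 1.0, v - 1.0) * h * h

print(shifted, centered)  # both close to 12*pi ~ 37.699
```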

Exercises
~~~

1) $R_{xy}=\{|x|\leq 1 \mbox{ and }|y|\leq 1\}$, compute: $\iint_{R_{xy}} (x-y)^5(x+y)^{10} dA$

2) $R_{xy}$ same as Example 1, compute: $\iint_{R_{xy}} (x+y)e^{x^2-y^2}dA$

3) Sometimes it might be convenient to rotate the region of integration. The reader probably knows that the rotation formula by angle $\theta$ is given by:
$\left\{ \begin{array}{c} x=u\cos \theta - v\sin \theta \\ y= u\sin \theta + v\cos \theta \end{array} \right\}$.
Compute the Jacobian.

2. This is very useful information, but there are a few things I feel I must point out.

Originally Posted by ThePerfectHacker
$\left| \begin{array}{cc} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{array} \right| = \frac{\partial x}{\partial y}\cdot \frac{\partial y}{\partial v} - \frac{\partial y}{\partial u} \cdot \frac{\partial x}{\partial v}$

$= \frac{\partial x}{\partial u}\cdot \frac{\partial y}{\partial v} - \frac{\partial y}{\partial u} \cdot \frac{\partial x}{\partial v}$.

You accidentally put $\frac{\partial x}{\partial y}$

Example 1: ...

Look at the $R_{xy}$ below, note we can think of this region as the following system of inequalities $x\geq 0 \mbox{ and }y\geq 0 \mbox{ and }x+y \leq 1$. Substitute our new defined variables into this system to get $\frac{v+u}{2} \geq 0 \mbox{ and } \frac{v-u}{2} \geq 0 \mbox{ and } v \leq 1$. Thus, $v \geq - u \mbox{ and }v\geq u \mbox{ and }v\leq 1$.
Will the transformation work any time I do this? And more than that, will the transformation be easy to work with any time I set up these inequalities? In your second example, you did a non-linear transformation, but you did not show how the new transformation could be used to derive the domain.

Example 2: Let $R_{xy}$ be described as $0\leq x \leq 1$ and $0\leq y\leq 1$, i.e. a square. Then the transformation $x=u+v$ and $y=u^2+v^2$ yield $0\leq u+v\leq 1 \mbox{ and } 0\leq u^2+v^2 \leq 1$.
Can you finish this example just so I can verify what I believe the answer would be?

Integrating Over an Ellipse ....

Integrating over a circle centered at the origin is ideal at times if converted to polar form. But what can we do if the region of integration is an ellipse $\frac{x^2}{y^2}+\frac{y^2}{b^2} \leq 1 \mbox{ with }a,b>0$?
Another typo. You wrote: $\frac{x^2}{y^2}$ but clearly meant to write: $\frac{x^2}{a^2}$

3)Sometimes it might be convinent to rotate the region of integration. The reader probably knows that the rotation formula by angle $\theta$ is given by:
$\left\{ \begin{array}{c} x=u\cos \theta - v\sin \theta \\ y= u\sin \theta + v\cos \theta \end{array} \right\}$.
Compute the Jacobian.
I only learned about rotations when tutoring (I literally learned it while tutoring and reading the student's text book). Of course I referred the student to another tutor for that particular help, but I read up on it for my own benefit and was later able to help other students in that class. But even still, I am somewhat limited in my understanding of rotations.

Anyways, nice tutorial. It was helpful and informative. (I had forgotten some of what you mentioned from when I learned it in Calculus 3).

3. Originally Posted by ecMathGeek
$= \frac{\partial x}{\partial u}\cdot \frac{\partial y}{\partial v} - \frac{\partial y}{\partial u} \cdot \frac{\partial x}{\partial v}$.
Fixed.

You accidentally put $\frac{\partial x}{\partial y}$
Apparently I had too much to drink when I was typing this. Next time I will be more careful.

Will the transformation work any time I do this? And more than that, will the transformation be easy to work with any time I set up these inequalities? In your second example, you did a non-linear transformation, but you did not show how the new transformation could be used to derive the domain.
I do not really understand the question. I was explaining that the nicest way to do this is through a system of inequalities, which works all the time.
Can you finish this example just so I can verify what I believe the answer would be?
I did not compute an integral with this. I was just showing that non-linear transformations may deform a region.

Anyways, nice tutorial. It was helpful and informative. (I had forgotten some of what you mentioned from when I learned it in Calculus 3).
I hope you learned something.

This is just the beginning ....

4. What I wanted you to do for the second example is show what the limits of integration would become (even though there is no integration to be done).

5. Originally Posted by ecMathGeek
Will the transformation work any time I do this? And more than that, will the transformation be easy to work with any time I set up these inequalities? In your second example, you did a non-linear transformation, but you did not show how the new tranformation could be used to derive the domain.
I should have been a bit more clear in how I worded this question. What I meant to ask is will using inequalities in this way (as you did in that example) always be sufficient to determine the region of the newly transformed domain?

6. Originally Posted by ecMathGeek
What I wanted you to do for the second example is show what the limits of integration would become (even though there is no integration to be done).
That is a bad region to integrate over. You need to break the region into two open disjoint regions and do each one separately.

7. Originally Posted by ThePerfectHacker
That is a bad region to integrate over. You need to break the region into two open disjoint regions and do each one separately.
I was actually thinking that when I saw what the region looked like, but then this brings me back to my original question: can inequalities be used in every case to find the transformed domain? I will attempt something in a moment with that example, and you can tell me if it looks correct.

Nevermind. What I intended to do only complicated the problem a lot. It would certainly not be worth attempting in a real situation where I would have to apply this.

8. Here is another useful technique that appears in standard Calculus books but is not taught in a Calculus course. The method is optimization of multivariable functions under a certain constraint. Here is an example: given $f(x,y)=1-x^2-y^2$, find what values of $x,y$ optimize the function (bring it to maximum and minimum values) given the constraint that $x+y=2 \mbox{ with }x,y\geq 0$. Yes, we can solve for one of the variables, substitute into the function, and treat this as a single variable optimization problem by setting the derivative equal to zero. But sometimes we cannot do that. Sometimes we do not want to do that. Here is a wonderful method due to a great mathematician, Joseph Louis Lagrange. (David Burton, Elementary Number Theory, describes him nicely as "Italian by birth, German by adoption, French by choice".)

Before I state the theorem let me mention the notation I shall use: $f(x,y)$ will represent the function that we wish to optimize, and $g(x,y) = c$ shall represent the "constraint curve", which we call $C$; in the example above $g(x,y)=x+y \mbox{ and }c=2$. As you know, many analysis theorems require certain well-behaved conditions: here $f,g$ have continuous first partial derivatives and $C$ is smooth. But we will not trouble ourselves with that.

Theorem (Lagrange): If $f$ has an extremum at $(x_0,y_0)$ on the constraint curve, and if $\nabla g(x_0,y_0)\not = \bold{0}$, then there is a real number $k$ so that:
$\nabla f(x_0,y_0) = k \nabla g(x_0,y_0)$.

Proof: We can represent the constraint curve $C$ by a vector function $\bold{R}(t) = x(t)\bold{i}+y(t)\bold{j}$ for all $t$ in some open interval where we are working. Define the function $F(t) = f(x(t),y(t))$ for all $t$ on this interval and then use the multivariable chain rule to obtain:
$F'(t) = \frac{\partial f}{\partial x} \cdot \frac{dx}{dt} + \frac{\partial f}{\partial y}\cdot \frac{dy}{dt} = \nabla f(x(t),y(t)) \cdot \bold{R}'(t)$. Because $f(x,y)$ has an extremum at $(x_0,y_0)$, it follows that $F(t)$ has an extremum at the corresponding $t_0$. Hence $F'(t_0)=0$, which means that $\nabla f(x(t_0),y(t_0)) \cdot \bold{R}'(t_0) = 0$. If $\nabla f(x(t_0),y(t_0))=\bold{0}$ then the theorem is proved; just pick $k=0$. If $\nabla f(x(t_0),y(t_0)) \not = \bold{0}$ then $\nabla f(x(t_0),y(t_0))$ is orthogonal to $\bold{R}'(t_0)$ (this is where we use the fact that the curve is smooth, so $\bold{R}'(t_0)\not=\bold{0}$, just in case you are interested). Since $\bold{R}'(t_0)$ is tangent to the constraint curve $C$, it must mean that $\nabla f(x_0,y_0)$ is orthogonal to $C$. But $\nabla g(x_0,y_0)$ is also orthogonal to $C$ (that is the normal property of gradients; consult your Calculus book if you forgot). So $\nabla f(x_0,y_0)$ and $\nabla g(x_0,y_0)$ are both orthogonal (perpendicular) to $C$ at $(x_0,y_0)$, which means that $\nabla f(x_0,y_0) \mbox{ and }\nabla g(x_0,y_0)$ are parallel to each other. Parallel vectors are scalar multiples, and hence $\nabla f(x_0,y_0) = k \nabla g(x_0,y_0)$ for some $k$. This finishes the proof.

The above theorem shows how to do this. Say we have a function $f(x,y)$ with constraint $g(x,y)=c$; to find the extrema the necessary condition says that $\nabla f(x,y) = k\nabla g(x,y)$ for some $k$. Write this out in terms of vectors: $\left< \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right> = k \left< \frac{\partial g}{\partial x}, \frac{\partial g}{\partial y} \right>$. This leads us to the following system of equations,
$\left\{ \begin{array}{c} f_x(x,y) = k g_x(x,y) \\ f_y(x,y) =k g_y(x,y) \end{array} \right\}$.
I know what you might be thinking: we have three variables $x,y,k$ but only two equations. But we are forgetting the constraint equation $g(x,y)=c$. So we can at least feel safe that we can solve the system.

Example 5: Let us solve the optimization problem posed above: $f(x,y) = 1-x^2-y^2$ with $x+y=2 \mbox{ and }x,y\geq 0$. Note, $\nabla f = <-2x,-2y> \mbox{ and }\nabla g = <1,1>$. Thus, we need to solve,
$\left\{ \begin{array}{c} -2x = k \\ -2y = k \\ x+y =2 \end{array} \right\}$.
It is easy to see that $x=1 \mbox{ and } y = 1 \mbox{ and }k=-2$. We do not care about the $k$ value, except for solving the equations. Hence a possible extremum point is $(1,1)$. As with standard optimization problems we need to check the endpoints here as well. The endpoints for $x+y=2 \mbox{ with }x,y \geq 0$ are $(0,2) \mbox{ and }(2,0)$. Now we evaluate $f(x,y)$ at $(1,1),(0,2),(2,0)$ and compare values to see which one is the largest.
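To finish Example 5, here is the comparison of values as a small Python sketch (my own illustration):

```python
# Evaluate f at the Lagrange candidate (1, 1) and at the endpoints
# of the constraint segment x + y = 2, x, y >= 0.

def f(x, y):
    return 1 - x * x - y * y

candidates = [(1, 1), (0, 2), (2, 0)]
values = {p: f(*p) for p in candidates}
print(values)  # {(1, 1): -1, (0, 2): -3, (2, 0): -3}

best = max(values, key=values.get)
print(best)  # (1, 1) -- the maximum on the segment
```

So the constrained maximum is $f(1,1)=-1$, and the endpoints give the minimum value $-3$.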

The proof above does not rely on the fact that $f$ is a function of two variables, indeed, it can be several variables. And the theorem generalizes: $\nabla f(x_1,x_2,...,x_n) = k \nabla g(x_1,x_2,...,x_n)$.

Example 6: Here is an example involving a function of three variables, $f(x,y,z) = 2 - x^2 - y^2 - 2z^2$, given that $0 \leq x,y,z\leq 1$, with constraint $g(x,y,z) = x+y+z \mbox{ and } c =1$. Note $g(x,y,z) = c$, that is, $x+y+z = 1$, is a plane. Since we are working in $0\leq x,y,z \leq 1$, the portion of the plane we care about is the triangle in the first octant with vertices $A(1,0,0) , B(0,1,0), C(0,0,1)$. Now Lagrange multipliers will tell us the possible extrema inside this region. We will still have to check the edges.
We need to solve $\nabla f(x,y,z) = k \nabla g(x,y,z)$, meaning $<-2x,-2y,-4z> = k<1,1,1>$. This leads us to the equations (including the constraint):
$\left\{ \begin{array}{c} -2x = k \\ -2y = k \\ -4z = k \\ x+y+z = 1\end{array} \right\}$. The solution is quite simple: $\left( \frac{2}{5}, \frac{2}{5} , \frac{1}{5} \right)$. However, we are not done. We still need to check the boundary itself. We have three edges of this triangular region: $AB,AC,BC$. Edge $AB$ is described by the equations $x+y=1, z=0$, and there $f(x,y,0) = 2 - x^2 - y^2$. We could use Lagrange multipliers again here with function $2-x^2-y^2$ and constraint $x+y=1$, but instead it is easier to solve $y=1-x$ and substitute: $AB(x) = 2-x^2 -(1-x)^2$. Now we seek extrema on the interval $0\leq x \leq 1$. Differentiate, $AB'(x) = -4x+2$, and solve $-4x+2 = 0$; thus $x = 1/2$, so $y = 1 - 1/2 = 1/2$, which implies a possible extremum point on $AB$ is $\left(\frac{1}{2},\frac{1}{2},0\right)$. But we are still not done with this edge! We still have to check its endpoints (remember in Calculus I you were required to check endpoints too; same idea here), which are $x = 0 \mbox{ and } x=1$, corresponding to $y=1 \mbox{ and }y=0$ respectively; thus $(0,1,0) \mbox{ and }(1,0,0)$ are possible points. Now we do edge $AC$. Edge $AC$ is described by $x+z=1, y=0$, and there $f(x,0,z) = 2 - x^2 - 2z^2$. Again we could use Lagrange multipliers on $2 - x^2 - 2z^2$ with constraint $x+z=1$, but it is easier to replace one of the variables as above. Thus $AC(x) = 2 - x^2 - 2(1-x)^2$. To find the extrema, compute $AC'(x) = -6x + 4$ and solve $-6x + 4 =0$; thus $x = \frac{2}{3}$, which implies $z = \frac{1}{3}$, corresponding to the point $\left(\frac{2}{3},0,\frac{1}{3}\right)$ on edge $AC$. But we are still not done with this edge; we need to consider the endpoints, $(0,0,1) \mbox{ and }(1,0,0)$. Finally we do edge $BC$.
Again we follow the same procedure as above. This edge is described by $y+z=1, x=0$, and there $f(0,y,z) = 2 - y^2 - 2z^2$. Solve $z=1-y$ and substitute: $BC(y) = 2 - y^2 - 2(1-y)^2$; find $BC'(y) = -6y+4$ and solve $-6y+4=0$; thus $y = \frac{2}{3}$, which implies $z=\frac{1}{3}$, corresponding to the point $\left( 0, \frac{2}{3} , \frac{1}{3}\right)$ on edge $BC$. And let us not forget the endpoints $(0,0,1) \mbox{ and }(0,1,0)$. As a result we have the following critical points (listing each vertex once):
$\left\{ \begin{array}{c} (2/5,2/5,1/5) \\ (1/2 , 1/2 , 0) \\ (2/3,0,1/3) \\ (0,2/3,1/3) \\ (0,0,1) \\ (0,1,0) \\ (1,0,0) \end{array} \right\}$.
Now we just evaluate the function at each of these points to determine which value is largest and which is smallest.
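Here is that final evaluation as a Python sketch (my own illustration; exact fractions are used to avoid rounding):

```python
from fractions import Fraction as F

# Evaluate f(x, y, z) = 2 - x^2 - y^2 - 2z^2 at each critical point
# found in Example 6 (the interior candidate, the three edge candidates,
# and the three vertices).

def f(x, y, z):
    return 2 - x * x - y * y - 2 * z * z

points = [
    (F(2, 5), F(2, 5), F(1, 5)),
    (F(1, 2), F(1, 2), F(0)),
    (F(2, 3), F(0), F(1, 3)),
    (F(0), F(2, 3), F(1, 3)),
    (F(0), F(0), F(1)),
    (F(0), F(1), F(0)),
    (F(1), F(0), F(0)),
]
values = [f(*p) for p in points]
for p, val in zip(points, values):
    print(p, val)
# maximum is 8/5 at (2/5, 2/5, 1/5); minimum is 0 at (0, 0, 1)
```

So the constrained maximum is $\frac{8}{5}$ at the interior Lagrange point, and the minimum is $0$ at the vertex $(0,0,1)$.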

The nice thing about Lagrange Multipliers is that they can be extended to two or more constraints. When dealing with a function of two variables there is usually one constraint. When dealing with a function of three variables there can be one constraint, like in the example above, but there can also be a second constraint. So given $f(x,y,z)$ with constraints $g(x,y,z)=c_1 \mbox{ and }h(x,y,z)=c_2$, the method states that we look for solutions $(x,y,z)$ to the equation:
$\nabla f(x,y,z) = k_1 \nabla g(x,y,z) + k_2 \nabla h(x,y,z)$.
This is a much nastier equation to deal with, because $x,y,z,k_1,k_2$ are all variables. The extended Lagrange equation provides us with three equations, and the two constraints add up to a total of five equations in five variables. Which is usually what we want, but still, it is much more computational work to solve.

Example 7: We will not solve this, but merely set up the system. The system is solvable by analytic methods, but I am not going to go through that because I assume the reader is experienced enough to do it himself. Say we want to find which point on the intersection of the plane $x+y+z=1$ and the paraboloid $z=x^2+y^2$ is closest to the origin. So the function we want to minimize is the distance to the origin, $s(x,y,z) = \sqrt{x^2+y^2+z^2}$. I am sure you know the trick that it is easier to minimize $s^2$ rather than $s$. So we wish to minimize $f(x,y,z) = x^2+y^2+z^2$. But the conditions are that the point $(x,y,z)$ lies both on the plane $x+y+z=1$ and on the paraboloid $z=x^2+y^2$, meaning the constraints are $g(x,y,z) = x+y+z \mbox{ and }h(x,y,z) = z-x^2-y^2$ with $c_1=1 \mbox{ and }c_2=0$. Now we want to solve $\nabla f (x,y,z) = k_1\nabla g(x,y,z) + k_2 \nabla h(x,y,z)$. That is (together with the constraint equations):
$\left\{ \begin{array}{c} 2x = k_1 - 2k_2 x \\ 2y = k_1 - 2k_2 y \\ 2z = k_1+k_2 \\ x+y+z = 1 \\ z-x^2-y^2 = 0\end{array} \right\}$.
Here, the nice thing is that unlike in the other examples where we checked endpoints, here the constraint set is one continuous loop, so there are no endpoints. And the solution(s) to this system will produce the minimal value.
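Although the text leaves the system unsolved, here is a numerical follow-up (my own sketch, with my own parametrization). Substituting $z=1-x-y$ into $z=x^2+y^2$ shows the intersection projects to the circle $(x+\frac{1}{2})^2+(y+\frac{1}{2})^2=\frac{3}{2}$, so we can sample the loop and minimize $s^2$ directly; the first two Lagrange equations suggest the symmetric solution $x=y=\frac{\sqrt{3}-1}{2}$, which the sampling should reproduce.

```python
import math

# Minimize s^2 = x^2 + y^2 + z^2 over the intersection curve by sampling
# its projection, the circle (x + 1/2)^2 + (y + 1/2)^2 = 3/2.

r = math.sqrt(1.5)

def s_squared(t):
    x = -0.5 + r * math.cos(t)
    y = -0.5 + r * math.sin(t)
    z = 1.0 - x - y  # on the plane, and automatically on the paraboloid
    return x * x + y * y + z * z

n = 200_000
best = min(s_squared(2 * math.pi * k / n) for k in range(n))

# analytic candidate from the symmetric solution x = y = (sqrt(3) - 1)/2
x = (math.sqrt(3) - 1) / 2
z = 1 - 2 * x
expected = 2 * x * x + z * z
print(best, expected)  # the two values agree closely
```

The agreement confirms that the symmetric point is indeed the closest point on the loop.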

Exercises
~~~
1) Maximize $f(x,y) =xy$ subject to $2x+2y=5$.

2) Maximize $f(x,y) =9-x^2-y^2$ subject to $x+2y=6$.

3) Given the plane $2x+y+z=2$, find the point closest to the origin.

4) Using Lagrange multipliers show the famous Arithmetic Mean-Geometric Mean inequality.

9. The next series of lessons will be devoted to the basics of Vector Calculus (or Vector Analysis). Just a note: when physicists/engineers say "field theory" they refer to Vector Field Theory (this). When mathematicians say "field theory" they usually refer to something totally and completely different. I mention that because I once got into a confusion with someone: he was thinking about Vector Calculus and I was thinking about something else.

So what is Vector Calculus? The authors of my book (Bradley and Smith) put it as "the marriage between vectors and calculus". It is merely doing Calculus involving vectors instead of real numbers. Vector Calculus is of extreme importance in applied mathematics, especially in fluid mechanics. One of the great \$1,000,000 math problems concerns the "Navier-Stokes Equations", which can be regarded in some sense as being part of Vector Calculus. I read an interesting comment in a book (Joos) that says "...vector analysis is a field largely developed by physicists...". I am not sure how true that is, but it may be that the initial concepts were created by physicists and later improved by mathematicians.

There is some strange looking notation in Vector Calculus; we will not really use it that much, but I will mention it in case you come across it somewhere else. First we need to say what a vector field is. From Multivariable Calculus the reader perhaps remembers what a "vector function" is, i.e. a function which maps a number into a vector. A vector field is similar.

Definition: A "vector field" is a multivariable function mapping a pair of numbers (or a triple) into a vector.

Example 8: I was not so rigorous with the above definition; instead, I think an example will be better: $\bold{F}(x,y) = 2xy\bold{i} + xy^2\bold{j}$. Notice what is going on. The notation $\bold{F}(x,y)$ represents a vector field, and $x,y$ show it is a vector field of two variables. The expression on the right is the actual formula that tells us what the vector is. So $\bold{F}(1,1) = 2\bold{i}+ \bold{j}$. You can also have $\bold{F}(x,y,z) = x\bold{i}+y\bold{j}+xyz\bold{k}$; this one is based on three variables.

The notation that we will use is: $\bold{F}(x,y) = u(x,y)\bold{i}+v(x,y)\bold{j}$ for two variables and
$\bold{F}(x,y,z) = u(x,y,z)\bold{i}+v(x,y,z)\bold{j}+w(x,y,z)\bold{k}$ for three variables.

Vector fields are really useful in physics. Here is a picture. The vector field equation here is $\bold{F}(x,y) = -y\bold{i}+x\bold{j}$. Geometrically it means this: pick a point, say $(1,2)$; then the corresponding vector is $\bold{F}(1,2) = -2\bold{i}+\bold{j}$, which just means we draw this vector (with correct direction and magnitude) at that point. Now do that for many many points (just like plotting graphs) and you get this nice looking picture. As I said, these vector fields are really useful in physics. For example, this can represent a fluid flow. In that diagram the fluid seems to circulate around the origin, with more speed as the distance increases. Another instance are charts that show wind speed and direction; again they can be described by these vector field diagrams. Powerful math software such as MATLAB can graph these vector fields.

Now we get to some strange notation that exists in Vector Analysis. First, $\mbox{grad}$: it means "gradient" and is the same operation as $\nabla$. So if $f(x,y)=x^2+y^2$ then $\mbox{grad} (f) = 2x\bold{i}+2y\bold{j}$. Note the important fact that the grad operator transforms a multivariable function into a vector field. There is another operation that does the opposite, i.e. transforms a vector field into a multivariable function; it is called the "divergence" and written as $\mbox{div}$.

Definition: If $\bold{F}(x,y) = u(x,y)\bold{i}+v(x,y)\bold{j}$ then $\mbox{div}\bold{F} = \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y}$. Analogously, in three variables, if $\bold{F}(x,y,z) = u(x,y,z)\bold{i}+v(x,y,z)\bold{j}+w(x,y,z)\bold{k}$ then $\mbox{div}\bold{F} = \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z}$.
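As a quick check of the definition (my own Python sketch), we can compare the divergence of the field $\bold{F}(x,y)=2xy\bold{i}+xy^2\bold{j}$ from Example 8, which works out to $2y+2xy$, against a finite-difference approximation:

```python
# Numerical check of div F = du/dx + dv/dy for F(x, y) = 2xy i + xy^2 j.
# Analytically, div F = 2y + 2xy.

def u(x, y):
    return 2 * x * y

def v(x, y):
    return x * y * y

def div_numeric(x, y, h=1e-5):
    """Central-difference approximation of du/dx + dv/dy."""
    return ((u(x + h, y) - u(x - h, y)) / (2 * h)
            + (v(x, y + h) - v(x, y - h)) / (2 * h))

x0, y0 = 1.0, 1.0
print(div_numeric(x0, y0), 2 * y0 + 2 * x0 * y0)  # both close to 4.0
```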

The div operator has some applications in fluid mechanics (not that I am an expert on it). It can be used to test how a fluid behaves at a point. If the divergence is positive, fluid is emerging from a source at that point. If the divergence is negative, fluid is draining into a sink at that point (such as a hole in the bathtub). And if the divergence is zero everywhere, the flow gets a special name: "incompressible". That does not mean the fluid just sits there; it means the fluid neither accumulates at nor drains from any point.

A scary looking operation is the "curl", which measures the circulation of the field and is written $\mbox{curl}$. Notice, it takes a vector field and maps it into another vector field.

Definition: Given $\bold{F}(x,y,z)=u(x,y,z)\bold{i}+v(x,y,z)\bold{j}+ w(x,y,z)\bold{k}$ then $\mbox{curl}\bold{F} = \left( \frac{\partial w}{\partial y} - \frac{\partial v}{\partial z}\right)\bold{i}+ \left( \frac{\partial u}{\partial z} - \frac{\partial w}{\partial x} \right) \bold{j} + \left( \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y} \right) \bold{k}$.

Wow! How do we memorize that? Do not worry, I will introduce another operator which will make the pattern appear simple, both for the divergence formula and the curl formula. The curl is also used in fluid mechanics. When the curl is zero everywhere the flow is sometimes called "irrotational", which physically means the fluid is not spinning, i.e. rotating. Otherwise it is rotational.

In Vector Analysis there is a useful symbol called the "del operator", which in reality is a nonsense symbol but extremely useful: $\nabla = \frac{\partial }{\partial x}\bold{i}+\frac{\partial }{\partial y}\bold{j}$. Why is it nonsense? Because we have a partial derivative there without any function to differentiate; think of it as writing a square root sign without putting anything inside to extract. The three dimensional analogue is the same, i.e., $\nabla = \frac{\partial }{\partial x}\bold{i} + \frac{\partial }{\partial y}\bold{j} + \frac{\partial }{\partial z}\bold{k}$. So say we want to find the divergence; we can write $\mbox{div}\bold{F} = \nabla \cdot \bold{F}$. Note: this is not the grad of $\bold{F}$, it is the del operator dotted with the vector field. Though this operation does not really make sense, note what we formally get:
$\nabla \cdot \bold{F} = \left( \frac{\partial }{\partial x}\bold{i}+\frac{\partial }{\partial y}\bold{j} \right) \cdot \left( u(x,y)\bold{i} +v(x,y)\bold{j} \right) = \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = \mbox{div}\bold{F}$. This was only the two dimensional case, the same idea is with the three dimensional vector field.
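As a quick sanity check of this formula, here is a short symbolic sketch (sympy and the sample field are my own additions, not part of the lecture):

```python
# Symbolic check of div F = du/dx + dv/dy for a sample 2D field.
# (The field u = x*y, v = x - y^2 is an assumption of mine.)
from sympy import symbols, diff

x, y = symbols('x y')

u = x*y        # i-component of F
v = x - y**2   # j-component of F

div_F = diff(u, x) + diff(v, y)  # y + (-2y)
print(div_F)  # -y
```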

The del operator can also be used to remember the curl formula. Look at what happens if we perform the nonsense operation:
$\nabla \times \bold{F} = \left| \begin{array}{ccc} \bold{i}&\bold{j}&\bold{k} \\ \partial / \partial x & \partial / \partial y & \partial / \partial z \\ u & v & w \end{array} \right|$ $= \left( \frac{\partial w}{\partial y} - \frac{\partial v}{\partial z}\right)\bold{i}+ \left( \frac{\partial u}{\partial z} - \frac{\partial w}{\partial x} \right) \bold{j} + \left( \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y} \right) \bold{k} = \mbox{curl }\bold{F}$.
Here I used a three dimensional vector field because the cross product is defined only for three dimensional vectors.

The final operation is called the "del squared" operation. It is merely the composition $\mbox{div}(\mbox{grad} u ) = \mbox{div} \left( \frac{\partial u}{\partial x}\bold{i}+ \frac{\partial u}{\partial y}\bold{j} \right) = \frac{\partial ^2 u}{\partial x^2} + \frac{\partial ^2 u}{\partial y^2}$. It gets its name because if we write grad as $\nabla u$ and div with the del operator then we have $\nabla \cdot \nabla u = \nabla^2 u$; though the dels are not numbers and we cannot really multiply them, we just write this as shorthand notation. This is also called the "Laplacian operator", and any function (function!, not vector field) $u(x,y)$ which satisfies $\nabla^2 u = 0$, that is, $\frac{\partial^2 u}{\partial x^2} + \frac{\partial ^2 u}{\partial y^2} = 0$, i.e. the "Laplace equation", is called a "harmonic function". Harmonic functions and the Laplacian operator are very important in mathematical physics; they appear in almost every partial differential equation.
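To see a harmonic function in action, here is a small sketch (sympy is assumed to be available; the function $e^x \sin y$ is a classic example of a harmonic function, my choice rather than the lecture's):

```python
# Check that u = e^x * sin(y) satisfies the Laplace equation.
from sympy import symbols, exp, sin, diff

x, y = symbols('x y')
u = exp(x)*sin(y)

# del squared u = u_xx + u_yy; the two second derivatives cancel exactly
laplacian = diff(u, x, 2) + diff(u, y, 2)
print(laplacian)  # 0
```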

Example 9: Let $\bold{F}(x,y) = (x+y)\bold{i} + (x^2-y^2)\bold{j}$.
Then $\mbox{div}\bold{F} = \nabla \cdot \bold{F} = \frac{\partial (x+y)}{\partial x}+ \frac{\partial (x^2-y^2)}{\partial y} = 1-2y$.

Example 10: Let $\bold{F}(x,y,z) = x\bold{i}+y\bold{j}+z\bold{k}$ then $\mbox{curl}\bold{F} = \nabla \times \bold{F} = \left| \begin{array}{ccc}\bold{i}&\bold{j}&\bold{k} \\ \partial/\partial x& \partial/\partial y& \partial/\partial z\\ x&y&z \end{array} \right|$ = $0\bold{i}+0\bold{j}+0\bold{k}=\bold{0}$. Hence, in this case $\bold{F}$ is irrotational.
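The curl computation in Example 10 can be checked mechanically. Below is a hedged sketch (sympy is assumed available; the helper name `curl` and the second sample field are my own) built directly from the component formula for the curl:

```python
# Curl from the component definition, returned as a tuple of components.
from sympy import symbols, diff

x, y, z = symbols('x y z')

def curl(u, v, w):
    """curl of F = u i + v j + w k."""
    return (diff(w, y) - diff(v, z),
            diff(u, z) - diff(w, x),
            diff(v, x) - diff(u, y))

# Example 10's radial field is irrotational:
print(curl(x, y, z))   # (0, 0, 0)

# A rigidly rotating field is not (my own extra example):
print(curl(-y, x, 0))  # (0, 0, 2)
```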

There really is nothing so interesting in this lecture; I just stated these definitions because the notations do appear, so the reader will know what they mean.

Exercises
~~~
Prove the following identities.

1) $\mbox{curl}(\bold{F}+\bold{G})=\mbox{curl}\bold{F} + \mbox{curl}\bold{G}$
2) $\mbox{curl}(f\bold{F})=f\mbox{curl}\bold{F}+(\mbox {grad}f \times \bold{F})$
3) $\mbox{div}(f\bold{F})=f\mbox{div}\bold{F}+(\mbox{grad}f\cdot \bold{F})$
4) $\mbox{curl}(\mbox{grad}f)=\bold{0}$
5) $\mbox{div}(\mbox{curl}\bold{F})=0$

10. Two questions:

You are mostly correct in your comments about "field theory" as used by Physicists, but in Quantum Field Theory a field is not merely a vector field, but (physically) a space-time continuous set of oscillators that obey certain commutation (or anti-commutation) relations. I'm not at all certain of the precise Mathematical definition of such constructs. But I will say that even this does not match the Mathematical definition of a field. As I have never been able to find a definition of a field that I've been able to understand, could you take a moment to explain this?

Also, the grad and div operators have a fairly simple meaning in terms of what they are in reference to the function they operate on. I have never run across a clear meaning of what the curl operator does/means in reference to the function it operates on. Do you have any insights to share on this?

Thanks!

-Dan

11. Originally Posted by topsquark
You are mostly correct in your comments about "field theory" as used by Physicists, but in Quantum Field Theory a field is not merely a vector field, but (physically) a space-time continuous set of oscillators that obey certain commutation (or anti-commutation) relations.
I do not know what this is, sorry. Maybe I should correct my sentences somehow.

I'm not at all certain of the precise Mathematical definition of such constructs. But I will say that even this does not match the Mathematical definition of a field. As I have never been able to find a definition of a field that I've been able to understand, could you take a moment to explain this?
Yes, I love field theory!

Definition: A "ring" $R$ has two binary operations $+,\cdot$, called addition and multiplication respectively, which satisfy the following:

1)If $a,b\in R$ then $a+b,a\cdot b\in R$.

2)There exists $0$ such that $a+0=0+a=a$ for all $a\in R$.

3)The associative laws: $(a+b)+c = a+(b+c) \mbox{ and }a(bc) = (ab)c$

4)The commutative law: $a+b=b+a$

5)Distributive laws: $a(b+c)=ab+ac \mbox{ and }(b+c)a=ba+ca$

6)For each $a$ there exists $-a$ such that $a+(-a)=(-a)+a=0$.

(Instead of 1,2,3,4,6 you can remember that $(R,+)$ is an abelian group.)

Example: The integers $\mathbb{Z}$ under addition and multiplication are a ring.

Example: The set $\{0\}$ is a ring. A very trivial ring.

Definition: A "unitary ring" is a ring which has an element $1$ such that $a1=1a=a$ for all $a$.

Example: The set $\{0\}$ is a unitary ring. In fact, $1=0$, i.e. the additive identity element is equal to the multiplicative identity element. This never happens except in this trivial case of a ring with only one element.

Definition: A "commutative ring" is a ring such that $ab=ba$.

Definition: A "unit" is an element $r$ in a unitary ring such that there exists $r^{-1}$ so that $rr^{-1}=r^{-1}r=1$

Remark: Note if $r=0$ then there is no $r^{-1}$ unless $1=0$, i.e. the trivial ring. (This is why division by zero is forbidden.)

Definition: A "division ring" is a unitary ring such that all non-zero elements are units.

Definition: A non-commutative division ring is called a "skew field". The classic example is $\mathbb{H}$, the quaternions.

Definition: A communist division ring called a "field".
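Not from the thread, but a concrete illustration: since a field is a commutative division ring, the integers modulo a prime form a field, and the "every nonzero element is a unit" condition can be brute-forced (this example, and the helper name, are my own):

```python
# Brute-force check that every nonzero element of Z/7Z is a unit.
# (Z/pZ is a field exactly when p is prime.)
p = 7

def has_inverse(a, m):
    """Is a a unit modulo m, i.e. does some b satisfy a*b = 1 (mod m)?"""
    return any((a * b) % m == 1 for b in range(1, m))

print(all(has_inverse(a, p) for a in range(1, p)))  # True

# With a composite modulus the check fails: 2 is not a unit mod 6.
print(has_inverse(2, 6))  # False
```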

Also, the grad and div operators have a fairly simple meaning in terms of what they are in reference to the function they operate on. I have never run across a clear meaning of what the curl operator does/means in reference to the function it operates on. Do you have any insights to share on this?
I do not understand.

12. Hmmm...

Well I already knew what a field was, but I had thought there was more to "field theory" than just fields, if that makes any sense. Perhaps what's been confusing me is that the example I saw (in a Mathematical Physics paper) was using theorems unfamiliar to me to determine the structure of the field. I'll have to dig it out and look at it again.

grad(f) gives information on how fast the function is changing in a given direction

div(f) gives information on the flux of the function through a surface.

curl(f) gives ??? I once heard someone make a comment about how the function "rotates" as we parametrically move along it, but this makes little sense to me.

-Dan

13. Originally Posted by topsquark
Hmmm...
Well I already knew what a field was, but I had thought there was more to "field theory" than just fields, if that makes any sense.
Yes. Field theory is nothing like group theory. Field theory, a lot of the time, is not a study of the fields themselves; it spends a lot of time studying polynomial rings (produced from fields).

Perhaps what's been confusing me is that the example I saw (in a Mathematical Physics paper) was using theorems unfamiliar to me to determine the structure of the field.
I would be happy to look at it. I hope I can understand it and explain it.

curl(f) gives ??? I once heard someone make a comment about how the function "rotates" as we parametrically move along it, but this makes little sense to me.
I think the curl (i.e. circulation) measures the rotation of a fluid at a point. And if we take the norm of this vector, $||\mbox{curl}(\bold{F})||$, then we get the magnitude of the tendency to rotate. That is all I know about fluid mechanics.

14. This lecture will be concerned with integration along certain paths, the basis of vector analysis. I will state right now that I will use my own terminology because I want to be more specific. Books do not distinguish between the terms "line integral", "path integral" and "contour integral", these terms are used synonymously. However, I will distinguish between them.

Line Integrals
Given a continuous multivariable function $z=f(x,y)$ and a smooth curve $C$ described parametrically in the $xy$-plane as $\bold{R}(t) = x(t)\bold{i}+y(t)\bold{j} \mbox{ for }a\leq t\leq b$, we define the "line integral of $f(x,y)$ along curve $C$ with respect to $x$" as $\int_C f(x,y) dx$ and its value is given by $\int_a^b f(x(t),y(t))x'(t)dt$. Similarly the same line integral "with respect to $y$" is $\int_C f(x,y)dy = \int_a^b f(x(t),y(t))y'(t)dt$. We can also have one in three dimensions for $f(x,y,z)$ and curve $\bold{R}(t) = x(t)\bold{i}+y(t)\bold{j}+z(t)\bold{k}$; then $\int_C f(x,y,z)dz = \int_a^b f(x(t),y(t),z(t))z'(t)dt$.

When we say "line integral" it will be assumed that we are integrating some continuous multivariable function along some smooth curve. With line integrals the usual properties of integration hold: the line integral of a sum is the sum of the line integrals, constant factors can be pulled outside the line integral, and so on.

Example 11: Given the function $f(x,y)=x^2+xy$ and the curve defined by $\bold{R}(t) = t\bold{i}+t^2\bold{j} \mbox{ for }0\leq t\leq 1$, i.e. we are integrating this function along a parabolic path. Then, the line integral with respect to $x$ is given by:
$\int_C f(x,y)dx = \int_0^1 f(t,t^2)\cdot (t)' dt = \int_0^1 t^2+t^3 dt$.
The line integral with respect to $y$ is given by:
$\int_C f(x,y)dy = \int_0^1 f(t,t^2)\cdot (t^2)' dt = \int_0^1 2t(t^2+t^3)dt$.
Whatever numbers these integrals evaluate to are the values of the line integrals.
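For completeness, the two integrals in Example 11 can be carried out; here is a short sympy sketch (sympy is assumed available; the values $7/12$ and $9/10$ are my own computation, not quoted in the thread):

```python
# Evaluate the line integrals of Example 11 along x = t, y = t^2.
from sympy import symbols, integrate

t = symbols('t')
f = t**2 + t*t**2   # f(x, y) = x^2 + x*y restricted to the curve

dx_integral = integrate(f * 1,   (t, 0, 1))  # x'(t) = 1
dy_integral = integrate(f * 2*t, (t, 0, 1))  # y'(t) = 2t

print(dx_integral, dy_integral)  # 7/12 9/10
```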

An "opposite orientation" is the curve taken in the opposite direction. For example, $\bold{R}(t) = \cos t\bold{i}+\sin t\bold{j} \mbox{ for }0\leq t\leq 2\pi$ traces the unit circle counterclockwise, while $\bold{R}(t) = \sin t \bold{i}+\cos t \bold{j} \mbox{ for }0\leq t\leq 2\pi$ traces the same circle clockwise, in the opposite direction. The result we need to know is that taking the line integral along the opposite orientation gives the same result as taking it normally, except for a minus sign.

Path Integrals
The term "path integral" will apply to integrating a vector field rather than a multivariable function. Let us begin with a two dimensional vector field $\bold{F}(x,y) = u(x,y)\bold{i}+v(x,y)\bold{j}$ and a curve $C$ which is described parametrically by the vector function $\bold{R}(t) = x(t)\bold{i}+y(t)\bold{j} \mbox{ for }a\leq t\leq b$.

Definition: The "path integral of $\bold{F}(x,y)$ along $C$" is: $\int_C \bold{F} \cdot d\bold{R} = \int_C u(x,y) dx + \int_C v(x,y) dy$.

Thus, a path integral is simply a sum of two line integrals, one with respect to $x$ and the other with respect to $y$. We can use the following mnemonic. Think of $d\bold{R} = dx\bold{i}+dy\bold{j}$. Then $\bold{F} \cdot d\bold{R}$ we can think of as a dot product: $(u(x,y)\bold{i}+v(x,y)\bold{j})\cdot (dx\bold{i}+dy\bold{j}) = u(x,y)dx+v(x,y)dy$. And when we integrate this, $\int_C u(x,y) dx + v(x,y) dy$, we do each one separately: $\int_C u(x,y) dx + \int_C v(x,y)dy$, which is what we had above.

Example 12: Given the vector field $\bold{F}(x,y) = x^2\bold{i}+y^2\bold{j}$, compute the path integral along the line segment joining $(0,0) \mbox{ to }(1,1)$. Here we need to be careful with orientation; as mentioned above, if we do it backwards we get a negative result. But the problem says from $(0,0)$ to $(1,1)$. It is simple to see that $\bold{R}(t) = t\bold{i}+t\bold{j} \mbox{ for }0\leq t\leq 1$ is a vector parametrization for this path. Then,
$\int_C (x^2\bold{i}+y^2\bold{j}) \cdot d\bold{R} = \int_C x^2 dx + \int_C y^2 dy = \int_0^1 (t^2)(t)'dt + \int_0^1 (t^2)(t)'dt$.

Example 13: In Example 12, if instead we wanted a parabolic segment from $(0,0)\mbox{ to }(1,1)$ then the parametrization would be $\bold{R}(t) = t\bold{i}+t^2\bold{j} \mbox{ for }0\leq t\leq 1$. Note this is one possible parametrization out of many different possible ones; all of them work (as long as the orientation is preserved). Thus, the answer would be,
$\int_C (x^2\bold{i}+y^2\bold{j})\cdot d\bold{R} =\int_C x^2 dx + \int_C y^2 dy = \int_0^1 (t^2)(t)'dt + \int_0^1 (t^2)^2 (t^2)' dt$.
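It may be worth noting (my own observation, not stated in the thread) that Examples 12 and 13 evaluate to the same number, $2/3$, because $\bold{F} = x^2\bold{i}+y^2\bold{j}$ is the gradient of $x^3/3+y^3/3$, so the path integral depends only on the endpoints. A quick sympy check (sympy assumed available):

```python
# Path integrals of F = x^2 i + y^2 j along two paths from (0,0) to (1,1).
from sympy import symbols, integrate

t = symbols('t')

# Example 12: the segment x = t, y = t
line = integrate(t**2, (t, 0, 1)) + integrate(t**2, (t, 0, 1))

# Example 13: the parabola x = t, y = t^2
parabola = integrate(t**2, (t, 0, 1)) + integrate((t**2)**2 * 2*t, (t, 0, 1))

print(line, parabola)  # 2/3 2/3
```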

The path and line integrals just discussed have some useful applications. One place where path integrals are used is for computing the total amount of work. They are also used much in mathematical physics, especially in fluid mechanics.

Example 14: We can view Example 13 as a work problem. An object is moving along the parabolic path, and at each point the vector field determines the force vector. The sum (think of this as the infinitesimal sum) of all those force vectors adds up to the total amount of work done on the object by the force field.

Contour Integrals
A path which is positively oriented, simple (the path does not cross itself), piecewise smooth, and closed shall be referred to as a "contour". If the reader has taken a course in Complex Analysis he might remember a similar definition with complex numbers. So basically a contour is a closed loop which does not cross itself. The computations are exactly the same as with path integrals; however, contour integrals have some nice properties that earn them a special name. These properties will be explored later in this tutorial.

Definition: Let $C$ denote a contour, then $\oint_C \bold{F}\cdot d\bold{R}$ is simply the path integral around $C$ for the vector field $\bold{F}$.

Example 15: Consider $C$ a unit circle and $\bold{F}(x,y) = \sin x \bold{i}+\cos y\bold{j}$, then $\bold{R}(t) = \cos t \bold{i}+\sin t\bold{j} \mbox{ for }0\leq t\leq 2\pi$.
Thus,
$\oint_C (\sin x\bold{i}+\cos y\bold{j} ) \cdot d\bold{R} = \oint_C \sin x dx + \oint_C \cos y dy$
$= \int_0^{2\pi} \sin (\cos t) (\cos t)' dt + \int_0^{2\pi} \cos (\sin t) (\sin t)' dt = 0$
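Numerically, both pieces of this contour integral indeed vanish over a full turn of the circle; a sketch assuming sympy is available (not part of the original example):

```python
# Numerical check that both pieces of Example 15's contour integral are 0.
from sympy import symbols, sin, cos, pi, Integral

t = symbols('t')

# x = cos t, y = sin t, so dx = -sin t dt and dy = cos t dt
part_x = Integral(sin(cos(t)) * (-sin(t)), (t, 0, 2*pi)).evalf()
part_y = Integral(cos(sin(t)) * cos(t),    (t, 0, 2*pi)).evalf()

print(abs(part_x) < 1e-6, abs(part_y) < 1e-6)  # True True
```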

Exercises
~~~
1)Find the path integral: $\int_C (-ydx + xdy)$ where $C$ is the parabolic path $y=2x^2$ from $(0,0) \mbox{ to }(1,2)$.

2)Find the path integral: $\oint_C \frac{xdy - ydx}{x^2+y^2}$ where $C$ is the unit circle (positively oriented).

3)Let $\bold{R}(t) = x(t)\bold{i}+y(t)\bold{j} \mbox{ for }a\leq t\leq b$ be a parametrization. Note that $\bold{R}_-(t) = x(a+b-t)\bold{i}+y(a+b-t)\bold{j} \mbox{ for }a\leq t\leq b$ is the same curve but in the opposite orientation. Show that reversing the orientation changes the sign of the line integral.

15. Originally Posted by ThePerfectHacker
Definition: A communist division ring called a "field".
Is the word "communist" up there a typo or a joke?
