Advanced Calculus of Several Variables (1973)

Part II. Multivariable Differential Calculus

Our study in Chapter I of the geometry and topology of $\mathbb{R}^n$ provides an adequate foundation for the study in this chapter of the differential calculus of mappings from one Euclidean space to another. We will find that the basic idea of multivariable differential calculus is the approximation of nonlinear mappings by linear ones.

This idea is implicit in the familiar single-variable differential calculus. If the function $f : \mathbb{R} \to \mathbb{R}$ is differentiable at $a$, then the tangent line at $(a, f(a))$ to the graph $y = f(x)$ in $\mathbb{R}^2$ is the straight line whose equation is

$$y - f(a) = f'(a)(x - a).$$

The right-hand side of this equation is a linear function of $x - a$; we may regard it as a linear approximation to the actual change $f(x) - f(a)$ in the value of $f$ between $a$ and $x$. To make this more precise, let us write $h = x - a$, $\Delta f_a(h) = f(a + h) - f(a)$, and $df_a(h) = f'(a)h$ (see Fig. 2.1). The linear mapping $df_a : \mathbb{R} \to \mathbb{R}$, defined by $df_a(h) = f'(a)h$, is called the differential of $f$ at $a$; it is simply that linear mapping $\mathbb{R} \to \mathbb{R}$ whose matrix is the derivative $f'(a)$ of $f$ at $a$ (the matrix of a linear mapping $\mathbb{R} \to \mathbb{R}$ being just a real number). With this terminology, we find that when $h$ is small, the linear change $df_a(h)$ is a good approximation to the actual change $\Delta f_a(h)$, in the sense that

$$\lim_{h \to 0} \frac{\Delta f_a(h) - df_a(h)}{h} = 0.$$

Figure 2.1

Roughly speaking, our point of view in this chapter will be that a mapping $f : \mathbb{R}^n \to \mathbb{R}^m$ is (by definition) differentiable at $a$ if and only if it has near $a$ an appropriate linear approximation $df_a : \mathbb{R}^n \to \mathbb{R}^m$. In this case $df_a$ will be called the differential of $f$ at $a$; its $m \times n$ matrix will be called the derivative of $f$ at $a$, thus preserving the above relationship between the differential (a linear mapping) and the derivative (its matrix). We will see that this approach is geometrically well motivated, and permits the basic ingredients of differential calculus (for example, the chain rule) to be developed and utilized in a multivariable setting.

Chapter 1. CURVES IN $\mathbb{R}^m$

We consider first the special case of a mapping $f : \mathbb{R} \to \mathbb{R}^m$. Motivated by curves in $\mathbb{R}^2$ and $\mathbb{R}^3$, one may think of $f$ as a curve in $\mathbb{R}^m$, traced out by a moving point whose position at time $t$ is the point $f(t) \in \mathbb{R}^m$, and attempt to define its velocity at time $t$. Just as in the single-variable case, $m = 1$, this problem leads to the definition of the derivative $f'$ of $f$. The change in position of the particle from time $a$ to time $a + h$ is described by the vector $f(a + h) - f(a)$, so the average velocity of the particle during this time interval is the familiar-looking difference quotient

$$\frac{f(a + h) - f(a)}{h},$$

whose limit (if it exists) as $h \to 0$ should (by definition) be the instantaneous velocity at time $a$. So we define

$$f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}$$

if this limit exists, in which case we say that $f$ is differentiable at $a \in \mathbb{R}$. The derivative vector $f'(a)$ of $f$ at $a$ may be visualized as a tangent vector to the image curve of $f$ at the point $f(a)$ (see Fig. 2.2); its length $|f'(a)|$ is the speed at time $t = a$ of the moving point $f(t)$, so $f'(a)$ is often called the velocity vector at time $a$.

Figure 2.2
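The definition of the velocity vector as a limit of difference quotients can be watched numerically. The following is a minimal Python sketch, not part of the original text: the sample curve `f` (a helix in three-space), the point `a`, and the step sizes are hypothetical choices made only for illustration.

```python
import math

def f(t):
    # Hypothetical example curve in R^3 (a helix); not taken from the text.
    return (math.cos(t), math.sin(t), t)

def difference_quotient(curve, a, h):
    # Average velocity (curve(a + h) - curve(a)) / h, computed coordinatewise.
    p, q = curve(a), curve(a + h)
    return tuple((qi - pi) / h for pi, qi in zip(p, q))

def norm(v):
    # Euclidean length of a vector.
    return math.sqrt(sum(x * x for x in v))

a = 1.0
exact_velocity = (-math.sin(a), math.cos(a), 1.0)  # f'(a), differentiated by hand

for h in (1e-1, 1e-3, 1e-5):
    approx = difference_quotient(f, a, h)
    error = norm(tuple(x - y for x, y in zip(approx, exact_velocity)))
    print(f"h = {h:g}: error of average velocity = {error:.2e}")

# Speed at time a is the length of the velocity vector (sqrt(2) for this helix).
print("speed |f'(a)| =", norm(exact_velocity))
```

The printed errors shrink with `h`, which is what the existence of the limit asserts for this particular curve.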

If the derivative mapping $f' : \mathbb{R} \to \mathbb{R}^m$ is itself differentiable at $a$, its derivative at $a$ is the second derivative $f''(a)$ of $f$ at $a$. Still thinking of $f$ in terms of the motion of a moving point (or particle) in $\mathbb{R}^m$, $f''(a)$ is often called the acceleration vector at time $a$. Exercises 1.3 and 1.4 illustrate the usefulness of the concepts of velocity and acceleration for points moving in higher-dimensional Euclidean spaces.

By Theorem 7.1 of Chapter I (limits in $\mathbb{R}^m$ may be taken coordinatewise), we see that $f : \mathbb{R} \to \mathbb{R}^m$ is differentiable at $a$ if and only if each of its coordinate functions $f_1, \dots, f_m$ is differentiable at $a$, in which case

$$f'(a) = (f_1'(a), \dots, f_m'(a)).$$

That is, the differentiable function $f : \mathbb{R} \to \mathbb{R}^m$ may be differentiated coordinatewise. Applying coordinatewise the familiar facts about derivatives of real-valued functions, we therefore obtain the results listed in the following theorem.

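Theorem 1.1 Let $f$ and $g$ be mappings from $\mathbb{R}$ to $\mathbb{R}^m$, and $\varphi : \mathbb{R} \to \mathbb{R}$, all differentiable. Then

$$(f + g)' = f' + g', \tag{2}$$

$$(\varphi f)' = \varphi' f + \varphi f', \tag{3}$$

$$(f \cdot g)' = f' \cdot g + f \cdot g', \tag{4}$$

and

$$(f \circ \varphi)'(t) = \varphi'(t)\, f'(\varphi(t)). \tag{5}$$

Formula (5) is the chain rule for the composition $f \circ \varphi : \mathbb{R} \to \mathbb{R}^m$.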

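Notice the familiar pattern for the differentiation of a product in formulas (3) and (4). The proofs of these formulas are all the same: simply apply componentwise the corresponding formula for real-valued functions. For example, to prove (5), we write

$$(f \circ \varphi)'(t) = \big((f_1 \circ \varphi)'(t), \dots, (f_m \circ \varphi)'(t)\big) = \big(\varphi'(t) f_1'(\varphi(t)), \dots, \varphi'(t) f_m'(\varphi(t))\big) = \varphi'(t)\, f'(\varphi(t)),$$

applying componentwise the single-variable chain rule, which asserts that

$$(f \circ g)'(t) = f'(g(t))\, g'(t)$$

if the functions $f, g : \mathbb{R} \to \mathbb{R}$ are differentiable at $g(t)$ and $t$ respectively.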
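The formulas of Theorem 1.1 can be checked numerically. The sketch below reads (4) as the product rule for the dot product of two curves and (5) as the chain rule for a curve composed with a scalar function, and compares both sides using central difference quotients. The sample curves `f` and `g`, the scalar function `phi`, and the point `t` are hypothetical choices made only for illustration.

```python
import math

def derivative(curve, t, h=1e-6):
    # Central difference quotient, applied coordinatewise.
    p, q = curve(t - h), curve(t + h)
    return tuple((qi - pi) / (2 * h) for pi, qi in zip(p, q))

def dot(u, v):
    # Usual inner product of two vectors.
    return sum(x * y for x, y in zip(u, v))

# Hypothetical sample data: two curves in R^2 and a scalar function phi.
f = lambda t: (math.cos(t), t * t)
g = lambda t: (math.exp(t), math.sin(t))
phi = lambda t: t ** 3
phi_prime = lambda t: 3 * t ** 2  # derivative of phi, computed by hand

t = 0.7

# Formula (4): (f . g)' = f' . g + f . g'
lhs4 = derivative(lambda s: (dot(f(s), g(s)),), t)[0]
rhs4 = dot(derivative(f, t), g(t)) + dot(f(t), derivative(g, t))

# Formula (5): (f o phi)'(t) = phi'(t) f'(phi(t))
lhs5 = derivative(lambda s: f(phi(s)), t)
rhs5 = tuple(phi_prime(t) * c for c in derivative(f, phi(t)))

print("product rule:", lhs4, rhs4)
print("chain rule:  ", lhs5, rhs5)
```

Agreement up to the accuracy of the difference quotients is of course only a sanity check, not a proof; the proof is the componentwise argument given above.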

We see below (Exercise 1.12) that the mean value theorem does not hold for vector-valued functions. However, it is true that two vector-valued functions differ only by a constant (vector) if they have the same derivative; we see this by componentwise application of this fact for real-valued functions.

The tangent line at $f(a)$ to the image curve of the differentiable mapping $f : \mathbb{R} \to \mathbb{R}^m$ is, by definition, that straight line which passes through $f(a)$ and is parallel to the tangent vector $f'(a)$. We now inquire as to how well this tangent line approximates the curve close to $f(a)$. That is, how closely does the mapping $h \mapsto f(a) + h f'(a)$ of $\mathbb{R}$ into $\mathbb{R}^m$ (whose image is the tangent line) approximate the mapping $h \mapsto f(a + h)$? Let us write

$$\Delta f_a(h) = f(a + h) - f(a)$$

for the actual change in $f$ from $a$ to $a + h$, and

$$df_a(h) = h f'(a)$$

for the linear (as a function of $h$) change along the tangent line. Then Fig. 2.3 makes it clear that we are simply asking how small the difference vector $\Delta f_a(h) - df_a(h)$ is when $h$ is small. The answer is that it goes to zero even faster than $h$ does. That is,

$$\lim_{h \to 0} \frac{\Delta f_a(h) - df_a(h)}{h} = \lim_{h \to 0} \left( \frac{f(a + h) - f(a)}{h} - f'(a) \right) = 0$$

by the definition of $f'(a)$. Noting that $df_a : \mathbb{R} \to \mathbb{R}^m$ is a linear mapping, we have proved the “only if” part of the following theorem.

Figure 2.3

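Theorem 1.2 The mapping $f : \mathbb{R} \to \mathbb{R}^m$ is differentiable at $a \in \mathbb{R}$ if and only if there exists a linear mapping $L : \mathbb{R} \to \mathbb{R}^m$ such that

$$\lim_{h \to 0} \frac{f(a + h) - f(a) - L(h)}{h} = 0, \tag{6}$$

in which case $L$ is defined by $L(h) = df_a(h) = h f'(a)$.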

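To prove the “if” part, suppose that there exists a linear mapping $L$ satisfying (6). Then there exists $b \in \mathbb{R}^m$ such that $L$ is defined by $L(h) = hb$; we must show that $f'(a)$ exists and equals $b$. But

$$\lim_{h \to 0} \left( \frac{f(a + h) - f(a)}{h} - b \right) = \lim_{h \to 0} \frac{f(a + h) - f(a) - L(h)}{h} = 0$$

by (6), so the difference quotient converges to $b$; that is, $f'(a)$ exists and equals $b$. ∎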
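To see condition (6) at work, one can tabulate the ratio of the length of the error vector (actual change minus linear change) to the size of the increment, for shrinking increments. The following is a minimal Python sketch under stated assumptions: the curve `f`, its hand-computed derivative `f_prime`, and the point `a` are hypothetical examples, not taken from the text.

```python
import math

def f(t):
    # Hypothetical curve in R^2 used only for illustration.
    return (t * t, math.sin(t))

def f_prime(t):
    # Derivative of f, computed coordinatewise by hand.
    return (2 * t, math.cos(t))

def remainder_ratio(a, h):
    # |(f(a + h) - f(a)) - h f'(a)| / |h|: error of the linear change,
    # measured relative to the size of the increment h.
    delta = tuple(y - x for x, y in zip(f(a), f(a + h)))
    linear = tuple(h * c for c in f_prime(a))
    diff = tuple(d - l for d, l in zip(delta, linear))
    return math.sqrt(sum(x * x for x in diff)) / abs(h)

a = 0.5
steps = (1e-1, 1e-2, 1e-3, 1e-4)
ratios = [remainder_ratio(a, h) for h in steps]
for h, r in zip(steps, ratios):
    print(f"h = {h:g}: ratio = {r:.3e}")
```

For this curve the ratio shrinks roughly in proportion to the increment, which is the sense in which the error goes to zero faster than the increment does.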

If $f : \mathbb{R} \to \mathbb{R}^m$ is differentiable at $a$, then the linear mapping $df_a : \mathbb{R} \to \mathbb{R}^m$, defined by $df_a(h) = h f'(a)$, is called the differential of $f$ at $a$. Notice that the derivative vector $f'(a)$ is, as a column vector, the matrix of the linear mapping $df_a$, since

$$df_a(h) = h f'(a) = \begin{pmatrix} f_1'(a) \\ \vdots \\ f_m'(a) \end{pmatrix} h.$$

When in the next section we define derivatives and differentials of mappings from $\mathbb{R}^n$ to $\mathbb{R}^m$, this relationship between the two will be preserved: the differential will be a linear mapping whose matrix is the derivative.

The following discussion provides some motivation for the notation $df_a$ for the differential of $f$ at $a$. Let us consider the identity function $\mathbb{R} \to \mathbb{R}$, and write $x$ for its name as well as its value at $x$. Since its derivative is 1 everywhere, its differential at $a$ is defined by

$$dx_a(h) = 1 \cdot h = h.$$

If $f$ is real-valued, and we substitute $h = dx_a(h)$ into the definition of $df_a : \mathbb{R} \to \mathbb{R}$, we obtain

$$df_a(h) = f'(a)\, h = f'(a)\, dx_a(h),$$

so the two linear mappings $df_a$ and $f'(a)\, dx_a$ are equal,

$$df_a = f'(a)\, dx_a.$$

If we now use the Leibniz notation $f'(a) = df/dx$ and drop the subscript $a$, we obtain the famous formula

$$df = \frac{df}{dx}\, dx,$$

which now not only makes sense, but is true! It is an actual equality of linear mappings of the real line into itself.

Now let $f$ and $g$ be two differentiable functions from $\mathbb{R}$ to $\mathbb{R}$, and write $h = g \circ f$ for the composition. Then the chain rule gives

$$dh_a(t) = h'(a)\, t = g'(f(a))\, [f'(a)\, t] = dg_{f(a)}(df_a(t)),$$

so we see that the single-variable chain rule takes the form

$$dh_a = dg_{f(a)} \circ df_a.$$

In brief, the differential of the composition $h = g \circ f$ is the composition of the differentials of $g$ and $f$. It is this elegant formulation of the chain rule that we will generalize in Section 3 to the multivariable case.
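The statement that the differential of a composition is the composition of the differentials can be modeled directly by treating each differential as a function. The sketch below is illustrative only: the helper `differential` and the sample functions (squaring followed by sine) are hypothetical choices, not taken from the text.

```python
import math

def differential(derivative_value):
    # The differential at a point is the linear map t -> (derivative) * t.
    return lambda t: derivative_value * t

# Hypothetical sample functions f(x) = x^2 and g(y) = sin(y), with h = g o f.
f = lambda x: x * x
f_prime = lambda x: 2 * x
g = math.sin
g_prime = math.cos

a = 1.3
df_a = differential(f_prime(a))       # differential of f at a
dg_fa = differential(g_prime(f(a)))   # differential of g at f(a)

# h'(a) from the single-variable chain rule, and the differential of h at a.
h_prime_a = g_prime(f(a)) * f_prime(a)
dh_a = differential(h_prime_a)

for t in (-2.0, 0.5, 3.0):
    composed = dg_fa(df_a(t))  # apply df_a first, then dg at f(a)
    print(t, dh_a(t), composed)
```

Representing differentials as closures keeps the code close to the mathematics: composing the two linear maps is literally function composition, and the chain rule says the result agrees with the differential of the composite.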

Exercises

1.1 Let $f : \mathbb{R} \to \mathbb{R}^n$ be a differentiable mapping with $f'(t) \neq 0$ for all $t \in \mathbb{R}$. Let $p$ be a fixed point not on the image curve of $f$ as in Fig. 2.4. If $q = f(t_0)$ is the point of the curve closest to $p$, that is, if $|p - q| \le |p - f(t)|$ for all $t \in \mathbb{R}$, show that the vector $p - q$ is orthogonal to the curve at $q$. Hint: Differentiate the function $\varphi(t) = |p - f(t)|^2$.

Figure 2.4

1.2 (a) Let $f : \mathbb{R} \to \mathbb{R}^n$ and $g : \mathbb{R} \to \mathbb{R}^n$ be two differentiable curves, with $f'(s) \neq 0$ and $g'(t) \neq 0$ for all $s, t \in \mathbb{R}$. Suppose the two points $p = f(s_0)$ and $q = g(t_0)$ are closer than any other pair of points on the two curves. Then prove that the vector $p - q$ is orthogonal to both velocity vectors $f'(s_0)$ and $g'(t_0)$. Hint: The point $(s_0, t_0)$ must be a critical point for the function $\rho : \mathbb{R}^2 \to \mathbb{R}$ defined by $\rho(s, t) = |f(s) - g(t)|^2$.
(b) Apply the result of (a) to find the closest pair of points on the “skew” straight lines in $\mathbb{R}^3$ defined by $f(s) = (s, 2s, -s)$ and $g(t) = (t + 1, t - 2, 2t + 3)$.

1.3 Let $F : \mathbb{R}^n \to \mathbb{R}^n$ be a conservative force field on $\mathbb{R}^n$, meaning that there exists a continuously differentiable potential function $V : \mathbb{R}^n \to \mathbb{R}$ such that $F(x) = -\nabla V(x)$ for all $x \in \mathbb{R}^n$ [recall that $\nabla V = (\partial V/\partial x_1, \dots, \partial V/\partial x_n)$]. Call the curve $\varphi : \mathbb{R} \to \mathbb{R}^n$ a “quasi-Newtonian particle” if and only if there exist constants $m_1, m_2, \dots, m_n$, called its “mass components,” such that

$$m_i \varphi_i''(t) = F_i(\varphi(t))$$

for each $i = 1, \dots, n$. Thus, with respect to the $x_i$-direction, it behaves as though it has mass $m_i$. Define its kinetic energy $K(t)$ and potential energy $P(t)$ at time $t$ by

$$K(t) = \frac{1}{2} \sum_{i=1}^{n} m_i [\varphi_i'(t)]^2, \qquad P(t) = V(\varphi(t)).$$

Now prove that the law of the conservation of energy holds for quasi-Newtonian particles, that is, $K + P = \text{constant}$. Hint: Differentiate $K(t) + P(t)$, using the chain rule in the form $P'(t) = \nabla V(\varphi(t)) \cdot \varphi'(t)$, which will be verified in Section 3.

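1.4 (n-body problem) Deduce from Exercise 1.3 the law of the conservation of energy for a system of $n$ particles moving in $\mathbb{R}^3$ (without colliding) under the influence of their mutual gravitational attractions. You may take $n = 2$ for brevity, although the method is general. Hint: Denote by $m_1$ and $m_2$ the masses of the two particles, and by $r_1 = (x_1, x_2, x_3)$ and $r_2 = (x_4, x_5, x_6)$ their positions at time $t$. Let $r_{12} = |r_1 - r_2|$ be the distance between them. We then have a quasi-Newtonian particle in $\mathbb{R}^6$ with mass components $m_1, m_1, m_1, m_2, m_2, m_2$ and force field $F$ defined by

$$F(x) = \left( \frac{G m_1 m_2 (r_2 - r_1)}{r_{12}^3},\; \frac{G m_1 m_2 (r_1 - r_2)}{r_{12}^3} \right)$$

for $x = (x_1, \dots, x_6) \in \mathbb{R}^6$, where $G$ is the gravitational constant. If

$$V(x) = -\frac{G m_1 m_2}{r_{12}},$$

verify that $F = -\nabla V$. Then apply Exercise 1.3 to conclude that

$$\frac{1}{2} m_1 |r_1'(t)|^2 + \frac{1}{2} m_2 |r_2'(t)|^2 - \frac{G m_1 m_2}{r_{12}} = \text{constant}.$$

Remark: In the general case of a system of $n$ particles, the potential function would be

$$V = -\sum_{1 \le i < j \le n} \frac{G m_i m_j}{r_{ij}},$$

where $r_{ij} = |r_i - r_j|$.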

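1.5 If $f : \mathbb{R} \to \mathbb{R}^n$ is linear, prove that $f'(a)$ exists for all $a \in \mathbb{R}$, with $df_a = f$.

1.6 If $L_1$ and $L_2$ are two linear mappings from $\mathbb{R}$ to $\mathbb{R}^n$ satisfying formula (6), prove that $L_1 = L_2$. Hint: Show first that

$$\lim_{h \to 0} \frac{L_1(h) - L_2(h)}{h} = 0.$$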

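1.7 Let $f, g : \mathbb{R} \to \mathbb{R}$ both be differentiable at $a$.

(a) Show that $d(fg)_a = g(a)\, df_a + f(a)\, dg_a$.

(b) Show that

$$d\left(\frac{f}{g}\right)_a = \frac{g(a)\, df_a - f(a)\, dg_a}{[g(a)]^2} \quad \text{if } g(a) \neq 0.$$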

1.8 Let $\gamma(t)$ be the position vector of a particle moving with constant acceleration vector $\gamma''(t) = a$. Then show that $\gamma(t) = \frac{1}{2} a t^2 + v_0 t + p_0$, where $p_0 = \gamma(0)$ and $v_0 = \gamma'(0)$. If $a = 0$, conclude that the particle moves along a straight line through $p_0$ with velocity vector $v_0$ (the law of inertia).

1.9 Let $\gamma : \mathbb{R} \to \mathbb{R}^n$ be a differentiable curve. Show that $|\gamma(t)|$ is constant if and only if $\gamma(t)$ and $\gamma'(t)$ are orthogonal for all $t$.

1.10 Suppose that a particle moves around a circle in the plane $\mathbb{R}^2$, of radius $r$ centered at $0$, with constant speed $v$. Deduce from the previous exercise that $\gamma(t)$ and $\gamma''(t)$ are both orthogonal to $\gamma'(t)$, so it follows that $\gamma''(t) = k(t)\gamma(t)$. Substitute this result into the equation obtained by differentiating $\gamma(t) \cdot \gamma'(t) = 0$ to obtain $k = -v^2/r^2$. Thus the acceleration vector always points towards the origin and has constant length $v^2/r$.

1.11 Given a particle in $\mathbb{R}^3$ with mass $m$ and position vector $\gamma(t)$, its angular momentum vector is $L(t) = \gamma(t) \times m\gamma'(t)$, and its torque is $T(t) = \gamma(t) \times m\gamma''(t)$.

(a) Show that $L'(t) = T(t)$, so the angular momentum is constant if the torque is zero (this is the law of the conservation of angular momentum).

(b) If the particle is moving in a central force field, that is, $\gamma(t)$ and $\gamma''(t)$ are always collinear, conclude from (a) that it remains in some fixed plane through the origin.

1.12 Consider a particle which moves on a circular helix in $\mathbb{R}^3$ with position vector

$$\gamma(t) = (a \cos \omega t,\; a \sin \omega t,\; b \omega t).$$

(a) Show that the speed of the particle is constant.

(b) Show that its velocity vector makes a constant nonzero angle with the z-axis.

(c) If $t_1 = 0$ and $t_2 = 2\pi/\omega$, notice that $\gamma(t_1) = (a, 0, 0)$ and $\gamma(t_2) = (a, 0, 2\pi b)$, so the vector $\gamma(t_2) - \gamma(t_1)$ is vertical. Conclude that the equation

$$\gamma(t_2) - \gamma(t_1) = (t_2 - t_1)\, \gamma'(\tau)$$

cannot hold for any $\tau \in (t_1, t_2)$. Thus the mean value theorem does not hold for vector-valued functions.
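The failure of the mean value theorem on the helix can also be observed numerically. The sketch below assumes the helix has the form $\gamma(t) = (a \cos \omega t, a \sin \omega t, b \omega t)$, which is consistent with the endpoint values quoted in part (c); the numerical values of `a`, `b`, and `w` (standing for $\omega$) are hypothetical. It is meant as an illustration, not as a substitute for the requested argument.

```python
import math

# Assumed helix parameters (hypothetical values) for the assumed form
# gamma(t) = (a cos(wt), a sin(wt), b w t) of the circular helix.
a, b, w = 2.0, 0.5, 3.0

def gamma(t):
    return (a * math.cos(w * t), a * math.sin(w * t), b * w * t)

def gamma_prime(t):
    # Velocity vector, differentiated coordinatewise by hand.
    return (-a * w * math.sin(w * t), a * w * math.cos(w * t), b * w)

t1, t2 = 0.0, 2 * math.pi / w
chord = tuple(q - p for p, q in zip(gamma(t1), gamma(t2)))
print("chord:", chord)  # vertical up to rounding: approximately (0, 0, 2*pi*b)

# Length of the horizontal part of (t2 - t1) * gamma'(tau) at sample points
# tau strictly between t1 and t2.
samples = [t1 + (t2 - t1) * k / 1000 for k in range(1, 1000)]
horizontal = [(t2 - t1) * math.hypot(*gamma_prime(tau)[:2]) for tau in samples]
print("smallest horizontal length:", min(horizontal))
```

At every sampled time the horizontal part of the scaled velocity has the same positive length, while the chord has no horizontal part, so the two vectors cannot coincide.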