
Advanced Calculus of Several Variables (1973)

Part III. Successive Approximations and Implicit Functions

Chapter 2. THE MULTIVARIABLE MEAN VALUE THEOREM

The mean value theorem for real-valued functions states that, if the open set U ⊂ ℝⁿ contains the line segment L joining the points a and b, and f : U → ℝ is differentiable, then

f(b) − f(a) = ∇f(c) · (b − a)    (1)

for some point c ∈ L (Theorem II.3.4). We have seen (Exercise II.1.12) that this important result does not generalize to vector-valued functions. However, in many applications of the mean value theorem, all that is actually needed is the numerical estimate

|f(b) − f(a)| ≤ |b − a| max_{x∈L} |∇f(x)|,    (2)

which follows immediately from (1) and the Cauchy–Schwarz inequality (if f is 𝒞¹, so the maximum on the right exists). Fortunately inequality (2) does generalize to the case of 𝒞¹ mappings from ℝⁿ to ℝᵐ, and we will see that this result, the multivariable mean value theorem, plays a key role in the generalization to higher dimensions of the results of Section 1.

Recall from Section I.3 that a norm on the vector space V is a real-valued function x → ‖x‖ such that ‖x‖ > 0 if x ≠ 0, ‖ax‖ = |a| · ‖x‖, and ‖x + y‖ ≤ ‖x‖ + ‖y‖ for all x, y ∈ V and a ∈ ℝ. Given a norm on V, by the ball of radius r with respect to this norm, centered at a ∈ V, is meant the set {x ∈ V : ‖x − a‖ ≤ r}.

Thus far we have used mainly the Euclidean norm

|x| = (x₁² + x₂² + ⋯ + xₙ²)^{1/2}

on ℝⁿ. In this section we will find it more convenient to use the “sup norm”

‖x‖₀ = max{|x₁|, |x₂|, . . . , |xₙ|},

which was introduced in Example 3 of Section I.3. The “unit ball” with respect to the sup norm is the cube C₁ⁿ = {x ∈ ℝⁿ : ‖x‖₀ ≤ 1}, which is symmetric with respect to the coordinate planes in ℝⁿ, and has the point (1, 1, . . . , 1) as one of its vertices. The cube Cᵣⁿ = {x ∈ ℝⁿ : ‖x‖₀ ≤ r} will be referred to as the “cube of radius r” centered at 0. We will delete the dimensional superscript when it is not needed for clarity.
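The two norms are easy to compare numerically. Here is a minimal sketch in Python (the helper names `sup_norm` and `euclidean_norm` are ours, not the text's):

```python
import math

def sup_norm(x):
    # the "sup norm": the largest absolute value among the coordinates of x
    return max(abs(c) for c in x)

def euclidean_norm(x):
    # the Euclidean norm (x1^2 + ... + xn^2)^(1/2)
    return math.sqrt(sum(c * c for c in x))

x = (1.0, -2.0, 3.0)
print(sup_norm(x))        # 3.0
print(euclidean_norm(x))  # sqrt(14), about 3.742
```

Note that `sup_norm(x) <= euclidean_norm(x) <= sqrt(n) * sup_norm(x)` for every x ∈ ℝⁿ, a quantitative form of the norm equivalence discussed next.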

We will see in Section VI.1 that any two norms on ℝⁿ are equivalent, in the sense that every ball with respect to one of the norms contains a ball with respect to the other, centered at the same point. Of course this is “obvious” for the Euclidean norm and the sup norm (Fig. 3.5). Consequently it makes no difference which norm we use in the definitions of limits, continuity, etc. (Why?)

Figure 3.5

We will also need the concept of the norm of a linear mapping L : ℝⁿ → ℝᵐ. The norm ‖L‖ of L is defined by

‖L‖ = max{‖L(x)‖₀ : ‖x‖₀ = 1}.

We will show presently that ‖·‖ is indeed a norm on the vector space ℒᵐⁿ of all linear mappings from ℝⁿ to ℝᵐ; the only property of a norm that is not obvious is the triangle inequality.

We have seen (in Section I.7) that every linear mapping is continuous. This fact, together with the fact that the function x → ‖x‖₀ is clearly continuous on ℝᵐ, implies that the composite x → ‖L(x)‖₀ is continuous on the compact set ∂C₁ⁿ, so the maximum value ‖L‖ exists. Note that, if x ≠ 0, then x/‖x‖₀ ∈ ∂C₁ⁿ, so

‖L(x)‖₀ = ‖x‖₀ ‖L(x/‖x‖₀)‖₀ ≤ ‖L‖ ‖x‖₀.

This is half of the following result, which provides an important interpretation of ‖L‖.

Proposition 2.1 If L : ℝⁿ → ℝᵐ is a linear mapping, then ‖L‖ is the least number M such that ‖L(x)‖₀ ≤ M ‖x‖₀ for all x ∈ ℝⁿ.

PROOF It remains only to be shown that, if ‖L(x)‖₀ ≤ M ‖x‖₀ for all x ∈ ℝⁿ, then ‖L‖ ≤ M. But this follows immediately from the fact that, for ‖x‖₀ = 1, the inequality reduces to ‖L(x)‖₀ ≤ M, while ‖L‖ = max ‖L(x)‖₀ for x ∈ ∂C₁.

∎

In our proof of the mean value theorem we will need the elementary fact that the norm of a component function of the linear mapping L is no greater than the norm ‖L‖ of L itself.

Lemma 2.2 If L = (L₁, . . . , Lₘ) : ℝⁿ → ℝᵐ is linear, then ‖Lᵢ‖ ≤ ‖L‖ for each i = 1, . . . , m.

PROOF Let x₀ be the point of ∂C₁ at which |Lᵢ(x)| is maximal. Then

‖Lᵢ‖ = |Lᵢ(x₀)| ≤ ‖L(x₀)‖₀ ≤ ‖L‖.  ∎

Next we give a formula for actually computing the norm of a given linear mapping. For this we need a particular concept of the “norm” of a matrix. If A = (aᵢⱼ) is an m × n matrix, we define its norm ‖A‖ by

‖A‖ = max_{1≤i≤m} ∑_{j=1}^{n} |a_{ij}|.    (3)

Note that, in terms of the “1-norm” defined on ℝⁿ by

‖x‖₁ = |x₁| + |x₂| + ⋯ + |xₙ|,

‖A‖ is simply the maximum of the 1-norms of the row vectors A₁, . . . , Aₘ of A,

‖A‖ = max{‖A₁‖₁, . . . , ‖Aₘ‖₁}.

To see that this is actually a norm on the vector space ℳᵐⁿ of all m × n matrices, let us identify ℳᵐⁿ with ℝᵐⁿ in the natural way:

ℳᵐⁿ ≅ ℝⁿ × ⋯ × ℝⁿ (m factors) = ℝᵐⁿ.

In other words, if x₁, . . . , xₘ are the row vectors of the m × n matrix X = (xᵢⱼ), we identify X with the point

(x₁, . . . , xₘ) = (x₁₁, . . . , x₁ₙ, x₂₁, . . . , x₂ₙ, . . . , xₘ₁, . . . , xₘₙ) ∈ ℝᵐⁿ.

With this notation, what we want to show is that

‖(x₁, . . . , xₘ)‖ = max{‖x₁‖₁, . . . , ‖xₘ‖₁}

defines a norm on ℝᵐⁿ. But this follows easily from the fact that ‖·‖₁ is a norm on ℝⁿ (Exercise 2.2). In particular, ‖·‖ satisfies the triangle inequality. A ball with respect to the 1-norm is pictured in Fig. 3.6 (for the case n = 2); a ball with respect to the above norm ‖·‖ on ℝᵐⁿ is the Cartesian product of m such balls, one in each ℝⁿ factor.

We will now show that the norm of a linear mapping is equal to the norm of its matrix. For example, if L : ℝ³ → ℝ³ is defined by L(x, y, z) = (x − 3z, 2x − y − 2z, x + y), then ‖L‖ = max{4, 5, 2} = 5.

Figure 3.6
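The formula of the forthcoming Theorem 2.3 makes ‖L‖ trivial to compute from the matrix of L. A minimal sketch (the helper `matrix_norm` is ours, with the matrix given as a list of rows):

```python
def matrix_norm(A):
    # ||A||: the maximum of the 1-norms of the rows of A
    return max(sum(abs(a) for a in row) for row in A)

# the matrix of L(x, y, z) = (x - 3z, 2x - y - 2z, x + y) from the text
A = [[1, 0, -3],
     [2, -1, -2],
     [1, 1, 0]]
print(matrix_norm(A))  # 5, agreeing with max{4, 5, 2} = 5
```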

Theorem 2.3 Let A = (aᵢⱼ) be the matrix of the linear mapping L : ℝⁿ → ℝᵐ, that is, L(x) = Ax for all x ∈ ℝⁿ. Then

‖L‖ = ‖A‖ = max_{1≤i≤m} ∑_{j=1}^{n} |a_{ij}|.

PROOF Given x ∈ ℝⁿ, the coordinates (y₁, . . . , yₘ) of y = L(x) are defined by

y_i = ∑_{j=1}^{n} a_{ij} x_j,  i = 1, . . . , m.

Let |y_k| be the largest of the absolute values of these coordinates of y. Then

‖L(x)‖₀ = |y_k| = |∑_{j=1}^{n} a_{kj} x_j| ≤ ∑_{j=1}^{n} |a_{kj}| |x_j| ≤ (∑_{j=1}^{n} |a_{kj}|) ‖x‖₀ ≤ ‖A‖ ‖x‖₀.

Thus ‖L(x)‖₀ ≤ ‖A‖ ‖x‖₀ for all x ∈ ℝⁿ, so it follows from Proposition 2.1 that ‖L‖ ≤ ‖A‖.

To prove that ‖L‖ ≥ ‖A‖, it suffices to exhibit a point x ∈ ∂C₁ for which ‖L(x)‖₀ = ‖A‖. Suppose that the kth row vector A_k = (a_{k1} ⋯ a_{kn}) is the one whose 1-norm is greatest, so

‖A‖ = ‖A_k‖₁ = ∑_{j=1}^{n} |a_{kj}|.

For each j = 1, . . . , n, define ε_j = ±1 by a_{kj} = ε_j |a_{kj}|. If x = (ε₁, ε₂, . . . , εₙ), then ‖x‖₀ = 1, and

|L_k(x)| = |∑_{j=1}^{n} a_{kj} ε_j| = ∑_{j=1}^{n} |a_{kj}| = ‖A‖,

so ‖L(x)‖₀ = ‖A‖ as desired.

∎

Let Φ : ℒᵐⁿ → ℳᵐⁿ be the natural isomorphism from the vector space of all linear mappings ℝⁿ → ℝᵐ to the vector space of all m × n matrices, Φ(L) being the matrix of L. Then Theorem 2.3 says simply that the isomorphism Φ is “norm-preserving.” Since we have seen that ‖·‖ on ℳᵐⁿ satisfies the triangle inequality, it follows easily that the same is true of ‖·‖ on ℒᵐⁿ. Thus ‖·‖ is indeed a norm on ℒᵐⁿ.

Henceforth we will identify both the linear mapping space ℒᵐⁿ and the matrix space ℳᵐⁿ with Euclidean space ℝᵐⁿ, by identifying each linear mapping with its matrix, and each m × n matrix with a point of ℝᵐⁿ (as above). In other words, we can regard either symbol ℒᵐⁿ or ℳᵐⁿ as denoting ℝᵐⁿ with the norm

‖x‖ = max_{1≤i≤m} ∑_{j=1}^{n} |x_{ij}|,

where x = (x₁₁, . . . , x₁ₙ, . . . , xₘ₁, . . . , xₘₙ).
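Both halves of the proof of Theorem 2.3 can be checked numerically: ‖L(x)‖₀ never exceeds ‖A‖ ‖x‖₀, and the sign vector of the largest row attains equality. A sketch (helper names are ours), using the 3 × 3 example above:

```python
def matrix_norm(A):
    # ||A||: max row 1-norm (Theorem 2.3)
    return max(sum(abs(a) for a in row) for row in A)

def apply(A, x):
    # y = Ax, computed row by row
    return [sum(a * c for a, c in zip(row, x)) for row in A]

def sup_norm(v):
    return max(abs(c) for c in v)

A = [[1, 0, -3], [2, -1, -2], [1, 1, 0]]
# the extremal point from the proof: the signs of the entries of the
# row with the greatest 1-norm (here the second row, with entries 2, -1, -2)
x = [1, -1, -1]
print(sup_norm(x))                            # 1, so x lies on the unit cube's boundary
print(sup_norm(apply(A, x)), matrix_norm(A))  # 5 5: equality is attained
```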

If f : ℝⁿ → ℝᵐ is a differentiable mapping, then dfₓ ∈ ℒᵐⁿ and f′(x) ∈ ℳᵐⁿ, so we may regard f′ as a mapping from ℝⁿ to ℳᵐⁿ,

f′ : x → f′(x) ∈ ℳᵐⁿ,

and similarly df as a mapping from ℝⁿ to ℒᵐⁿ. Recall that f : ℝⁿ → ℝᵐ is 𝒞¹ at a ∈ ℝⁿ if and only if the first partial derivatives of the component functions of f all exist in a neighborhood of a and are continuous at a. The following result is an immediate consequence of this definition.

Proposition 2.4 The differentiable mapping f : ℝⁿ → ℝᵐ is 𝒞¹ at a ∈ ℝⁿ if and only if f′ : ℝⁿ → ℳᵐⁿ is continuous at a.

We are finally ready for the mean value theorem.

Theorem 2.5 Let f : U → ℝᵐ be a 𝒞¹ mapping, where U ⊂ ℝⁿ is a neighborhood of the line segment L with endpoints a and b. Then

‖f(b) − f(a)‖₀ ≤ ‖b − a‖₀ max_{x∈L} ‖dfₓ‖.

PROOF Let h = b − a, and define the 𝒞¹ curve γ : [0, 1] → ℝᵐ by

γ(t) = f(a + th).

If f₁, . . . , fₘ are the component functions of f, then γᵢ(t) = fᵢ(a + th) is the ith component function of γ, and

γᵢ′(t) = ∑_{j=1}^{n} D_j fᵢ(a + th) h_j = ∇fᵢ(a + th) · h

by the chain rule.

If the maximal (in absolute value) coordinate of f(b) − f(a) is the kth one, then

‖f(b) − f(a)‖₀ = |f_k(b) − f_k(a)| = |γ_k(1) − γ_k(0)|
             = |γ_k′(τ)|  (for some τ ∈ (0, 1), by the one-variable mean value theorem)
             = |∇f_k(a + τh) · h|
             ≤ ‖∇f_k(a + τh)‖₁ ‖h‖₀
             ≤ ‖df_{a+τh}‖ ‖h‖₀  (since ∇f_k(a + τh) is the kth row of f′(a + τh))
             ≤ ‖b − a‖₀ max_{x∈L} ‖dfₓ‖,

as desired.

∎
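Theorem 2.5 is easy to test on a concrete map. The sketch below (our own example, not from the text, with the maximum over L estimated by sampling the segment) checks the inequality for f(x, y) = (sin x + y, xy), whose Jacobian norm we compute by the row formula of Theorem 2.3:

```python
import math

def f(p):
    x, y = p
    return (math.sin(x) + y, x * y)

def jacobian_norm(p):
    # ||df_p||: max row 1-norm of the Jacobian [[cos x, 1], [y, x]]
    x, y = p
    return max(abs(math.cos(x)) + 1.0, abs(y) + abs(x))

def sup_norm(v):
    return max(abs(c) for c in v)

a, b = (0.0, 0.0), (1.0, 0.5)
lhs = sup_norm([fb - fa for fa, fb in zip(f(a), f(b))])
# estimate max ||df_x|| over the segment L by sampling 101 points of it
M = max(jacobian_norm((a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])))
        for t in (i / 100 for i in range(101)))
rhs = sup_norm([bb - aa for aa, bb in zip(a, b)]) * M
print(lhs <= rhs)  # True
```

Sampling only estimates the maximum from below, but that makes the check conservative in the right direction: the true right-hand side is at least `rhs` only up to sampling error, while the inequality already holds with this estimate.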

If U is a convex open set (that is, each line segment joining two points of U lies in U), and f : U → ℝᵐ is a 𝒞¹ mapping such that ‖dfₓ‖ ≤ M for each x ∈ U, then the mean value theorem says that

‖f(a + h) − f(a)‖₀ ≤ M ‖h‖₀

if a, a + h ∈ U. Speaking very roughly, this says that f(a + h) is approximately equal to the constant f(a) when ‖h‖₀ is very small. The following important corollary to the mean value theorem says (with λ = dfₐ) that the actual difference Δfₐ(h) = f(a + h) − f(a) is approximately equal to the linear difference dfₐ(h) when h is very small.

Corollary 2.6 Let f : U → ℝᵐ be a 𝒞¹ mapping, where U ⊂ ℝⁿ is a neighborhood of the line segment L with endpoints a and a + h. If λ : ℝⁿ → ℝᵐ is a linear mapping, then

‖f(a + h) − f(a) − λ(h)‖₀ ≤ ‖h‖₀ max_{x∈L} ‖dfₓ − λ‖.

PROOF Apply the mean value theorem to the 𝒞¹ mapping g : U → ℝᵐ defined by g(x) = f(x) − λ(x), noting that dgₓ = dfₓ − λ because dλₓ = λ (by Example 3 of Section II.2), and that

g(a + h) − g(a) = f(a + h) − f(a) − λ(h)

because λ is linear.

∎

As a typical application of Corollary 2.6, we can prove in the case m = n that, if U contains the cube Cᵣ of radius r centered at 0, and dfₓ is close (in norm) to the identity mapping I : ℝⁿ → ℝⁿ for all x ∈ Cᵣ, then the image under f of the cube Cᵣ is contained in a slightly larger cube (Fig. 3.7). This seems natural enough: if df is sufficiently close to the identity, then f should be also, so no point should be moved very far.

Figure 3.7

Corollary 2.7 Let U be an open set in ℝⁿ containing the cube Cᵣ, and f : U → ℝⁿ a 𝒞¹ mapping such that f(0) = 0 and df₀ = I. If

‖dfₓ − I‖ ≤ ε

for all x ∈ Cᵣ, then f(Cᵣ) ⊂ C_{(1+ε)r}.

PROOF Applying Corollary 2.6 with a = 0, λ = df₀ = I, and h = x ∈ Cᵣ, we obtain

‖f(x) − x‖₀ ≤ ‖x‖₀ max_{y∈Cᵣ} ‖df_y − I‖ ≤ ε ‖x‖₀.

But ‖f(x)‖₀ ≤ ‖f(x) − x‖₀ + ‖x‖₀ by the triangle inequality, so it follows that

‖f(x)‖₀ ≤ (1 + ε) ‖x‖₀ ≤ (1 + ε) r

as desired.

∎
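Corollary 2.7 can be spot-checked on a toy perturbation of the identity. In the sketch below (the map f is our own example, not from the text) we have f(0) = 0 and df₀ = I, and on the unit cube the row 1-norms of dfₓ − I stay below 0.1, so every image point must land in the cube of radius 1.1:

```python
def f(x, y):
    # a C^1 map with f(0) = 0 and df_0 = I;
    # df_x - I has rows (0.1x, 0) and (0.05y, 0.05x)
    return (x + 0.05 * x * x, y + 0.05 * x * y)

r, eps = 1.0, 0.1  # ||df_x - I|| <= 0.1 on the cube C_1 (max row 1-norm)
ok = True
grid = [i / 10 - 1 for i in range(21)]  # sample points of [-1, 1]
for x in grid:
    for y in grid:
        u, v = f(x, y)
        # is f(x, y) inside the cube of radius (1 + eps) * r?
        ok = ok and max(abs(u), abs(v)) <= (1 + eps) * r
print(ok)  # True
```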

The following corollary is a somewhat deeper application of the mean value theorem. At the same time it illustrates a general phenomenon which is basic to the linear approximation approach to calculus: the fact that simple properties of the differential of a function often reflect deep properties of the function itself. The point here is that the question as to whether a linear mapping is one-to-one is a rather simple matter, while for an arbitrary given 𝒞¹ mapping this may be a quite complicated question.

Corollary 2.8 Let f : ℝⁿ → ℝᵐ be 𝒞¹ at a. If dfₐ : ℝⁿ → ℝᵐ is one-to-one, then f itself is one-to-one on some neighborhood of a.

PROOF Let m be the minimum value of ‖dfₐ(x)‖₀ for x ∈ ∂C₁; then m > 0 because dfₐ is one-to-one [otherwise there would be a point x ≠ 0 with dfₐ(x) = 0]. Choose a positive number ε < m.

Since f is 𝒞¹ at a, there exists δ > 0 such that

‖dfₓ − dfₐ‖ < ε  if  ‖x − a‖₀ < δ.

If x and y are any two distinct points of the neighborhood

{x ∈ ℝⁿ : ‖x − a‖₀ < δ},

then an application of Corollary 2.6, with λ = dfₐ and L the line segment from x to y, yields

‖f(y) − f(x) − dfₐ(y − x)‖₀ ≤ ‖y − x‖₀ max_{z∈L} ‖df_z − dfₐ‖ ≤ ε ‖y − x‖₀.

The triangle inequality then gives

‖f(y) − f(x)‖₀ ≥ ‖dfₐ(y − x)‖₀ − ε ‖y − x‖₀ ≥ m ‖y − x‖₀ − ε ‖y − x‖₀

(since ‖dfₐ(v)‖₀ ≥ m ‖v‖₀ for all v, by the definition of m and homogeneity), so

‖f(y) − f(x)‖₀ ≥ (m − ε) ‖y − x‖₀ > 0.

Thus f(x) ≠ f(y) if x ≠ y.

∎

Corollary 2.8 has the interesting consequence that, if f : ℝⁿ → ℝⁿ is 𝒞¹ with dfₐ one-to-one (so f is one-to-one in a neighborhood of a), and if f is “slightly perturbed” by means of the addition of a “small” term g : ℝⁿ → ℝⁿ, then the new mapping h = f + g is still one-to-one in a neighborhood of a. See Exercise 2.9 for a precise statement of this result.
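The quantities in the proof of Corollary 2.8 are all computable for a concrete map. In the sketch below (our own example) f(x, y) = (x + y², x − y), so df₀ has matrix [[1, 0], [1, −1]]; we estimate m by sampling the boundary of the unit cube, then spot-check the separation inequality ‖f(p) − f(q)‖₀ ≥ (m − ε)‖p − q‖₀ with ε = m/2:

```python
def f(p):
    x, y = p
    return (x + y * y, x - y)

def df0(p):
    # the differential of f at the origin: (x, y) -> (x, x - y)
    x, y = p
    return (x, x - y)

def sup(v):
    return max(abs(c) for c in v)

# estimate m = min ||df_0(x)||_0 over the boundary ||x||_0 = 1 by sampling
ts = [i / 50 - 1 for i in range(101)]
boundary = [(s, t) for t in ts for s in (1, -1)] + \
           [(t, s) for t in ts for s in (1, -1)]
m = min(sup(df0(p)) for p in boundary)
print(m)  # 0.5, attained at points such as (0.5, 1)

# f separates two nearby points at rate at least m - eps = 0.25
p, q = (0.05, 0.02), (-0.03, 0.04)
gap = sup([f(p)[i] - f(q)[i] for i in range(2)])
print(gap >= 0.25 * sup([p[i] - q[i] for i in range(2)]))  # True
```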

In this section we have dealt with the mean value theorem and its corollaries in terms of the sup norms on ℝⁿ and ℝᵐ, and the resulting norm

‖L‖ = max{‖L(x)‖₀ : ‖x‖₀ = 1}.

This will suffice for our purposes. However arbitrary norms ‖·‖ₘ on ℝᵐ and ‖·‖ₙ on ℝⁿ can be used in the mean value theorem, provided we use the norm

‖L‖ = max{‖L(x)‖ₘ : x ∈ ∂Dⁿ}

on ℒᵐⁿ, where Dⁿ is the unit ball in ℝⁿ with respect to the norm ‖·‖ₙ. The conclusion of the mean value theorem is then the expected inequality

‖f(b) − f(a)‖ₘ ≤ ‖b − a‖ₙ max_{x∈L} ‖dfₓ‖.

In Exercises 2.5 and 2.6 we outline an alternative proof of the mean value theorem which establishes it in this generality.

Exercises

2.1 Let ‖·‖ₘ and ‖·‖ₙ be norms on ℝᵐ and ℝⁿ, respectively. Prove that

‖(x, y)‖ = max{‖x‖ₘ, ‖y‖ₙ}

defines a norm on ℝᵐ⁺ⁿ = ℝᵐ × ℝⁿ. Similarly prove that ‖(x, y)‖₁ = ‖x‖ₘ + ‖y‖ₙ defines a norm on ℝᵐ⁺ⁿ.

2.2 Show that ‖·‖, as defined by Eq. (3), is a norm on the space ℳᵐⁿ of m × n matrices.

2.3 Given a ∈ ℝⁿ, denote by Lₐ the linear function

Lₐ(x) = a · x = a₁x₁ + ⋯ + aₙxₙ.

Consider the norms of Lₐ with respect to the sup norm ‖·‖₀ and the 1-norm ‖·‖₁ on ℝⁿ, defined as in the last paragraph of this section. Show that ‖Lₐ‖₁ = ‖a‖₀ while ‖Lₐ‖₀ = ‖a‖₁.

2.4 Let L : ℝⁿ → ℝᵐ be a linear mapping with matrix (aᵢⱼ). If we use the 1-norm on ℝⁿ and the sup norm on ℝᵐ, show that the corresponding norm on ℒᵐⁿ is

‖L‖ = max_{i,j} |a_{ij}|,

that is, the sup norm on ℝᵐⁿ.

2.5 Let γ : [a, b] → ℝᵐ be a 𝒞¹ mapping with ‖γ′(t)‖ ≤ M for all t ∈ [a, b], ‖·‖ being an arbitrary norm on ℝᵐ. Prove that

‖γ(b) − γ(a)‖ ≤ M(b − a).

Outline: Given ε > 0, denote by S_ε the set of points c ∈ [a, b] such that

‖γ(t) − γ(a)‖ ≤ (M + ε)(t − a)

for all t ∈ [a, c]. Let c = lub S_ε. If c < b, then there exists δ > 0 such that

‖γ(t) − γ(c)‖ ≤ (M + ε)(t − c)  for all t ∈ [c, c + δ].

Conclude from this that c + δ ∈ S_ε, a contradiction. Therefore c = b, so

‖γ(b) − γ(a)‖ ≤ (M + ε)(b − a)

for all ε > 0.

2.6 Apply the previous exercise to establish the mean value theorem with respect to arbitrary norms on ℝⁿ and ℝᵐ. In particular, given a 𝒞¹ mapping f : U → ℝᵐ, where U is a neighborhood in ℝⁿ of the line segment L from a to a + h, apply Exercise 2.5 with γ(t) = f(a + th).

2.7 (a) Show that the linear mapping T : ℝⁿ → ℝᵐ is one-to-one if and only if m = min{‖T(x)‖₀ : ‖x‖₀ = 1} is positive.

(b) Conclude that the linear mapping T : ℝⁿ → ℝᵐ is one-to-one if and only if there exists a > 0 such that ‖T(x)‖₀ ≥ a ‖x‖₀ for all x ∈ ℝⁿ.

2.8 Let T : ℝⁿ → ℝᵐ be a one-to-one linear mapping with ‖T(x)‖₀ ≥ a ‖x‖₀ for all x ∈ ℝⁿ, where a > 0. If ‖S − T‖ < a, show that ‖S(x)‖₀ ≥ (a − ‖S − T‖) ‖x‖₀ for all x ∈ ℝⁿ, so S is also one-to-one. Thus the set of all one-to-one linear mappings ℝⁿ → ℝᵐ forms an open subset of ℒᵐⁿ = ℝᵐⁿ.

2.9 Apply Corollary 2.8 and the preceding exercise to prove the following. Let f : ℝⁿ → ℝⁿ be a 𝒞¹ mapping such that dfₐ : ℝⁿ → ℝⁿ is one-to-one, so that f is one-to-one in a neighborhood of a. Then there exists ε > 0 such that, if g : ℝⁿ → ℝⁿ is a 𝒞¹ mapping with g(a) = 0 and ‖dgₐ‖ < ε, then the mapping h : ℝⁿ → ℝⁿ defined by h(x) = f(x) + g(x) is also one-to-one in a neighborhood of a.