
Advanced Calculus of Several Variables (1973)

Part I. Euclidean Space and Linear Mappings

Chapter 3. INNER PRODUCTS AND ORTHOGONALITY

In order to obtain the full geometric structure of ℝⁿ (including the concepts of distance, angles, and orthogonality), we must supply ℝⁿ with an inner product. An inner (scalar) product on the vector space V is a function V × V → ℝ, which associates with each pair (x, y) of vectors in V a real number ⟨x, y⟩, and satisfies the following three conditions:

SP1: ⟨x, x⟩ > 0 if x ≠ 0;
SP2: ⟨x, y⟩ = ⟨y, x⟩;
SP3: ⟨ax + by, z⟩ = a⟨x, z⟩ + b⟨y, z⟩;

for all x, y, z ∈ V and a, b ∈ ℝ.

The third of these conditions is linearity in the first variable; symmetry then gives linearity in the second variable also. Thus an inner product on V is simply a positive, symmetric, bilinear function on V × V. Note that SP3 implies that ⟨0, 0⟩ = 0 (see Exercise 3.1).

The usual inner product on ℝⁿ is denoted by x · y and is defined by

x · y = x₁y₁ + x₂y₂ + · · · + xₙyₙ,    (1)

where x = (x₁, . . . , xₙ), y = (y₁, . . . , yₙ). It should be clear that this definition satisfies conditions SP1–SP3 above. There are many inner products on ℝⁿ (see Example 2 below), but we shall use only the usual one.
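In computational terms the definition is a one-line sum. A minimal Python sketch (the sample vectors are arbitrary) illustrating the formula together with properties SP1 and SP2:

```python
def dot(x, y):
    """The usual inner product on R^n: the sum of coordinatewise products."""
    assert len(x) == len(y)
    return sum(xi * yi for xi, yi in zip(x, y))

x, y = (1.0, 2.0, 3.0), (4.0, -5.0, 6.0)
print(dot(x, y))                # 12.0
print(dot(x, y) == dot(y, x))   # True: SP2 (symmetry)
print(dot(x, x) > 0)            # True: SP1 (positivity, since x != 0)
```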

Example 1 Denote by 𝒞[a, b] the vector space of all continuous functions on the interval [a, b], and define

⟨f, g⟩ = ∫_a^b f(t)g(t) dt

for any pair of functions f, g ∈ 𝒞[a, b]. It is obvious that this definition satisfies conditions SP2 and SP3. It also satisfies SP1, because if f(t₀) ≠ 0, then by continuity (f(t))² > 0 for all t in some neighborhood of t₀, so

⟨f, f⟩ = ∫_a^b (f(t))² dt > 0.

Therefore we have an inner product on 𝒞[a, b].

Example 2 Let a, b, c be real numbers with a > 0, ac − b² > 0, so that the quadratic form q(x) = ax₁² + 2bx₁x₂ + cx₂² is positive-definite (see Section II.4). Then ⟨x, y⟩ = ax₁y₁ + bx₁y₂ + bx₂y₁ + cx₂y₂ defines an inner product on ℝ² (why?). With a = c = 1, b = 0 we obtain the usual inner product on ℝ².
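A quick numeric sanity check of Example 2; a Python sketch in which the constants a = 2, b = 1, c = 3 are sample values satisfying a > 0, ac − b² > 0:

```python
def qf_inner(x, y, a=2.0, b=1.0, c=3.0):
    """Inner product of Example 2 on R^2: a*x1*y1 + b*(x1*y2 + x2*y1) + c*x2*y2."""
    return a * x[0] * y[0] + b * (x[0] * y[1] + x[1] * y[0]) + c * x[1] * y[1]

# Here a > 0 and ac - b^2 = 5 > 0, so <x, x> > 0 for every nonzero x:
for x in [(1.0, 0.0), (0.0, 1.0), (1.0, -1.0), (-2.0, 0.5)]:
    assert qf_inner(x, x) > 0
```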

An inner product on the vector space V yields a notion of the length or “size” of a vector x ∈ V, called its norm ‖x‖. In general, a norm on the vector space V is a real-valued function x → ‖x‖ on V satisfying the following conditions:

N1: ‖x‖ > 0 if x ≠ 0;
N2: ‖ax‖ = |a| ‖x‖;
N3: ‖x + y‖ ≤ ‖x‖ + ‖y‖;

for all x, y ∈ V and a ∈ ℝ. Note that N2 implies that ‖0‖ = 0.

The norm associated with the inner product ⟨ , ⟩ on V is defined by

‖x‖ = ⟨x, x⟩^{1/2}.    (2)

It is clear that SP1–SP3 and this definition imply conditions N1 and N2, but the triangle inequality is not so obvious; it will be verified below.

The most commonly used norm on ℝⁿ is the Euclidean norm

‖x‖ = (x₁² + x₂² + · · · + xₙ²)^{1/2},

which comes in the above way from the usual inner product on ℝⁿ. Other norms on ℝⁿ, not necessarily associated with inner products, are occasionally employed, but henceforth ‖x‖ will denote the Euclidean norm unless otherwise specified.

Example 3 ‖x‖ = max{|x₁|, . . . , |xₙ|}, the maximum of the absolute values of the coordinates of x, defines a norm on ℝⁿ (see Exercise 3.2).

Example 4 ‖x‖₁ = |x₁| + |x₂| + · · · + |xₙ| defines still another norm on ℝⁿ (again see Exercise 3.2).
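The Euclidean norm and the norms of Examples 3 and 4 are easy to compare side by side; a Python sketch with an arbitrary sample vector:

```python
import math

def norm_euclid(x):
    """Euclidean norm: square root of the sum of squares."""
    return math.sqrt(sum(xi * xi for xi in x))

def norm_max(x):
    """Norm of Example 3: maximum absolute coordinate."""
    return max(abs(xi) for xi in x)

def norm_one(x):
    """Norm of Example 4: sum of absolute coordinates."""
    return sum(abs(xi) for xi in x)

x = (3.0, -4.0, 0.0)
print(norm_euclid(x), norm_max(x), norm_one(x))  # 5.0 4.0 7.0
```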

A norm on V provides a definition of the distance d(x, y) between any two points x and y of V:

d(x, y) = ‖x − y‖.

Note that a distance function d defined in this way satisfies the following three conditions:

D1: d(x, y) > 0 if x ≠ y;
D2: d(x, y) = d(y, x);
D3: d(x, z) ≤ d(x, y) + d(y, z);

for any three points x, y, z. Conditions D1 and D2 follow immediately from N1 and N2, respectively, while

d(x, z) = ‖x − z‖ = ‖(x − y) + (y − z)‖ ≤ ‖x − y‖ + ‖y − z‖ = d(x, y) + d(y, z)

by N3. Figure 1.1 indicates why N3 (or D3) is referred to as the triangle inequality.

Figure 1.1

The distance function that comes in this way from the Euclidean norm is the familiar Euclidean distance function

d(x, y) = ‖x − y‖ = [(x₁ − y₁)² + · · · + (xₙ − yₙ)²]^{1/2}.

Thus far we have seen that an inner product on the vector space V yields a norm on V, which in turn yields a distance function on V, except that we have not yet verified that the norm associated with a given inner product does indeed satisfy the triangle inequality. The triangle inequality will follow from the Cauchy–Schwarz inequality of the following theorem.

Theorem 3.1 If ⟨ , ⟩ is an inner product on a vector space V, then

|⟨x, y⟩| ≤ ‖x‖ ‖y‖

for all x, y ∈ V [where the norm is the one defined by (2)].

PROOF The inequality is trivial if either x or y is zero, so assume neither is. If u = x/‖x‖ and v = y/‖y‖, then ‖u‖ = ‖v‖ = 1. Hence

0 ≤ ⟨u − v, u − v⟩ = ⟨u, u⟩ − 2⟨u, v⟩ + ⟨v, v⟩ = 2 − 2⟨u, v⟩.

So ⟨u, v⟩ ≤ 1, that is, ⟨x/‖x‖, y/‖y‖⟩ ≤ 1, or

⟨x, y⟩ ≤ ‖x‖ ‖y‖.

Replacing x by −x, we obtain

−⟨x, y⟩ ≤ ‖x‖ ‖y‖

also, so the inequality follows.  ∎

The Cauchy–Schwarz inequality is of fundamental importance. With the usual inner product in ℝⁿ, it takes the form

|x₁y₁ + · · · + xₙyₙ| ≤ (x₁² + · · · + xₙ²)^{1/2} (y₁² + · · · + yₙ²)^{1/2},

while in 𝒞[a, b], with the inner product of Example 1, it becomes

|∫_a^b f(t)g(t) dt| ≤ (∫_a^b f(t)² dt)^{1/2} (∫_a^b g(t)² dt)^{1/2}.
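The first of these forms can be spot-checked numerically; a Python sketch whose random test vectors are for illustration only:

```python
import math
import random

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

random.seed(0)
for _ in range(1000):
    x = [random.uniform(-1, 1) for _ in range(5)]
    y = [random.uniform(-1, 1) for _ in range(5)]
    lhs = abs(dot(x, y))
    rhs = math.sqrt(dot(x, x)) * math.sqrt(dot(y, y))
    assert lhs <= rhs + 1e-12   # small tolerance for floating-point error
```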

PROOF OF THE TRIANGLE INEQUALITY Given x, y ∈ V, note that

‖x + y‖² = ⟨x + y, x + y⟩ = ‖x‖² + 2⟨x, y⟩ + ‖y‖² ≤ ‖x‖² + 2‖x‖ ‖y‖ + ‖y‖² = (‖x‖ + ‖y‖)²,

which implies that ‖x + y‖ ≤ ‖x‖ + ‖y‖.  ∎

Notice that, if ⟨x, y⟩ = 0, so that x and y are perpendicular (see the definition below), then the second equality in the above proof gives

‖x + y‖² = ‖x‖² + ‖y‖².

This is the famous theorem associated with the name of Pythagoras (Fig. 1.2).

Figure 1.2

Recalling the formula x · y = ‖x‖ ‖y‖ cos θ for the usual inner product in ℝ², we are motivated to define the angle ∠(x, y) between nonzero vectors x, y ∈ ℝⁿ by

∠(x, y) = arccos (x · y)/(‖x‖ ‖y‖).

Notice that this makes sense because |x · y|/(‖x‖ ‖y‖) ≤ 1 by the Cauchy–Schwarz inequality. In particular we say that x and y are orthogonal (or perpendicular) if and only if x · y = 0, because then ∠(x, y) = arccos 0 = π/2.

A set of nonzero vectors v₁, v₂, . . . in V is said to be an orthogonal set if

⟨vᵢ, vⱼ⟩ = 0

whenever i ≠ j. If in addition each vᵢ is a unit vector, ⟨vᵢ, vᵢ⟩ = 1, then the set is said to be orthonormal.

Example 5 The standard basis vectors e₁, . . . , eₙ form an orthonormal set in ℝⁿ.

Example 6 The (infinite) set of functions

1, cos x, sin x, cos 2x, sin 2x, cos 3x, sin 3x, . . .

is orthogonal in 𝒞[−π, π] (see Example 1 and Exercise 3.11). This fact is the basis for the theory of Fourier series.

The most important property of orthogonal sets is given by the following result.

Theorem 3.2 Every finite orthogonal set of nonzero vectors is linearly independent.

PROOF Suppose that

a₁v₁ + a₂v₂ + · · · + aₖvₖ = 0.    (3)

Taking the inner product with vᵢ, we obtain

0 = ⟨a₁v₁ + · · · + aₖvₖ, vᵢ⟩ = aᵢ⟨vᵢ, vᵢ⟩,

because ⟨vᵢ, vⱼ⟩ = 0 for i ≠ j if the vectors v₁, . . . , vₖ are orthogonal. But ⟨vᵢ, vᵢ⟩ ≠ 0, so aᵢ = 0. Thus (3) implies a₁ = · · · = aₖ = 0, so the orthogonal vectors v₁, . . . , vₖ are linearly independent.  ∎

We now describe the important Gram–Schmidt orthogonalization process for constructing orthogonal bases. It is motivated by the following elementary construction. Given two linearly independent vectors v and w₁, we want to find a nonzero vector w₂ that lies in the subspace spanned by v and w₁, and is orthogonal to w₁. Figure 1.3 suggests that such a vector w₂ can be obtained by subtracting from v an appropriate multiple cw₁ of w₁. To determine c,

Figure 1.3

we simply solve the equation ⟨w₁, v − cw₁⟩ = 0 for c = ⟨v, w₁⟩/⟨w₁, w₁⟩. The desired vector is therefore

w₂ = v − (⟨v, w₁⟩/⟨w₁, w₁⟩) w₁,

obtained by subtracting from v the “component of v parallel to w₁.” We immediately verify that ⟨w₂, w₁⟩ = 0, while w₂ ≠ 0 because v and w₁ are linearly independent.
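In code the construction is a single projection step; a Python sketch with arbitrary sample vectors:

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def project_out(v, w1):
    """Return w2 = v - (<v, w1>/<w1, w1>) w1, which is orthogonal to w1."""
    c = dot(v, w1) / dot(w1, w1)
    return [vi - c * wi for vi, wi in zip(v, w1)]

v, w1 = [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]
w2 = project_out(v, w1)
print(w2, dot(w2, w1))   # [0.5, -0.5, 1.0] 0.0
```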

Theorem 3.3 If V is a finite-dimensional vector space with an inner product, then V has an orthogonal basis.

In particular, every subspace of ℝⁿ has an orthogonal basis.

PROOF We start with an arbitrary basis v₁, . . . , vₙ for V. Let w₁ = v₁. Then, by the preceding construction, the nonzero vector

w₂ = v₂ − (⟨v₂, w₁⟩/⟨w₁, w₁⟩) w₁

is orthogonal to w₁ and lies in the subspace generated by v₁ and v₂.

Suppose inductively that we have found an orthogonal basis w₁, . . . , wₖ for the subspace of V that is generated by v₁, . . . , vₖ. The idea is then to obtain wₖ₊₁ by subtracting from vₖ₊₁ its components parallel to each of the vectors w₁, . . . , wₖ. That is, define

wₖ₊₁ = vₖ₊₁ − c₁w₁ − c₂w₂ − · · · − cₖwₖ,

where cᵢ = ⟨vₖ₊₁, wᵢ⟩/⟨wᵢ, wᵢ⟩. Then ⟨wₖ₊₁, wᵢ⟩ = ⟨vₖ₊₁, wᵢ⟩ − cᵢ⟨wᵢ, wᵢ⟩ = 0 for i = 1, . . . , k, and wₖ₊₁ ≠ 0, because otherwise vₖ₊₁ would be a linear combination of the vectors w₁, . . . , wₖ, and therefore of the vectors v₁, . . . , vₖ. It follows that the vectors w₁, . . . , wₖ₊₁ form an orthogonal basis for the subspace of V that is generated by v₁, . . . , vₖ₊₁.

After a finite number of such steps we obtain the desired orthogonal basis w₁, . . . , wₙ for V.  ∎

It is the method of proof of Theorem 3.3 that is known as the Gram–Schmidt orthogonalization process, summarized by the equations

w₁ = v₁,
w₂ = v₂ − (⟨v₂, w₁⟩/⟨w₁, w₁⟩) w₁,
w₃ = v₃ − (⟨v₃, w₁⟩/⟨w₁, w₁⟩) w₁ − (⟨v₃, w₂⟩/⟨w₂, w₂⟩) w₂,
. . . . . . . . . . . . . . .
wₙ = vₙ − (⟨vₙ, w₁⟩/⟨w₁, w₁⟩) w₁ − · · · − (⟨vₙ, wₙ₋₁⟩/⟨wₙ₋₁, wₙ₋₁⟩) wₙ₋₁,    (4)

defining the orthogonal basis w₁, . . . , wₙ in terms of the original basis v₁, . . . , vₙ.

Example 7 To find an orthogonal basis for the subspace V of ℝ⁴ spanned by the vectors v₁ = (1, 1, 0, 0), v₂ = (1, 0, 1, 0), v₃ = (0, 1, 0, 1), we write

w₁ = v₁ = (1, 1, 0, 0),
w₂ = v₂ − (⟨v₂, w₁⟩/⟨w₁, w₁⟩) w₁ = (1, 0, 1, 0) − (1/2)(1, 1, 0, 0) = (1/2, −1/2, 1, 0),
w₃ = v₃ − (⟨v₃, w₁⟩/⟨w₁, w₁⟩) w₁ − (⟨v₃, w₂⟩/⟨w₂, w₂⟩) w₂
   = (0, 1, 0, 1) − (1/2)(1, 1, 0, 0) + (1/3)(1/2, −1/2, 1, 0) = (−1/3, 1/3, 1/3, 1).
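The equations (4) translate directly into a short routine. A sketch in plain Python, using exact rational arithmetic so that its output matches the hand computation of Example 7:

```python
from fractions import Fraction

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vs):
    """Orthogonalize a list of linearly independent vectors, per Eqs. (4)."""
    ws = []
    for v in vs:
        v = [Fraction(a) for a in v]
        w = v[:]
        for u in ws:
            c = dot(v, u) / dot(u, u)   # <v, w_i>/<w_i, w_i>
            w = [wi - c * ui for wi, ui in zip(w, u)]
        ws.append(w)
    return ws

for w in gram_schmidt([(1, 1, 0, 0), (1, 0, 1, 0), (0, 1, 0, 1)]):
    print([str(a) for a in w])
# ['1', '1', '0', '0']
# ['1/2', '-1/2', '1', '0']
# ['-1/3', '1/3', '1/3', '1']
```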

Example 8 Let 𝒫 denote the vector space of polynomials in x, with inner product defined by

⟨p, q⟩ = ∫_{−1}^{1} p(x)q(x) dx.

By applying the Gram–Schmidt orthogonalization process to the linearly independent elements 1, x, x², . . . , xⁿ, . . . , one obtains an infinite sequence {pₙ(x)} of orthogonal polynomials, the first five of which are

p₀(x) = 1,  p₁(x) = x,  p₂(x) = x² − 1/3,  p₃(x) = x³ − (3/5)x,  p₄(x) = x⁴ − (6/7)x² + 3/35

(see Exercise 3.12). Upon multiplying the polynomials {pₙ(x)} by appropriate constants, one obtains the famous Legendre polynomials P₂(x) = (3x² − 1)/2, P₃(x) = (5x³ − 3x)/2, etc.
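Example 8 can be reproduced symbolically; a sketch assuming the sympy library is available, with the inner product taken as the integral over [−1, 1] defined above:

```python
import sympy as sp

x = sp.symbols('x')

def inner(p, q):
    """The inner product of Example 8: integral of p*q over [-1, 1]."""
    return sp.integrate(p * q, (x, -1, 1))

ps = []
for n in range(5):
    p = x**n
    for q in ps:                # Gram-Schmidt step against each earlier p_i
        p = p - inner(x**n, q) / inner(q, q) * q
    ps.append(sp.expand(p))

print(ps)  # [1, x, x**2 - 1/3, x**3 - 3*x/5, x**4 - 6*x**2/7 + 3/35]
```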

One reason for the importance of orthogonal bases is the ease with which a vector v ∈ V can be expressed as a linear combination of orthogonal basis vectors w₁, . . . , wₙ for V. Writing

v = a₁w₁ + a₂w₂ + · · · + aₙwₙ

and taking the inner product with wᵢ, we immediately obtain

⟨v, wᵢ⟩ = aᵢ⟨wᵢ, wᵢ⟩,

so

v = (⟨v, w₁⟩/⟨w₁, w₁⟩)w₁ + · · · + (⟨v, wₙ⟩/⟨wₙ, wₙ⟩)wₙ.

This is especially simple if w₁, . . . , wₙ is an orthonormal basis for V:

v = (v · w₁)w₁ + (v · w₂)w₂ + · · · + (v · wₙ)wₙ.    (5)
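Computing the coefficients in (5), or in the preceding formula for a basis that is merely orthogonal, is a matter of n inner products. A small Python sketch (the orthogonal basis of ℝ² below is a sample):

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def coefficients(v, ws):
    """Coefficients of v relative to an orthogonal basis ws: <v, w_i>/<w_i, w_i>."""
    return [dot(v, w) / dot(w, w) for w in ws]

ws = [(1.0, 1.0), (1.0, -1.0)]   # a sample orthogonal basis of R^2
v = (3.0, 1.0)
print(coefficients(v, ws))        # [2.0, 1.0], i.e. v = 2*w1 + 1*w2
```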

Of course orthonormal basis vectors are easily obtained from orthogonal ones, simply by dividing by their lengths. In this case the coefficient v · wᵢ of wᵢ in (5) is sometimes called the Fourier coefficient of v with respect to wᵢ. This terminology is motivated by an analogy with Fourier series. The orthonormal functions in 𝒞[−π, π] corresponding to the orthogonal functions of Example 6 are

1/√(2π), (cos x)/√π, (sin x)/√π, (cos 2x)/√π, (sin 2x)/√π, . . . .

Writing

f(x) = a₀/2 + (a₁ cos x + b₁ sin x) + (a₂ cos 2x + b₂ sin 2x) + · · · ,

one defines the Fourier coefficients of f ∈ 𝒞[−π, π] by

aₙ = (1/π) ∫_{−π}^{π} f(x) cos nx dx,  n = 0, 1, 2, . . . ,

and

bₙ = (1/π) ∫_{−π}^{π} f(x) sin nx dx,  n = 1, 2, 3, . . . .

It can then be established, under appropriate conditions on f, that the infinite series

a₀/2 + (a₁ cos x + b₁ sin x) + (a₂ cos 2x + b₂ sin 2x) + · · ·    (6)

converges to f(x). This infinite series may be regarded as an infinite-dimensional analog of (5).
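The coefficient formulas lend themselves to numerical approximation. A Python sketch assuming numpy, with f(x) = x as a sample function and a plain Riemann sum as the quadrature; it recovers the classical coefficients bₙ = 2(−1)ⁿ⁺¹/n:

```python
import numpy as np

def fourier_coeffs(f, n_max, n_grid=100000):
    """Approximate a_n and b_n on [-pi, pi] by a plain Riemann sum."""
    t = np.linspace(-np.pi, np.pi, n_grid, endpoint=False)
    dt = 2 * np.pi / n_grid
    ft = f(t)
    a = [np.sum(ft * np.cos(n * t)) * dt / np.pi for n in range(n_max + 1)]
    b = [np.sum(ft * np.sin(n * t)) * dt / np.pi for n in range(1, n_max + 1)]
    return a, b

a, b = fourier_coeffs(lambda t: t, 4)   # sample function f(x) = x
print(np.round(b, 4))                   # approximately [ 2. -1.  0.6667 -0.5]
```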

Given a subspace V of ℝⁿ, denote by V⊥ the set of all those vectors in ℝⁿ, each of which is orthogonal to every vector in V. Then it is easy to show that V⊥ is a subspace of ℝⁿ, called the orthogonal complement of V (Exercise 3.3). The significant fact about this situation is that the dimensions add up as they should.

Theorem 3.4 If V is a subspace of ℝⁿ, then

dim V + dim V⊥ = n.

PROOF By Theorem 3.3, there exist an orthonormal basis v₁, . . . , vᵣ for V and an orthonormal basis w₁, . . . , wₛ for V⊥. Then the vectors v₁, . . . , vᵣ, w₁, . . . , wₛ are orthonormal, and therefore linearly independent. So in order to conclude from Theorem 2.5 that r + s = n, as desired, it suffices to show that these vectors generate ℝⁿ. Given x ∈ ℝⁿ, define

y = x − (x · v₁)v₁ − (x · v₂)v₂ − · · · − (x · vᵣ)vᵣ.    (7)

Then y · vᵢ = x · vᵢ − (x · vᵢ)(vᵢ · vᵢ) = 0 for each i = 1, . . . , r. Since y is orthogonal to each element of a basis for V, it follows easily that y ∈ V⊥ (Exercise 3.4). Therefore Eq. (5) above gives

y = (y · w₁)w₁ + · · · + (y · wₛ)wₛ.

This and (7) then yield

x = (x · v₁)v₁ + · · · + (x · vᵣ)vᵣ + (y · w₁)w₁ + · · · + (y · wₛ)wₛ,

so the vectors v₁, . . . , vᵣ, w₁, . . . , wₛ constitute a basis for ℝⁿ.  ∎

Example 9 Consider the system

a₁₁x₁ + a₁₂x₂ + · · · + a₁ₙxₙ = 0
a₂₁x₁ + a₂₂x₂ + · · · + a₂ₙxₙ = 0
. . . . . . . . . . . . . . .
aₖ₁x₁ + aₖ₂x₂ + · · · + aₖₙxₙ = 0    (8)

of k homogeneous linear equations in x₁, . . . , xₙ. If aᵢ = (aᵢ₁, . . . , aᵢₙ), i = 1, . . . , k, then these equations can be rewritten as

a₁ · x = 0,  a₂ · x = 0,  . . . ,  aₖ · x = 0.

Therefore the set S of all solutions of (8) is simply the set of all those vectors x ∈ ℝⁿ that are orthogonal to the vectors a₁, . . . , aₖ. If V is the subspace of ℝⁿ generated by a₁, . . . , aₖ, it follows that S = V⊥ (Exercise 3.4). If the vectors a₁, . . . , aₖ are linearly independent, so that dim V = k, we can then conclude from Theorem 3.4 that dim S = n − k.
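Example 9 is easy to check numerically: the solution set S is the null space of the coefficient matrix, which modern software computes via the singular value decomposition (a technique beyond this chapter). A numpy sketch, with a sample matrix of k = 2 independent rows in ℝ⁴:

```python
import numpy as np

# A sample coefficient matrix: k = 2 independent rows a_1, a_2 in R^4.
A = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, 1.0]])

# The rows of Vt beyond rank(A) span the null space of A, i.e. S = V-perp.
_, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-12))
null_basis = Vt[rank:]

print(rank, null_basis.shape[0])         # 2 2, so dim V + dim S = 4 = n
print(np.allclose(A @ null_basis.T, 0))  # True: each solution is orthogonal to every a_i
```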

Exercises

3.1 Conclude from condition SP3 that ⟨0, 0⟩ = 0.

3.2 Verify that the functions defined in Examples 3 and 4 are norms on ℝⁿ.

3.3 If V is a subspace of ℝⁿ, prove that V⊥ is also a subspace.

3.4 If the vectors a₁, . . . , aₖ generate the subspace V of ℝⁿ, and x ∈ ℝⁿ is orthogonal to each of these vectors, show that x ∈ V⊥.

3.5 Verify the “polarization identity” ⟨x, y⟩ = ¼(‖x + y‖² − ‖x − y‖²).

3.6 Let a₁, a₂, . . . , aₙ be an orthonormal basis for ℝⁿ. If x = s₁a₁ + · · · + sₙaₙ and y = t₁a₁ + · · · + tₙaₙ, show that x · y = s₁t₁ + · · · + sₙtₙ. That is, in computing x · y, one may replace the coordinates of x and y by their components relative to any orthonormal basis for ℝⁿ.

3.7 Orthogonalize the basis (1, 0, 0, 1), (−1, 0, 2, 1), (0, 1, 2, 0), (0, 0, −1, 1) in ℝ⁴.

3.8 Orthogonalize the basis

[…]

in ℝⁿ.

3.9 Find an orthogonal basis for the 3-dimensional subspace V of ℝ⁴ that consists of all solutions of the equation x₁ + x₂ + x₃ − x₄ = 0. Hint: Orthogonalize the vectors v₁ = (1, 0, 0, 1), v₂ = (0, 1, 0, 1), v₃ = (0, 0, 1, 1).

3.10 Consider the two equations

[…]    (*)
[…]    (**)

Let V be the set of all solutions of (*) and W the set of all solutions of both equations. Then W is a 2-dimensional subspace of the 3-dimensional subspace V of ℝ⁴ (why?).

(a) Solve (*) and (**) to find a basis v₁, v₂ for W.

(b) Find by inspection a vector v₃ which is in V but not in W. Why is v₁, v₂, v₃ then a basis for V?

(c) Orthogonalize v₁, v₂, v₃ to obtain an orthogonal basis w₁, w₂, w₃ for V, with w₁ and w₂ in W.

(d) Normalize w₁, w₂, w₃ to obtain an orthonormal basis u₁, u₂, u₃ for V. Express v = (11, 3, 6, −11) as a linear combination of u₁, u₂, u₃.

(e) Find vectors x ∈ W and y orthogonal to W such that v = x + y.

3.11 Show that the functions

1, cos x, sin x, cos 2x, sin 2x, . . .

are orthogonal in the inner product space 𝒞[−π, π] of Example 1.

3.12 Orthogonalize in 𝒫 the functions 1, x, x², x³, x⁴ to obtain the polynomials p₀(x), . . . , p₄(x) listed in Example 8.