Two-Dimensional Calculus (2011)
Chapter 2. Differentiation
10. Quadratic forms
The study of quadratic forms affords a striking illustration of the usefulness of the method of Lagrange multipliers. Since quadratic forms play a basic role in many parts of mathematics, including the study of maxima and minima for functions of several variables, it is well worth deriving some of their basic properties.
Definition 10.1 A quadratic form is a homogeneous quadratic polynomial.
Thus, a quadratic form q(x, y) in two variables has the form
$$q(x, y) = ax^2 + 2bxy + cy^2, \tag{10.1}$$
where a, b, and c are constants. As we remarked in the case of general homogeneous functions, it is sufficient to know the values of the function on the unit circle x² + y² = 1 in order to know it everywhere. Specifically, we may write any point (x, y) as (r cos α, r sin α), and we have from Eq. (10.1) that
$$q(x, y) = r^2\,(a\cos^2\alpha + 2b\cos\alpha\sin\alpha + c\sin^2\alpha) = r^2\, q(\cos\alpha, \sin\alpha). \tag{10.2}$$
We therefore pose the following problem:
“Find the maximum and minimum values of q(x, y) on the unit circle x² + y² = 1.”
We note first that every point (x, y) on the unit circle may be written as x = cos t, y = sin t, and therefore if x² + y² = 1, we have
$$q(x, y) = a\cos^2 t + 2b\cos t\sin t + c\sin^2 t,$$
which is a continuous function of t for 0 ≤ t ≤ 2π, and hence assumes a maximum and minimum. We may therefore apply the method of Lagrange multipliers. We set
$$g(x, y) = x^2 + y^2$$
and state the problem as follows:
“Find the maximum and minimum of q(x, y) if g(x, y) = 1.”
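We first check the gradient of g,
$$\nabla g = \langle 2x, 2y \rangle. \tag{10.3}$$
Thus ∇g = 0 only at the origin, and in particular ∇g ≠ 0 on the level curve g(x, y) = 1. We conclude that if (x₀, y₀) is a point on x² + y² = 1 where q(x, y) has a maximum or minimum, then there exists a value λ for which
$$\nabla q(x_0, y_0) = \lambda\,\nabla g(x_0, y_0). \tag{10.4}$$
But
$$\nabla q = \langle 2ax + 2by,\; 2bx + 2cy \rangle,$$
and comparing this with Eq. (10.3), we may write Eq. (10.4) in the form
$$ax_0 + by_0 = \lambda x_0, \qquad bx_0 + cy_0 = \lambda y_0. \tag{10.4a}$$
These equations must be solved simultaneously with
$$x_0^2 + y_0^2 = 1. \tag{10.6}$$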
There is a useful trick which may be used whenever the Lagrange multiplier method is applied to homogeneous functions (see Ex. 10.26). We multiply the first equation of Eqs. (10.4a) by x₀, the second by y₀, and add; this yields
$$ax_0^2 + 2bx_0y_0 + cy_0^2 = \lambda\,(x_0^2 + y_0^2),$$
or, by virtue of Eq. (10.6),
$$q(x_0, y_0) = \lambda.$$
This result may be described as follows: The value of the Lagrange multiplier λ is in this case precisely the value of the function q(x, y) that we are seeking.
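We may now easily determine the value of λ from Eqs. (10.4a). To do so, we rewrite these equations by transposing the right-hand sides:
$$(a - \lambda)x_0 + by_0 = 0, \qquad bx_0 + (c - \lambda)y_0 = 0. \tag{10.4b}$$
These are two linear homogeneous equations in the unknowns x₀, y₀ which by Eq. (10.6) have a nontrivial solution (x₀, y₀) ≠ (0, 0). This can only happen if the determinant of the system is zero; that is,
$$\begin{vmatrix} a - \lambda & b \\ b & c - \lambda \end{vmatrix} = 0,$$
or
$$\lambda^2 - (a + c)\lambda + (ac - b^2) = 0. \tag{10.8}$$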
But the value of λ at the maximum point as well as the value at the minimum point must satisfy this equation. We have therefore proved the following theorem.
Theorem 10.1 The roots of Eq. (10.8) represent the maximum and minimum values of ax² + 2bxy + cy² on x² + y² = 1.
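Theorem 10.1 can be checked numerically. The following Python sketch is an illustration only and is not part of the original text; the function names `extrema_from_theorem` and `extrema_by_sampling` are hypothetical. The first solves Eq. (10.8) with the quadratic formula, and the second evaluates q(cos t, sin t) at many values of t as an independent check.

```python
import math

def extrema_from_theorem(a, b, c):
    """Return (max, min) of a x^2 + 2b xy + c y^2 on the unit circle,
    computed as the roots of lambda^2 - (a + c) lambda + (ac - b^2) = 0."""
    half_trace = (a + c) / 2.0
    # One quarter of the discriminant is ((a - c)/2)^2 + b^2, which is never
    # negative, so both roots are real.
    radius = math.hypot((a - c) / 2.0, b)
    return half_trace + radius, half_trace - radius

def extrema_by_sampling(a, b, c, n=100000):
    """Brute-force check: evaluate q(cos t, sin t) at n equally spaced t."""
    values = []
    for k in range(n):
        t = 2.0 * math.pi * k / n
        x, y = math.cos(t), math.sin(t)
        values.append(a * x * x + 2.0 * b * x * y + c * y * y)
    return max(values), min(values)

# q(x, y) = x^2 + 2xy + y^2, that is, a = b = c = 1
print(extrema_from_theorem(1, 1, 1))   # (2.0, 0.0)
print(extrema_by_sampling(1, 1, 1))    # agrees up to sampling error
```

The sampled values approach the roots only as n grows, whereas the theorem gives them exactly from a, b, and c.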
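Example 10.1
$$q(x, y) = x^2 + 2xy + y^2$$
Here
$$a = b = c = 1.$$
Equation (10.8) becomes
$$\lambda^2 - 2\lambda = 0.$$
Hence
$$\max q(x, y) = 2, \qquad \min q(x, y) = 0$$
on x² + y² = 1.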
Example 10.2
In this case
and Eq. (10.8) becomes
Thus
on x² + y² = 1.
Example 10.3
Here
and Eq. (10.8) becomes
Thus
on x² + y² = 1.
There are several remarkable features that distinguish the solution provided by Th. 10.1 from that of a typical maximum-minimum problem in calculus. We note the following:
1. The standard procedure for finding the maximum of a function requires us first to find the point at which the maximum occurs, and then to evaluate the function at that point. Using Th. 10.1, we obtain the maximum value directly. If (as is usually not the case) we wish to know the point at which the maximum is assumed, we may return to Eqs. (10.4b), substitute in the value of λ, and solve. Thus, in Example 10.1, the maximum value is λ = 2, and since a = b = c = 1, Eqs. (10.4b) become −x₀ + y₀ = 0, x₀ − y₀ = 0; these equations are both equivalent to y₀ = x₀, and since x₀² + y₀² = 1, we have 2x₀² = 1, so that x₀ = ±1/√2. The maximum of x² + 2xy + y² therefore occurs at (1/√2, 1/√2) and (−1/√2, −1/√2).
2. Calculus is used to reduce this maximum-minimum problem to the solution of an algebraic equation. For each case the solution is then obtained by simple algebra.
3. In a large class of problems it is not even necessary to find the precise maximum and minimum values, but merely to know if they are positive or negative. Again a simple algebraic procedure provides the answer.
The following observations illustrate point 3 above.
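Let λ₁ and λ₂ be the maximum and minimum, respectively, of q(x, y) on x² + y² = 1. Thus,
$$\lambda_2 \le q(x, y) \le \lambda_1 \tag{10.9}$$
for x² + y² = 1. Since λ₁ and λ₂ are the roots of Eq. (10.8), we have
$$\lambda_1\lambda_2 = ac - b^2, \qquad \lambda_1 + \lambda_2 = a + c. \tag{10.10}$$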
We note next that the following statements are equivalent.
Conditions I
a. q(x, y) > 0, for all (x, y) ≠ (0, 0)
b. q(x, y) > 0, for x² + y² = 1
c. λ₁ > 0, λ₂ > 0
d. λ₁λ₂ > 0, λ₁ + λ₂ > 0
e. ac − b² > 0, a + c > 0
f. ac − b² > 0, a > 0
That statements a and b are equivalent follows from Eq. (10.2), since r² > 0 unless (x, y) = (0, 0). That statements b and c are equivalent follows from Eq. (10.9). The equivalence of statements c and d is trivial, and so is that of statements d and e, using Eq. (10.10). Finally, if ac − b² > 0, then ac > b² ≥ 0, so that a and c have the same sign. Thus,
$$a + c > 0 \iff a > 0,$$
so that statements e and f are equivalent.
Definition 10.2 A quadratic form is called positive definite if it satisfies condition I.a.
The equivalence of the various statements listed under Conditions I allows us to check at a glance whether a given quadratic form is positive definite or not. Generally speaking, condition I.f is the quickest to verify.
Example 10.4
Here
Hence, q(x, y) is not positive definite.
Example 10.5
Here
Also a > 0. Hence, q(x, y) is positive definite.
Example 10.6
Here
Thus, q(x, y) is not positive definite.
In precisely the same way as outlined above, one may prove that the following sets of conditions are equivalent.
Conditions II
a. q(x, y) < 0, for all (x, y) ≠ (0, 0)
b. q(x, y) < 0, for x² + y² = 1
c. λ₁ < 0, λ₂ < 0
d. λ₁λ₂ > 0, λ₁ + λ₂ < 0
e. ac − b² > 0, a + c < 0
f. ac − b² > 0, a < 0
Definition 10.3 A quadratic form satisfying condition II.a is called negative definite.
Conditions III
a. q(x, y) ≥ 0, for all (x, y)
   q(x₀, y₀) = 0, for some (x₀, y₀) ≠ (0, 0)
b. q(x, y) ≥ 0, for x² + y² = 1
   q(x₀, y₀) = 0, for some (x₀, y₀) with x₀² + y₀² = 1
c. λ₁ ≥ 0, λ₂ = 0
d. λ₁λ₂ = 0, λ₁ + λ₂ ≥ 0
e. ac − b² = 0, a + c ≥ 0
Conditions IV
a. q(x, y) ≤ 0, for all (x, y)
   q(x₀, y₀) = 0, for some (x₀, y₀) ≠ (0, 0)
b. q(x, y) ≤ 0, for x² + y² = 1
   q(x₀, y₀) = 0, for some (x₀, y₀) with x₀² + y₀² = 1
c. λ₁ = 0, λ₂ ≤ 0
d. λ₁λ₂ = 0, λ₁ + λ₂ ≤ 0
e. ac − b² = 0, a + c ≤ 0
Definition 10.4 A quadratic form is called positive semidefinite if it satisfies condition III.a and negative semidefinite if it satisfies condition IV.a.
Finally, the following conditions are equivalent.
Conditions V
a. q(x, y) takes on both positive and negative values
b. q(x, y) takes on both positive and negative values on x² + y² = 1
c. λ₁ > 0, λ₂ < 0
d. λ₁λ₂ < 0
e. ac − b² < 0
We may summarize the situation as follows.
Case 1. ac − b² < 0
q(x, y) changes sign
Case 2. ac − b² > 0
q(x, y) does not change sign
q(x, y) is always positive if a > 0, and always negative if a < 0
q(x, y) is a definite form
Case 3. ac − b² = 0
q(x, y) does not change sign
q(x, y) does take on the value zero
q(x, y) is a semidefinite form
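The three cases above translate directly into a short test on the coefficients. The Python sketch below is an illustration under stated assumptions, not part of the original text: the function name `classify` and the returned labels are hypothetical, and the comparisons are exact, so the coefficients should be integers or `Fraction` values rather than floating-point numbers. The form with a = b = c = 0 satisfies both Conditions III and IV and is reported separately.

```python
from fractions import Fraction

def classify(a, b, c):
    """Classify q(x, y) = a x^2 + 2b xy + c y^2 from ac - b^2 and a + c.
    Comparisons are exact, so pass integers or Fractions rather than floats."""
    det = a * c - b * b
    if det < 0:
        return "indefinite"               # Case 1: q changes sign
    if det > 0:
        # Case 2: a definite form; the sign of a decides which kind
        return "positive definite" if a > 0 else "negative definite"
    # Case 3 (det == 0): a semidefinite form; the sign of a + c decides
    if a + c > 0:
        return "positive semidefinite"
    if a + c < 0:
        return "negative semidefinite"
    return "zero form"                    # a = b = c = 0

print(classify(1, 1, 1))                  # x^2 + 2xy + y^2 = (x + y)^2
print(classify(0, Fraction(1, 2), 0))     # xy, which changes sign
```

The first call reports a positive semidefinite form and the second an indefinite one.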
Rotation of Coordinates
The above properties of quadratic forms are all that we need in the remainder of this chapter. However, we can obtain considerably more information, as well as new insight into the facts already noted, if we ask the simple question
“what happens to a quadratic form under rotation of coordinates?”
Let us consider, then, new coordinates (X, Y) obtained by a rotation through an angle θ about the origin (Fig. 10.1). The relation between the old and the new coordinates of a point is given by
$$x = X\cos\theta - Y\sin\theta, \qquad y = X\sin\theta + Y\cos\theta. \tag{10.11}$$
If we substitute these expressions for x and y, we obtain
$$q(x, y) = ax^2 + 2bxy + cy^2 = AX^2 + 2BXY + CY^2 = Q(X, Y), \tag{10.12}$$
where A, B, C are new constants that depend on a, b, c, and θ. We are not interested in the precise expressions relating A, B, C to a, b, c, but only in the fact that what we obtain is again a quadratic form in the new coordinates. Further, it follows directly from Eq. (10.11) that
$$x^2 + y^2 = X^2 + Y^2, \tag{10.13}$$
FIGURE 10.1 Rotation of axes
so that
expressing the fact that Q(X, Y) and q(x, y) represent exactly the same function, but in different coordinates, so that their maximum and minimum on the unit circle (x² + y² = 1 or X² + Y² = 1) are the same. We may therefore apply all our previous results to the form Q(X, Y), and in particular, from Eq. (10.10), we have
$$AC - B^2 = \lambda_1\lambda_2 = ac - b^2, \qquad A + C = \lambda_1 + \lambda_2 = a + c. \tag{10.15}$$
Equation (10.15) is usually expressed by saying that ac − b² and a + c are invariants of a quadratic form under rotation of coordinates.
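The invariance of ac − b² and a + c can be observed numerically without writing down explicit formulas for A, B, C. The Python sketch below is illustrative only; the name `rotated_coefficients` is hypothetical, and it assumes the usual rotation convention x = X cos θ − Y sin θ, y = X sin θ + Y cos θ. The new coefficients are read off from three values of Q, using Q(1, 0) = A, Q(0, 1) = C, and Q(1, 1) = A + 2B + C.

```python
import math

def rotated_coefficients(a, b, c, theta):
    """Coefficients (A, B, C) of Q(X, Y) = A X^2 + 2B XY + C Y^2, where
    x = X cos(theta) - Y sin(theta) and y = X sin(theta) + Y cos(theta)
    (assumed convention). They are recovered from three values of Q."""
    def q(x, y):
        return a * x * x + 2.0 * b * x * y + c * y * y

    def Q(X, Y):
        x = X * math.cos(theta) - Y * math.sin(theta)
        y = X * math.sin(theta) + Y * math.cos(theta)
        return q(x, y)

    A = Q(1.0, 0.0)
    C = Q(0.0, 1.0)
    B = (Q(1.0, 1.0) - A - C) / 2.0   # because Q(1, 1) = A + 2B + C
    return A, B, C

a, b, c = 3.0, -2.0, 1.0              # arbitrary illustrative coefficients
A, B, C = rotated_coefficients(a, b, c, theta=0.7)
print(A * C - B * B, a * c - b * b)   # the two values agree up to rounding
print(A + C, a + c)                   # likewise
```

Changing `theta` changes A, B, and C individually, but the two printed pairs continue to agree.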
We now make a special rotation that simplifies the expression for Q(X, Y). We suppose that q(x, y) has its maximum on x² + y² = 1 at (x₀, y₀) and we choose the positive X axis to pass through this point. This means that in the new coordinates, the point x = x₀, y = y₀ corresponds to X = 1, Y = 0. Thus,
$$\lambda_1 = q(x_0, y_0) = Q(1, 0) = A$$
by Eq. (10.12). But substituting λ₁ = A in the second of Eqs. (10.15), λ₁ + λ₂ = A + C, yields λ₂ = C. Then λ₁λ₂ = AC, and from the first of Eqs. (10.15), λ₁λ₂ = AC − B², we conclude that B² = 0, or B = 0. We may summarize as follows.
Theorem 10.2 Reduction to Diagonal Form Every quadratic form q(x, y) may be transformed into the normal form
$$Q(X, Y) = \lambda_1 X^2 + \lambda_2 Y^2 \tag{10.16}$$
by a rotation of coordinates. The coefficients λ₁, λ₂ represent the maximum and minimum values of the quadratic form on the unit circle.
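The construction behind Theorem 10.2 can be traced in a few lines. The Python sketch below is an illustration with hypothetical function names, not the author's procedure in code form. It finds a point where the maximum occurs by solving Eqs. (10.4b) with λ = λ₁, turns the positive X axis through that point, and then reads off A, B, C from values of q. It assumes the rotation convention x = X cos θ − Y sin θ, y = X sin θ + Y cos θ.

```python
import math

def q_value(a, b, c, x, y):
    """Evaluate q(x, y) = a x^2 + 2b xy + c y^2."""
    return a * x * x + 2.0 * b * x * y + c * y * y

def diagonalizing_angle(a, b, c):
    """Angle theta of a rotation whose positive X axis passes through a
    point where q attains its maximum on the unit circle."""
    lam1 = (a + c) / 2.0 + math.hypot((a - c) / 2.0, b)   # larger root of Eq. (10.8)
    # Both candidates satisfy Eqs. (10.4b) with lambda = lam1; take the longer
    # one so that rounding error cannot select a spurious direction.
    x0, y0 = max((b, lam1 - a), (lam1 - c, b),
                 key=lambda v: math.hypot(v[0], v[1]))
    if x0 == 0.0 and y0 == 0.0:
        return 0.0        # q is constant on the circle, so any angle works
    return math.atan2(y0, x0)

def normal_form_coefficients(a, b, c):
    """Return (A, B, C) of Q(X, Y) after rotating by diagonalizing_angle."""
    theta = diagonalizing_angle(a, b, c)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    A = q_value(a, b, c, cos_t, sin_t)                      # Q(1, 0)
    C = q_value(a, b, c, -sin_t, cos_t)                     # Q(0, 1)
    Q11 = q_value(a, b, c, cos_t - sin_t, sin_t + cos_t)    # Q(1, 1) = A + 2B + C
    return A, (Q11 - A - C) / 2.0, C

# For q = x^2 + 2xy + y^2 the result is close to (2, 0, 0): B vanishes and
# A, C are the maximum and minimum on the unit circle.
print(normal_form_coefficients(1.0, 1.0, 1.0))
```

Note that only the angle is needed to carry out the rotation; the normal form itself is already determined by λ₁ and λ₂.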
Theorem 10.2 has many important consequences. We list the following.
Corollary 1 A quadratic form q(x, y) is constant on x² + y² = 1 ⇔ q(x, y) = c(x² + y²); that is, a = c, b = 0.
PROOF. Clearly c(x² + y²) is constant for x² + y² = 1. Conversely, suppose q(x, y) ≡ c on x² + y² = 1. Then λ₁ = λ₂ = c, and after rotating coordinates to obtain Eq. (10.16), we have
$$q(x, y) = \lambda_1 X^2 + \lambda_2 Y^2 = c\,(X^2 + Y^2) = c\,(x^2 + y^2),$$
using Eq. (10.13).
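Corollary 2 If q(x, y) is not constant on x² + y² = 1, then the points where q(x, y) assumes its maximum and minimum lie on a pair of perpendicular lines through the origin.
PROOF. If q(x, y) is not constant, then λ₁ > λ₂, and on X² + Y² = 1 the normal form of Eq. (10.16) gives
$$\lambda_1 X^2 + \lambda_2 Y^2 = \lambda_1 - (\lambda_1 - \lambda_2)\,Y^2 = \lambda_2 + (\lambda_1 - \lambda_2)\,X^2.$$
The maximum λ₁ is therefore assumed only where Y = 0, and the minimum λ₂ only where X = 0. Thus the lines through the maximum and minimum points are the X and Y axes.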
Corollary 3 If q(x, y) is positive definite, then the equation q(x, y) = 1 defines an ellipse (or circle). If ac − b² < 0, then q(x, y) = 1 defines a hyperbola.
PROOF. If q(x, y) is positive definite, then λ₁ ≥ λ₂ > 0, and the equation λ₁X² + λ₂Y² = 1 defines an ellipse if λ₁ > λ₂, a circle if λ₁ = λ₂. If ac − b² < 0, then λ₁λ₂ < 0, and we have a hyperbola.
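We examine somewhat more closely the positive definite case. If
$$q(x, y) = ax^2 + 2bxy + cy^2$$
is positive definite, then by rotating coordinates, we have
$$q(x, y) = 1 \iff \lambda_1 X^2 + \lambda_2 Y^2 = 1 \iff \frac{X^2}{m^2} + \frac{Y^2}{n^2} = 1, \tag{10.17}$$
where
$$m = \frac{1}{\sqrt{\lambda_1}}, \qquad n = \frac{1}{\sqrt{\lambda_2}}$$
are the semiminor and semimajor axes, respectively, of the ellipse (Fig. 10.2). Since
$$ac - b^2 = \lambda_1\lambda_2 = \frac{1}{m^2 n^2}, \qquad a + c = \lambda_1 + \lambda_2 = \frac{1}{m^2} + \frac{1}{n^2}, \tag{10.19}$$
we see the geometric significance of Eq. (10.15). These quantities are invariant because they may be expressed in terms of the fundamental dimensions of the ellipse, which have nothing to do with coordinates. In particular, the area inside the ellipse defined by Eq. (10.17) is equal to πmn. We have therefore the following consequence of Eq. (10.19).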
Corollary 4 If the equation ax² + 2bxy + cy² = 1 represents an ellipse, then the area inside the ellipse is equal to π/√(ac − b²).
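As a quick illustration of Corollary 4, the Python sketch below computes the area directly from the coefficients. The function name `ellipse_area` is hypothetical, and the check for positive definiteness uses condition I.f from the text.

```python
import math

def ellipse_area(a, b, c):
    """Area enclosed by a x^2 + 2b xy + c y^2 = 1, by Corollary 4.
    Raises ValueError unless the form is positive definite (condition I.f)."""
    det = a * c - b * b
    if not (det > 0 and a > 0):
        raise ValueError("the curve is not an ellipse")
    return math.pi / math.sqrt(det)

print(ellipse_area(1, 0, 1))   # unit circle: area pi
print(ellipse_area(4, 0, 1))   # semi-axes 1/2 and 1: area pi/2
```

In the second call the semi-axes are 1/2 and 1, so the value agrees with the familiar formula π times the product of the semi-axes.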
FIGURE 10.2 Geometric significance of the quantities λ₁, λ₂ associated with a positive definite quadratic form
We conclude this section with several remarks directed toward those who have some familiarity with matrix theory.
With the quadratic form ax² + 2bxy + cy², one associates the symmetric matrix
$$M = \begin{pmatrix} a & b \\ b & c \end{pmatrix}. \tag{10.20}$$
Its determinant is ac − b², and its trace is a + c. The roots λ₁, λ₂ of Eq. (10.8) are called the characteristic values or eigenvalues of the matrix M. If we substitute these values in Eq. (10.4a) and solve for x₀, y₀, the resulting vectors ⟨x₀, y₀⟩ are called the characteristic vectors or eigenvectors of M. We may summarize the results of this section as follows.
To each symmetric matrix M one may associate a quadratic form q(x, y). (Namely, to the matrix (10.20) one associates the form q(x, y) = ax² + 2bxy + cy².) The matrix M has real characteristic values, which are the maximum and minimum of q(x, y) on x² + y² = 1. The characteristic vectors of M are orthogonal. Under a rotation of coordinates, the same quadratic form is associated with a new symmetric matrix, which has the same determinant and trace as M. One may always choose coordinates so that the new matrix is in diagonal form:
$$\begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}.$$
To do this the new coordinate axes should be chosen in the direction of the characteristic vectors of M. The elements λ₁, λ₂ on the diagonal of the new matrix are precisely the characteristic values of M.
All of these results for 2 × 2 matrices extend to symmetric matrices of any size. They are proved for the general case in books on linear algebra. The geometric interpretation of these facts in the two-dimensional case can be of great value in understanding the general situation in higher dimensions.
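For readers who think in terms of matrices, the summary above can be spot-checked in plain Python. The sketch below is an illustration only; the names `eigenpairs` and `residual` are hypothetical. It computes λ₁ and λ₂ from Eq. (10.8), obtains a characteristic vector for λ₁ from Eqs. (10.4b), takes the perpendicular direction as the second vector, and then measures how far each vector v is from satisfying Mv = λv.

```python
import math

def eigenpairs(a, b, c):
    """Characteristic values and unit characteristic vectors of
    M = [[a, b], [b, c]], as ((lam1, v1), (lam2, v2)) with lam1 >= lam2."""
    half_trace = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    lam1, lam2 = half_trace + radius, half_trace - radius
    # Solutions of Eqs. (10.4b) for lambda = lam1; keep the longer candidate.
    x0, y0 = max((b, lam1 - a), (lam1 - c, b),
                 key=lambda v: math.hypot(v[0], v[1]))
    length = math.hypot(x0, y0)
    v1 = (x0 / length, y0 / length) if length > 0 else (1.0, 0.0)
    v2 = (-v1[1], v1[0])          # v1 turned through a right angle
    return (lam1, v1), (lam2, v2)

def residual(a, b, c, lam, v):
    """Length of M v - lam v; zero up to rounding for a characteristic vector."""
    x, y = v
    return math.hypot(a * x + b * y - lam * x, b * x + c * y - lam * y)

a, b, c = 5.0, 2.0, 2.0           # arbitrary illustrative matrix entries
(lam1, v1), (lam2, v2) = eigenpairs(a, b, c)
print(lam1 + lam2, a + c)         # trace
print(lam1 * lam2, a * c - b * b) # determinant
print(residual(a, b, c, lam1, v1), residual(a, b, c, lam2, v2))  # both near 0
```

That the perpendicular vector v2 also passes the residual check is the orthogonality statement of the summary in concrete form.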
Exercises
10.1 Using Th. 10.1, find the maximum and minimum on x² + y² = 1 of each of the following quadratic forms.
a. 2x² + 2xy + 2y²
b. 2x² − 2xy + 2y²
c. x² − 4xy + y²
d. x² − 4xy + 4y²
e. −3x² + 4xy − 3y²
f. 3x² − 8xy + 3y²
g. −x² + 2xy + y²
h. −4x² − 12xy − 9y²
i. −2x² − 2xy + y²
j. x² − xy + y²
k. x² − 3xy + 4y²
l. 5x² − 3y²
10.2 Verify Eq. (10.10) directly for each part of Ex. 10.1.
10.3 Using Eqs. (10.4a) (or (10.4b)), find the points at which the maximum and minimum occur in Ex. 10.1a, b, c, d.
10.4 For each part of Ex. 10.3 show that:
a. The maximum occurs at a pair of points lying on a straight line through the origin.
b. The minimum occurs at a pair of points lying on a straight line through the origin.
c. These two lines are perpendicular.
10.5 Classify each of the following quadratic forms according to the five categories listed in the text: I, positive definite; II, negative definite; III, positive semidefinite; IV, negative semidefinite; V, takes on both positive and negative values.
a. x² − 2xy + y²
b. x² − 2xy − y²
c. 2x² − 2xy + y²
d. −x² + 3xy − 3y²
e. −x² + 3xy − 2y²
f. −x² + 4xy − 4y²
10.6 Let λ₁ be the maximum and λ₂ the minimum of the quadratic form q(x, y) = ax² + 2bxy + cy² on x² + y² = 1. Find the explicit expressions for λ₁ and λ₂ in terms of a, b, c. (Hint: use Eq. (10.8).) Use these expressions to verify your answers to Ex. 10.1a, b, c, d.
10.7 Using Ex. 10.6, show that λ₁ = λ₂ (that is, q(x, y) is constant on x² + y² = 1) if and only if a = c, b = 0 (that is, q(x, y) = a(x² + y²)).
10.8 By virtue of the first of Eqs. (10.4b), the point on x² + y² = 1 where q(x, y) attains its maximum λ₁ must lie on the straight line (a − λ₁)x + by = 0, while the point where it attains its minimum λ₂ must lie on the straight line (a − λ₂)x + by = 0.
a. Write down the condition for these two lines to be perpendicular.
b. Verify that this condition is satisfied. (Hint: use Eq. (10.10).)
10.9 Let q(x, y) = (αx + βy)².
a. Find the coefficients a, b, c in terms of α, β.
b. Show that q(x, y) is positive semidefinite.
10.10 Let q(x, y) = ax² + 2bxy + cy² be positive semidefinite.
a. Show that the maximum of q(x, y) on x² + y² = 1 is λ₁ = a + c.
b. Show that a ≥ 0 and c ≥ 0. (Note that a = q(1, 0), c = q(0, 1).)
c. Show that q(x, y) = (αx ± βy)², where α = √a, β = √c.
d. What are the level lines of q(x, y)?
e. Sketch the surface z = q(x, y).
10.11 Let q(x, y) be negative semidefinite. What would be the analogous statements for Ex. 10.10a, b, c? Answer Ex. 10.10d, e in this case.
10.12 Prove the equivalence of the statements listed under Conditions V in the text, for quadratic forms that take on both positive and negative values.
10.13 Important use was made in this section of the following fact: if a pair of homogeneous linear equations
$$\alpha x + \beta y = 0, \qquad \gamma x + \delta y = 0$$
has a simultaneous solution (x₀, y₀) different from (0, 0), then the determinant αδ − βγ must be zero. Explain carefully why this is true.
10.14 If q(x, y) = ax² + 2bxy + cy², and if Q(X, Y) = AX² + 2BXY + CY² is the corresponding form obtained by a rotation of coordinates, Eq. (10.11), find the explicit expressions for A, B, and C in terms of a, b, c, and θ.
10.15 Using the expressions in Ex. 10.14, show directly that A + C = a + c.
*10.16 Using the expressions in Ex. 10.14, show directly that AC − B² = ac − b².
10.17 Use the notation and results of Ex. 10.14.
a. Show that B = ½(c − a) sin 2θ + b cos 2θ.
b. Show that the angle θ through which coordinates must be rotated in order to reduce q(x, y) to the normal form of Eq. (10.16) satisfies
$$\cot 2\theta = \frac{a - c}{2b}.$$
c. Show that if ax² + 2bxy + cy² = 1 defines an ellipse, then the angle between its major axis and the x axis is the angle θ of part b.
d. Show that for the ellipse of part c, if a = c, then the major and minor axes lie along the diagonals between the x and y axes.
10.18 Note that in order to write down the normal form Eq. (10.16) of a quadratic form, obtained by a rotation of coordinates, it is not necessary to carry out the rotation of coordinates, or even to know the angle through which they are rotated. The final result can be written down directly as soon as the quantities λ₁ and λ₂ are known. Use this to write down the normal form of each of the following quadratic forms.
a. x² + 4xy + y²
b. x² + 4xy + 4y²
c. x² + xy + y²
d. −3x² + 4xy − 3y²
e. xy
10.19 Describe the curve defined by each of the following equations, and in case it is an ellipse give the area enclosed by it.
a. x² + 9y² = 1
b. 2x² = 1
c. 2x² − 5xy + 3y² = 1
d. 4x² + 4xy + 5y² = 1
e. 2x² − 4xy + 2y² = 1
f. −3x² + 7xy − 5y² = 1
10.20 Describe all the level curves of q(x, y) = ax² + 2bxy + cy², making sure to consider q(x, y) = k for k positive, negative, and zero, where:
a. q(x, y) is positive definite
b. q(x, y) is negative definite
c. q(x, y) is positive semidefinite
d. q(x, y) is negative semidefinite
e. ac − b² < 0
(Hint: it may be easiest to consider a rotation of coordinates that transforms q(x, y) into normal form.)
10.21 Sketch the surface z = q(x, y) for each of the parts of Ex. 10.19.
10.22 Using the methods developed in this section, find the maximum and minimum of x² + y² under the condition 2x² + 2xy + 2y² = 1.
10.23 a. What is the relation between the answers to Ex. 10.22 and to Ex. 10.1a?
b. What is the geometric interpretation of Ex. 10.22?
*10.24 Let q(x, y) be a positive definite quadratic form, and let λ₁, λ₂ be the maximum and minimum of q(x, y) on x² + y² = 1. Show that
a. q(x, y)/(x² + y²) is constant on every ray through the origin.
b.
c.
d.
e. is the length of the semiminor axis of the ellipse q(x, y) = 1.
10.25 a. Show that if q(x, y) is a positive definite quadratic form, then the gradient of q(x, y) vanishes only at the origin.
b. Let ex² + 2fxy + gy² be a positive definite quadratic form. Using the methods of this section, show that the maximum and minimum of ax² + 2bxy + cy² under the condition ex² + 2fxy + gy² = 1 are the roots of the equation
$$\begin{vmatrix} a - \lambda e & b - \lambda f \\ b - \lambda f & c - \lambda g \end{vmatrix} = 0.$$
10.26 Let f(x, y) and g(x, y) be homogeneous functions of degree k ≠ 0. Suppose that f(x, y) has a maximum or minimum under the condition g(x, y) = 1 at a point (x₀, y₀) where ∇g(x₀, y₀) ≠ 0. Then by the Lagrange multiplier method there exists a λ such that
$$\nabla f(x_0, y_0) = \lambda\,\nabla g(x_0, y_0).$$
Show that λ = f(x₀, y₀). (Hint: use Euler’s theorem for homogeneous functions.)
10.27 Find the minimum of cos² t − 2 cos t sin t + 3 sin² t for 0 ≤ t ≤ 2π.
10.28 Show that every quadratic polynomial ax² + 2bxy + cy² + 2dx + 2ey + f can be reduced to the form AX² + BY² + C by a rotation and translation of coordinates in the x, y plane. (Hint: first remove the xy term by a rotation, and then complete the squares.)
10.29 The following approach to quadratic forms yields their principal properties without the use of calculus.
a. Show that every quadratic form
can be expressed in either of the two forms
or
using polar coordinates x = r cos θ, y = r sin θ. Find the constants A, B, C, D explicitly in terms of a, b, c.
b. Using part a, find the maximum and minimum of q(x, y) on x² + y² = 1.
c. Show that if q(x, y) takes on its maximum on x² + y² = 1 for θ = θ₀, then it takes on its minimum for θ = θ₀ + π/2.