Advanced Calculus of Several Variables (1973)
Part II. Multivariable Differential Calculus
Chapter 8. THE CLASSIFICATION OF CRITICAL POINTS
Let f be a real-valued function of class
on an open set
, and let
be a critical point of f. In order to apply Theorem 7.5 to determine the nature of the critical point a, we need to decide the character of the quadratic form
defined by

where the n × n matrix A = (aij) is defined by
, and

Note that the matrix A is symmetric, that is aij = aji, since Di Dj = Dj Di.
We have seen that, in order to determine whether q is positive-definite, negative-definite, or nondefinite, it suffices to determine the maximum and minimum values of q on the unit sphere Sn−1. The following result rephrases this problem in terms of the linear mapping
defined by
![]()
We shall refer to the quadratic form q, the symmetric matrix A, and the linear mapping L as being “associated” with one another.
Theorem 8.1 If the quadratic form q attains its maximum or minimum value on Sn−1 at the point
, then there exists a real number λ such that
![]()
where L is the linear mapping associated with q.
Such a nonzero vector v, which upon application of the linear mapping
is simply multiplied by some real number λ, L(v) = λv, is called an eigenvector of L, with associated eigenvalue λ.
PROOF We apply the Lagrange multiplier method. According to Theorem 5.5, there exists a number
such that
![]()
where
defines the sphere Sn−1.
But Dk g(x) = 2xk, and

where the row vector Ak is the kth row of A, that is, Ak = (ak1 · · · akn). Consequently Eq. (1) implies that
![]()
for each i = 1, . . . , n. But Ak v is the ith row of Av, while λvk is the kth element of the column vector λv, so we have
![]()
as desired.
![]()
Thus the problem of maximizing and minimizing q on Sn−1 involves the eigenvectors and eigenvalues of the associated linear mapping. Actually, it will turn out that if we can determine the eigenvalues without locating the eigenvectors first, then we need not be concerned with the latter.
Lemma 8.2 If
is an eigenvector of the linear mapping
associated with the quadratic form q, then
![]()
where λ is the eigenvalue associated with v.
PROOF

since
.
![]()
The fact that we need only determine the eigenvalues of L (and not the eigenvectors) is a significant advantage, because the following theorem asserts that we therefore need only solve the so-called characteristic equation
![]()
Here
A − λI
denotes the determinant of the matrix A − λI (where I is the n × n identity matrix).
Theorem 8.3 The real number λ is an eigenvalue of the linear mapping
, defined by L(x) = Ax, if and only if λ satisfies the equation
![]()
PROOF Suppose first that λ is an eigenvalue of L, that is,
![]()
for some v ≠ 0. Then (A − λI)v = 0.
Enumerating the rows of the matrix A − λI, we obtain

where e1, . . . , en are the standard unit vectors in
n. Hence the n vectors A1 − λe1, . . . , An − λen all lie in the (n − 1)-dimensional hyperplane through 0 perpendicular to v, and are therefore linearly dependent, so it follows from Theorem I.6.1 that
![]()
as desired.
Conversely, if λ is a solution of
A − λI
= 0, then the row vectors A1 − λe1, . . . , An − λen of A − λI are linearly dependent, and therefore all lie in some (n − 1)-dimensional hyperplane through the origin. If v is a unit vector perpendicular to this hyperplane, then
![]()
Hence (A − λI)v = 0, so
![]()
as desired.
![]()
Corollary 8.4 The maximum (minimum) value attained by the quadratic form q(x) = xtAx on Sn−1 is the largest (smallest) real root of the equation
![]()
PROOF If q(v) is the maximum (minimum) value of q on Sn−1, then v is an eigenvector of L(x) = Ax by Theorem 8.1, and q(v) = λ, the associated eigenvalue, by Lemma 8.2. Theorem 8.3 therefore implies that q(v) is a root of
A − λI
= 0. On the other hand, it follows from 8.2 and 8.3 that every root of
A − λI
= 0 is a value of q on Sn−1. Consequently q(v) must be the largest (smallest) root of
A − λI
= 0.
![]()
Example 1 Suppose a is a critical point of the function f:
and that the quadratic form of f at a is
![]()
so the matrix of q is
![]()
The characteristic equation of A is then

with roots λ = −1, 1, 3. It follows from Corollary 8.4 that the maximum and minimum values of q in S2 are + 3 and − 1 respectively. Since q attains both positive and negative values, it is nondefinite. Hence Theorem 7.5 implies that f has neither a maximum nor a minimum at a.
The above example indicates how Corollary 8.4 reduces the study of the quadratic form q(x) = xtAx to the problem of finding the largest and smallest real roots of the characteristic equation
A − λI
= 0. However this problem itself can be difficult. It is therefore desirable to carry our analysis of quadratic forms somewhat further.
Recall that the matrix A of a quadratic form is symmetric, that is, aij = aji. We say that the linear mapping
is symmetric if and only if L(x) · y = x · L(y) for all x and y in
n. The motivation for this terminology is the following result.
Lemma 8.5 If A is the matrix of the linear mapping
, then L is symmetric if and only if A is symmetric.
PROOF If
, then
![]()
Hence L is symmetric if and only if
![]()
for all i, j = 1, . . . , n. But
![]()
and![]()
The following theorem gives the precise character of a quadratic form q in terms of the eigenvectors and eigenvalues of its associated linear mapping L.
Theorem 8.6 If q is a quadratic form on
n, then there exists an orthogonal set of unit eigenvectors v1, . . . , vn of the associated linear mapping L. If y1, . . . , yn are the coordinates of
with respect to the basis v1, . . . , vn, that is,
![]()
then
![]()
where λ1, . . . , λn are the eigenvalues of v1, . . . , vn.
PROOF This is trivially true if n = 1, for then q is of the form q(x) = λx2, with L(x) = λx. We therefore assume by induction that it is true in dimension n − 1.
If q attains its maximum value on the unit sphere Sn−1 at
, then see Theorem 8.1 says that vn is an eigenvector of L. Fig. 2.40.

Figure 2.40
Let
be the subspace of
n perpendicular to vn. Then L maps
into itself, because, given
, we have

using the symmetry of L (Lemma 8.5).
Let
and
be the “restrictions” of L and q to
, that is,
![]()
If (u1, . . . , un−1) are the coordinates of x with respect to an orthonormal basis for
n−1, then by Lemma 8.5,
![]()
where B is a symmetric (n − 1) × (n − 1) matrix. Then

so q* is a quadratic form in u1, . . . , un−1.
The inductive hypothesis therefore gives an orthogonal set of unit eigenvectors v1, . . . , vn−1 for L*. Then v1, . . . , vn is the desired orthogonal set of unit eigenvectors for L.
To prove the last statement of the theorem, let
. Then

since vi · vj = 0 unless i = j, and vi · vi = 1.
![]()
Equation (2) may be written in the form

The process of finding an orthogonal set of unit eigenvectors, with respect to which the matrix of q is diagonal, is sometimes called diagonalization of the quadratic form q.
Note that, independently of Corollary 8.4, it follows immediately from Eq. (2) that
(a)q is positive-definite if all the eigenvalues are positive,
(b)q is negative-definite if all the eigenvalues are negative, and
(c)q is nondefinite if some are positive and some are negative.
If one or more eigenvalues are zero, while the others are either all positive or all negative, then q is neither positive-definite nor negative-definite nor nondefinite. According to Lemma 8.7 below, this can occur only if the determinant of the matrix of q vanishes.
Example 2 If q(x, y, z) = xy + yz, the matrix of q is

The characteristic equation is then
![]()
with solutions
. By substituting each of these eigenvalues in turn into the matrix equation
![]()
(from the proof of Theorem 8.3), and solving for a unit vector v, we obtain the eigenvectors

If u, v, w are coordinates with respect to coordinate axes determined by v1, v2, v3, that is, x = uv1 + vv2 + wv3, then Eq. (2) gives

Lemma 8.7 If λ1, . . . , λn are eigenvalues of the symmetric n × n matrix A, corresponding to orthogonal unit eigenvectors v1, . . . , vn, then
![]()
PROOF Let

Then, since Avj = λivj, we have
![]()
It follows that
![]()
Since
V
≠ 0 because the vectors v1, . . . , vn are orthogonal and therefore linearly independent, this gives Eq. (3).
![]()
We are finally ready to give a complete analysis of a quadratic form q(x) = xt Ax on
n for which
A
≠ 0. Writing A = (aij) as usual, we denote by Δk the determinant of the upper left-hand k × k submatrix of A, that is,

Thus
![]()
Theorem 8.8 Let q(x) = xt Ax be a quadratic form on
n with
A
≠ 0. Then q is
(a)positive-definite if and only if Δk >0 for each k = 1, . . . , n,
(b)negative-definite if and only if (−1)k Δk >0 for each k = 1, . . . , n,
(c)nondefinite if neither of the previous two conditions is satisfied.
PROOF OF (a) This is obvious if n = 1; assume by induction its truth in dimension n − 1.
Since 0 < Δn =
A
= λ1λ2 · · · λn by Lemma 8.7, all the eigenvalues λ1, . . . , λn of A are nonzero. Either they are all positive, in which case we are finished, or an even number of them are negative. In the latter case, let λi and λj be two negative eigenvalues, corresponding to eigenvectors vi and vj.
Let
be the 2-plane in
n spanned by vi and vj. Then, given
, we have

so q is negative on
minus the origin.
Now let q* :
n − 1
be the restriction of q to
n−1, defined by

where

Since Δ1 >0, . . . , Δn − 1 >0, the inductive assumption implies that q* is positive-definite on
n−1. But this is a contradiction, since the (n − 1)-plane
n−1 and the 2-plane
, on which q is negative, must have at least a line in common. We therefore conclude that q has no negative eigenvalues, and hence is positive-definite.
Conversely, if q is positive-definite, we want to prove that each Δk is positive. If m is the minimum value of q on Sn−1, then m >0.
Let qk :
k
be the restriction of q to
k, that is,

where Ak is the upper left-hand k × k submatrix of A, so Δk =
Ak
, and define Lk :
k
k by Lk(x) = Ak x. Then Lk is symmetric by Lemma 8.5.
Let ¼1, . . . , ¼k be the eigenvalues of Lk given by Theorem 8.6. Since by Lemma 8.2 each ¼i is the value of qk at some point of Sk−1, the unit sphere in
k, and qk(x) = q(x, 0), it follows that each
. Finally Lemma 8.7implies that
![]()
PROOF OF (b) Define the quadratic form q* :
n
by q*(x) = −q(x). Then clearly q is negative-definite if and only if q* is positive-definite. By part (a), q* is positive-definite if and only if each Δk* is positive, where Δk* =
−Ak
, Ak being the upper left-hand k × k submatrix of A. But, since each of the k rows of − Ak is −1 times the corresponding row of Ak, it follows that
![]()
PROOF OF (c) Since Δn = λ1 λ2 · · · λn ≠ 0, all of the eigenvalues λ1, . . . , λn are nonzero. If all of them were positive, then part (a) would imply that Δk >0 for all k, while if all of them were negative, then part (b) would imply that (−1)k Δk >0 for all k. Since neither is the case, we conclude that q has both positive eigenvalues and negative eigenvalues, and is therefore nondefinite.
![]()
Recalling that, if a is a critical point of the class
function f :
, the quadratic form of f at a is
, Theorems 7.5 and 8.8 enable us to determine the behavior of f in a neighborhood of a,provided that the determinant
![]()
often called the Hessian determinant of f at a, is nonzero. But if
Di Djf(a)
= 0, then these theorems do not apply, and ad hoc methods must be applied.
Note that the classical second derivative test for a function of two variables (Theorem 4.4) is an immediate corollary to Theorems 7.5 and 8.8.
It will clarify the picture at this point to generalize the discussion at the end of Section 6. Suppose for the sake of simplicity that the origin is a critical point of the class
function f:
, and that f(0) = 0. Then
![]()
with
![]()
Let v1, . . . , vn be an orthogonal set of unit eigenvectors (provided by Theorem 8.6), with associated eigenvalues λ1, . . . , λn. Assume that
Di Djf(0)
≠ 0, so Lemma 8.7 implies that each λi is nonzero. Let λ1, . . . , λk be the positive eigenvalues, and λk+1, . . . , λn the negative ones (writing k = 0 if all the eigenvalues are negative). Since
![]()
if x = y1v1 + · · · + yn vn, it is clear that q has a minimum at 0 if k = n and a maximum if k = 0.
If
, so that q has both positive and negative eigenvalues, denote by
the subspace of
spanned by v1, . . . , vk, and by
the subspace of
n spanned by vk+1, . . . , vn. Then the graph of z = q(x) in
and
looks like Fig. 2.41; q has a minimum on
and a maximum on
.
Given
< mini
λi
, choose δ >0 such that
![]()
or
![]()
since
. This inequality and Eq. (4) yield
![]()
Using (5), we obtain


Figure 2.41
where ¼i = −λi >0, i = k + 1, . . . , n. Since λi −
>0 and ¼i −
>0 because
< min
λi
, inequalities (6) show that, for
x
< δ, the graph of z = f(x) lies between two very close quadratic surfaces (hyperboloids) of the same type as z = q(x). In particular, if
, then f has a local minimum on the subspace
and a local maximum on the subspace
. This makes it clear that the origin is a “saddle point” for f if q has both positive and negative eigenvalues. Thus the general case is just like the 2-dimensional case except that, when there is a saddle point, the subspace on which the critical point is a maximum point may have any dimension from 1 to n − 1.
It should be emphasized that, if the Hessian determinant
DiDjf(a)
is zero, then the above results yield no information about the critical point a. This will be the situation if the quadratic form q of f at a is either positive-semidefinite but not positive-definite, or negative-semidefinite but not negative-definite. The quadratic form q is called positive-semidefinite if
for all x, and negative-semidefinite if
for all x. Notice that the terminology “q is nondefinite,” which we have been using, actually means that q is neither positive-semidefinite nor negative-semidefinite (so we might more descriptively have said “ nonsemidefinite”).
Example 3 Let f(x, y) = x2 − 2xy + y2 + x4 + y4, and g(x, y) = x2 − 2xy + y2 − x4 − y4. Then the quadratic form for both at the critical point (0, 0) is
![]()
which is positive-semidefinite but not positive-definite. Since
![]()
f has a relative minimum at (0, 0). However, since
![]()
it is clear that g has a saddle point at (0, 0).
Thus, if the Hessian determinant vanishes, then anything can happen.
We conclude this section with discussion of a “second derivative test” for constrained (Lagrange multiplier) maximum–minimum problems. We first recall the setting for the general Lagrange multiplier problem: We have a differentiable function f:
and a continuously differentiable mapping G :
m (m < n), and denote by M the set of all those points
at which the gradient vectors ∇G1(x), . . . , ∇Gm(x), of the component functions of G, are linearly independent. By Theorem 5.7, M is an (n − m)-dimensional manifold in
n, and by Theorem 5.6 has an (n − m)-dimensional tangent space Tx at each point
. Recall that, by definition, Tx is the subspace of
n consisting of all velocity vectors (at x) to differentiable curves on M which pass through x. This notation will be maintained throughout the present discussion.
We are interested in locating and classifying those points of M at which the function f attains its local maxima and minima on M. According to see Theorem 5.1, in order for f to have a local maximum or minimum on M at
, it is necessary that a be a critical point for f on M in the following sense. The point
is called a critical point for f on M if and only if the gradient vector ∇f(a) is orthogonal to the tangent space Ta of M at a. Since the linearly independent gradient vectors ∇G1(a), . . . , ∇Gm(a) generate the orthogonal complement to Ta (Fig. 2.42), it follows as in Theorem 5.8 that there exist unique numbers λ1, . . . , λm such that
![]()

Figure 2.42
We will obtain sufficient conditions for f to have a local extremum at a by considering the auxiliary function H :
for f at a defined by
![]()
Notice that (7) simply asserts that ∇H(a) = 0, so a is an (ordinary) critical point for H. We are, in particular, interested in the quadratic form
![]()
of H at a.
Theorem 8.9 Let a be a critical point for f on M, and denote by q :
the quadratic form at a of the auxiliary function
, as above. If f and G are both of class
in a neighborhood of a, then f has
(a)a local minimum on M at a if q is positive-definite on the tangent space Ta to M at a,
(b)a local maximum on M at a if q is negative-definite on Ta,
(c)neither if q is nondefinite on Ta.
The statement, that “q is positive-definite on Ta,” naturally means that q(h) >0 for all nonzero vectors
; similarly for the other two cases.
PROOF The proof of each part of this theorem is an extension of the proof of the corresponding part of Theorem 7.5. We give the details only for part (a).
We start with the Taylor expansion
![]()
for the auxiliary function H. Since
, it clearly suffices to find a δ >0 such that

if![]()
Let m be the minimum value attained by q on the unit sphere
in the tangent plane Ta. Then m >0 because q is positive-definite on Ta. Noting that

we choose δ >0 so small that

and also so small that 0 <
h
< δ and
together imply that h/
h
is sufficiently near to S* that
![]()
Then (9) follows as desired. The latter condition on δ can be satisfied because S* is (by definition) the set of all tangent vectors at a to unit-speed curves on M which pass through a. Therefore, if
and h is sufficiently small, it follows that h/
h
is close to a vector in S*.
![]()
The following two examples provide illustrative applications of Theorem 8.9.
Example 4 In Example 4 of Section 5 we wanted to find the box with volume 1000 having least total surface area, and this involved minimizing the function
![]()
on the 2-manifold
defined by the constraint equation
![]()
We found the single critical point a = (10, 10, 10) for f on M, with
. The auxiliary function is then defined by
![]()
We find by routine computation that the matrix of second partial derivatives of h at a is

so the quadratic form of h at a is
![]()
It is clear that q is nondefinite on
3. However it is the behavior of q on the tangent plane Ta that interests us.
Since the gradient vector ∇g(a) = (100, 100, 100) is orthogonal to M at a = (10, 10, 10), the tangent plane Ta is generated by the vectors
![]()
Given
, we may therefore write
![]()
and then find by substitution into (10) that
![]()
Since the quadratic form s2 + st + t2 is positive-definite (by the ac − b2 test), it follows that q is positive-definite on Ta. Therefore Theorem 8.9 assures us that f does indeed have a local minimum on M at the critical point a = (10, 10, 10).
Example 5 In Example 9 of Section 5 we sought the minimum distance between the circle x2 + y2 = 1 and the straight line x + y = 4. This involved minimizing the function
![]()
on the 2-manifold
defined by the constraint equations

We found the geometrically obvious critical points (see Fig. 2.43),
with
It is obvious that f has a minimum at a, and neither a maximum nor a minimum at b, but we want to verify this using Theorem 8.9.
The auxiliary function H is defined by
![]()

Figure 2.43
By routine computation, we find that the matrix of second partial derivatives of H is

Substituting
, we find that the quadratic form of H at the critical point
is
![]()
Computing the subdeterminants Δ1, Δ2, Δ3, Δ4 of Theorem 8.8, we find that all are positive, so this quadratic form q is positive-definite on
4, and hence on the tangent plane Ta (this could also be verified by the method of the previous example). Hence Theorem 8.9 implies that f does indeed attain a local minimum at a.
Substituting
, we find that the quadratic form of H at the other critical point
is
![]()
Now the vectors v1 = (1, −1, 0, 0) and v2 = (0, 0, 1, −1) obviously lie in the tangent plane Tb. Since
![]()
we see that q is nondefinite on Tb, so Theorem 8.9 implies that f does not have a local extremum at b.
Exercises
8.1This exercise gives an alternative proof of Theorem 8.8 in the 3-dimensional case. If

where A = (aij) is a symmetric 3 × 3 matrix with Δ1 ≠ 0, Δ2 ≠ 0, show that

Conclude that q is positive-definite if Δ1, Δ2, Δ3 are all positive, while q is negative-definite if Δ1 < 0, Δ2 >0, Δ3 < 0.
8.2Show that q(x, y, z) = 2x2 + 5y2 + 2z2 + 2xz is positive-definite by
(a)applying the previous exercise,
(b)solving the characteristic equation.
8.3Use the method of Example 2 to diagonalize the quadratic form of Exercise 8.2, and find its eigenvectors. Then sketch the graph of the equation 2x2 + 5y2 + 2z2 + 2xz = 1.
8.4This problem deals with the function
defined by
![]()
Note that 0 = (0, 0, 0) is a critical point of f.
(a)Show that q(x, y, z) = 2x2 +5y2 + 2z2 + 2xz is the quadratic form of f at 0 by substituting the expansion
, collecting all second degree terms, and verifying the appropriate condition on the remainder
(x, y, z). State the uniqueness theorem which you apply.
(b)Write down the symmetric 3 × 3 matrix A such that

By calculating determinants of appropriate submatrices of A, determine the behavior of f at 0.
(c)Find the eigenvalues λ1, λ2, λ3 of q. Let v1, v2, v3 be the eigenvectors corresponding to λ1, λ2, λ3 (do not solve for them). Express q in the uvw-coordinate system determined by v1, v2, v3. That is, write q(uv1 + uv2 + wv3) in terms of u, v, w. Then give a geometric description (or sketch) of the surface
![]()
8.5Suppose you want to design an ice-cube tray of minimal cost, which will make one dozen ice “cubes” (not necessarily actually cubes) from 12 in.3 of water. Assume that the tray is divided into 12 square compartments in a 2 × 6 pattern as shown in Fig. 2.44, and that

Figure 2.44
the material used costs 1 cent per square inch. Use the Lagrange multiplier method to minimize the cost function f(x, y, z) = xy + 3xz + 7yz subject to the constraints x = 3y and xyz = 12. Apply Theorem 8.9 to verify that you do have a local minimum.
8.6Apply Theorem 8.8 to show that the quadratic form
is nondefinite. Solve the characteristic equation for its eigenvalues, and then find an ortho-normal set of eigenvectors.
8.7Let f(x) = xt Ax be a quadratic form on
n (with A symmetric), and let v1, . . . , vn be the orthonormal eigenvectors given by Theorem 8.6. If P = (v1, . . . , vn) is the n × n matrix having these eigenvectors as its column vectors, show that

Hint: Let x = Py = y1v1 + · · · + yn vn. Then
![]()
by the definition of f, while
![]()
by Theorem 8.6. Writing B = PtAP, we have
![]()
for all y = (y1, . . . , yn). Using the obvious symmetry of B prove that bii = λi, while bij = 0 if i ≠ j. Finally note that P−1 = Pt by Exercise 6.11 of Chapter I.