THE EXISTENCE OF AN EXTREMUM. DIRICHLET’S PRINCIPLE - MAXIMA AND MINIMA - What Is Mathematics? An Elementary Approach to Ideas and Methods, 2nd Edition (1996)

CHAPTER VII. MAXIMA AND MINIMA

§7. THE EXISTENCE OF AN EXTREMUM. DIRICHLET’S PRINCIPLE

1. General Remarks

In some of the previous extremum problems the solution is directly demonstrated to give a better result than any of its competitors. A striking instance is Schwarz’s solution of the triangle problem, where we could see at once that no inscribed triangle has a perimeter smaller than that of the altitude triangle. Other examples are the minimum or maximum problems whose solutions depend on an explicit inequality, such as that between the arithmetical and geometrical means. But in some of our problems we followed a different path. We began with the assumption that a solution had been found; then we analyzed this assumption and drew conclusions which eventually permitted a description and construction of the solution. This was the case, for example, with the solution of Steiner’s problem and with the second treatment of Schwarz’s problem. The two methods are logically different. The first one is, in a way, more perfect, since it gives a more or less constructive demonstration of the solution. The second method, as we saw in the case of the triangle problem, is likely to be simpler. But it is not so direct, and it is, above all, conditional in its structure, for it starts with the assumption that a solution to the problem exists. It gives the solution only provided that this is granted or proved. Without this assumption it merely shows that if a solution exists, then it must have a certain character.

Because of the apparent obviousness of the premise that a solution exists, mathematicians until late in the nineteenth century paid no attention to the logical point involved, and assumed the existence of a solution to extremum problems as a matter of course. Some of the greatest mathematicians of the nineteenth century, among them Gauss, Dirichlet, and Riemann, used this assumption indiscriminately as the basis for deep and otherwise hardly accessible theorems in mathematical physics and the theory of functions. The climax came when, in 1851, Riemann published his doctoral thesis on the foundations of the theory of functions of a complex variable. This concisely written paper, one of the great pioneering achievements of modern mathematics, was so completely unorthodox in its approach to the subject that many people would have liked to ignore it. Weierstrass was then the foremost mathematician at the University of Berlin and the acknowledged leader in the building of a rigorous function theory. Impressed but somewhat doubtful, he soon discovered a logical gap in the paper which the author had not bothered to fill. Weierstrass’ shattering criticism, though it did not disturb Riemann, resulted at first in an almost general neglect of his theory. Riemann’s meteoric career came to a sudden end after a few years with his death from consumption. But his ideas always found some enthusiastic disciples, and fifty years after the publication of his thesis Hilbert finally succeeded in opening the way for a complete answer to the questions that he had left unsettled. This whole development in mathematics and mathematical physics became one of the great triumphs in the history of modern mathematical analysis.

In Riemann’s paper the point open to critical attack is the question of the existence of a minimum. Riemann based much of his theory on what he called Dirichlet’s principle. (Dirichlet had been Riemann’s teacher at Goettingen, and had lectured but never written about this principle.) Let us suppose, for example, that part of a plane or of any surface is covered with tinfoil and that a stationary electric current is set up in the layer of tinfoil by connecting it at two points with the poles of an electric battery. There is no doubt that the physical experiment leads to a definite result. But how about the corresponding mathematical problem, which is of the utmost importance in function theory and other fields? According to the theory of electricity, the physical phenomenon is described by a “boundary value problem of a partial differential equation”. It is this mathematical problem that concerns us; its solvability is made plausible by its assumed equivalence to a physical phenomenon but is by no means mathematically proved by this argument. Riemann disposed of the mathematical question in two steps. First he showed that the problem is equivalent to a minimum problem: a certain quantity expressing the energy of the electric flow is minimized by the actual flow in comparison to the other flows possible under the prescribed conditions. Then he stated as “Dirichlet’s principle” that such a minimum problem has a solution. Riemann took not the slightest step towards a mathematical proof of the second assertion, and this was the point attacked by Weierstrass. Not only was the existence of the minimum not at all evident, but, as it turned out, it was an extremely delicate question for which the mathematics of that time was not yet prepared and which was finally settled only after many decades of intensive research.

2. Examples

We shall illustrate the sort of difficulty involved by two examples.

1) We mark two points A and B at a distance d on a straight line L, and ask for the polygon of shortest length that starts at A in a direction perpendicular to L and ends at B. Since the straight segment AB is the shortest connection between A and B for all paths, we can be certain that any path admissible in the competition has a length greater than d, for the only path giving the value d is the straight segment AB, which violates the restriction imposed on the direction at A, and hence is not admissible under the terms of the problem. On the other hand, consider the admissible path AOB in Figure 222. If we replace O by a point O′ near enough to A, we can obtain an admissible path with a length differing as little from d as we like; hence if a shortest admissible path exists, it cannot have a length exceeding d and must therefore have the exact length d. But the only path of that length is not admissible, as we saw. Hence there can exist no shortest admissible path, and the proposed minimum problem has no solution.

Fig. 222.
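The argument of the first example can be checked numerically. The following is a minimal sketch under an assumed coordinate setup that is not given in the text: A = (0, 0), B = (d, 0), and O′ = (0, h) on the perpendicular to L at A. The function name path_length and the sample values of h are illustrative only.

```python
import math

def path_length(d: float, h: float) -> float:
    """Length of the polygon A -> O' -> B with A=(0,0), O'=(0,h), B=(d,0).

    Assumed setup: L is the x-axis, so the first leg A -> O' is
    perpendicular to L and has length h.
    """
    if h <= 0:
        # h = 0 would give the straight segment AB, which is not admissible.
        raise ValueError("h must be positive")
    return h + math.hypot(d, h)

d = 1.0
for h in (1.0, 0.1, 0.01, 0.001):
    length = path_length(d, h)
    print(f"h = {h:<6} length = {length:.7f}  excess over d = {length - d:.7f}")
```

Every admissible length is greater than d, and the excess shrinks with h, so the lengths have d as greatest lower bound without ever attaining it.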

2) As in Figure 223, let C be a circle and S a point at a distance 1 above its center. Consider the class of all surfaces bounded by C that go through the point S and lie above C in such a way that no two different points have the same vertical projection on the plane of C. Which of these surfaces has the least area? This problem, natural as it appears, has no solution: there is no admissible surface with a minimum area. If the condition that the surface go through S had not been prescribed, the solution would obviously be the plane circular disk bounded by C. Let us denote its area by A. Any other surface bounded by C must have an area larger than A. But we can find an admissible surface whose area exceeds A by as little as we please. For this purpose we take a conical surface of height 1 and so slender that its area is less than whatever margin may have been assigned. We place this cone on top of the disk with its vertex at S, and consider the total surface formed by the surface of the cone and the part of the disk outside the base of the cone. It is immediately clear that this surface, which deviates from the plane only near the center, has an area exceeding A by less than the given margin. Since this margin can be chosen as small as we like, it follows again that the minimum, if it exists, cannot be other than the area A of the disk. But among all the surfaces bounded by C only the disk itself has this area, and since the disk does not go through S it violates the conditions for admissibility. As a consequence, the problem has no solution.

Fig. 223.
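The area estimate in the second example can be made explicit. The sketch below assumes that C has radius 1 (the text fixes only the height of S) and that the cone has base radius r and height 1; the name excess_area is illustrative. The cone's lateral area is π r √(r² + 1), and it replaces a disk of area π r², so the admissible surface exceeds A by the difference of these two quantities.

```python
import math

def excess_area(r: float, height: float = 1.0) -> float:
    """Area of the 'disk with a cone on top' minus the area A of the disk.

    r is the base radius of the cone (assumed 0 < r < radius of C),
    height is the distance of S above the center of C.
    """
    cone_lateral = math.pi * r * math.hypot(r, height)  # slant height = sqrt(r^2 + h^2)
    removed_disk = math.pi * r * r                       # part of the disk under the cone
    return cone_lateral - removed_disk

for r in (0.5, 0.1, 0.01, 0.001):
    print(f"r = {r:<6} excess over A = {excess_area(r):.7f}")
```

The excess is positive for every r > 0 and tends to zero with r, which is the statement that A is approached but not attained.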

We can dispense with the more sophisticated examples given by Weierstrass. The two just considered show well enough that the existence of a minimum is not a trivial part of a mathematical proof. Let us put the matter in more general and abstract terms. Consider a definite class of objects, e.g. of curves or surfaces, to each of which is attached as a function of the object a certain number, e.g. length or area. If there is only a finite number of objects in the class, there must obviously be a largest and a smallest among the corresponding numbers. But if there are infinitely many objects in the class, there need be neither a largest nor a smallest number, even if all these numbers are contained between fixed bounds. In general, these numbers will form an infinite set of points on the number axis. Let us suppose, for simplicity, that all the numbers are positive. Then the set has a “greatest lower bound”, that is, a point α below which no number of the set lies, and which is either itself an element of the set or is approached with any degree of accuracy by members of the set. If α belongs to the set, it is the smallest element; otherwise the set simply does not contain a smallest element. For example, the set of numbers 1, 1/2, 1/3, ... contains no smallest element, since the lower bound, 0, does not belong to the set. These examples illustrate in an abstract way the logical difficulties connected with the existence problem. The mathematical solution of a minimum problem is not complete until one has provided, explicitly or implicitly, a proof that the set of values associated with the problem contains a smallest element.
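The contrast between finite and infinite classes can be illustrated with the set 1, 1/2, 1/3, ... itself. This is a small sketch using exact fractions; the function name smallest_of_first is hypothetical. Any finite part of the set has a smallest element, yet these smallest elements keep decreasing towards the lower bound 0, which is never reached.

```python
from fractions import Fraction

def smallest_of_first(n: int) -> Fraction:
    """Smallest element of the finite set {1, 1/2, ..., 1/n}."""
    return min(Fraction(1, k) for k in range(1, n + 1))

for n in (1, 10, 100, 1000):
    m = smallest_of_first(n)
    # Each finite set has a minimum, and that minimum is still positive.
    print(f"first {n:>4} elements: smallest = {m}, positive: {m > 0}")
```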

3. Elementary Extremum Problems

In elementary problems it requires only an attentive analysis of the basic concepts involved to settle the question of the existence of a solution. In Chapter VI, §5 the general notion of a compact set was discussed; it was stated that a continuous function defined for the elements of a compact set always assumes a largest and a smallest value somewhere in the set. In each of the elementary problems previously discussed, the competing values can be regarded as the values of a function of one or several variables in a domain that is either compact or can easily be made so without essential change in the problem. In such a case the existence of a maximum and a minimum is assured. In Steiner’s problem, for example, the quantity under consideration is the sum of three distances, and this depends continuously on the position of the movable point. Since the domain of this point is the whole plane, nothing is lost if we enclose the figure in a large circle and restrict the point to its interior and boundary. For as soon as the movable point is sufficiently far away from the three given points, the sum of its distances to these points will certainly exceed AB + AC, which is one of the admissible values of the function. Hence if there is a minimum for a point restricted to a large circle, this will also be the minimum for the unrestricted problem. But it is easy to show that the domain consisting of a circle plus its interior is compact, hence a minimum for Steiner’s problem exists.
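The restriction to a large circle can be tried out numerically. The sketch below uses three hypothetical points A, B, C and the radius R = AB + AC around A, chosen so that every point outside the circle already gives a sum larger than AB + AC. A coarse grid search then approximates the minimum inside the circle. It only illustrates the argument; the existence of the minimum rests on compactness, not on the search.

```python
import math

# Hypothetical positions of the three given points.
A, B, C = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)

def total_distance(p):
    """Sum of the distances from p to A, B and C."""
    return sum(math.dist(p, q) for q in (A, B, C))

# Outside the circle of radius R about A we have PA > R = AB + AC,
# so the sum there exceeds the admissible value total_distance(A).
R = math.dist(A, B) + math.dist(A, C)

def grid_minimum(steps: int = 100):
    """Approximate minimizer of total_distance over the closed disk of radius R."""
    best_point, best_value = A, total_distance(A)
    for i in range(-steps, steps + 1):
        for j in range(-steps, steps + 1):
            p = (A[0] + R * i / steps, A[1] + R * j / steps)
            if math.dist(p, A) <= R:
                value = total_distance(p)
                if value < best_value:
                    best_point, best_value = p, value
    return best_point, best_value

point, value = grid_minimum()
print("approximate minimizer:", point)
print("approximate minimum  :", value)
print("value at A (AB + AC) :", total_distance(A))
```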

The importance of the assumption that the domain of the independent variable is compact can be shown by the following example. Given two closed curves C1 and C2, there always exist two points P1, P2 on C1, C2 respectively, which have the least possible distance from each other, and points Q1, Q2 which have the largest possible distance. For the distance between a point A1 on C1 and a point A2 on C2 is a continuous function on the compact set consisting of the pairs A1, A2 of points under consideration. However, if the two curves are not bounded but extend to infinity, then the problem may not have a solution. In the case shown in Figure 224 neither a smallest nor a largest distance between the curves is attained; the lower bound for the distance is zero, the upper bound is infinity, and neither is attained. In some cases a minimum but no maximum exists. For the case of two branches of a hyperbola (Fig. 17, p. 76) only a minimum distance is attained, by A and A′, since obviously no two points exist with a maximum distance apart.

Fig. 224. Curves between which there is no longest or shortest distance.

We can account for this difference in behavior by artificially restricting the domain of the variables. Select an arbitrary positive number R, and restrict x by the condition |x| ≤ R. Then both a maximum and a minimum exist for each of the last two problems. In the first one, restricting the boundary in this way assures the existence of a maximum and a minimum distance, both of which are attained on the boundary. If R is increased, the points for which the extrema are attained are again on the boundary. Hence as R increases, these points disappear towards infinity. In the second case, the minimum distance is attained in the interior, and no matter how much R is increased the two points of minimum distance remain the same.
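The behavior in the second case can be observed on a concrete hyperbola. The sketch below assumes the hyperbola x² − y² = 1, which is not specified in the text, samples both branches subject to |x| ≤ R, and compares all pairs of sample points. The function names and the sample count are illustrative.

```python
import math

def branch_points(R: float, sign: int, n: int = 101):
    """Sample points of the branch x = sign * sqrt(1 + y^2) with |x| <= R (R >= 1)."""
    y_max = math.sqrt(R * R - 1.0)
    points = []
    for k in range(n):
        y = y_max * (2 * k / (n - 1) - 1)   # runs from -y_max to +y_max
        points.append((sign * math.sqrt(1.0 + y * y), y))
    return points

def extreme_distances(R: float, n: int = 101):
    """Smallest and largest distance between sampled points of the two branches."""
    right = branch_points(R, +1, n)
    left = branch_points(R, -1, n)
    distances = [math.dist(p, q) for p in right for q in left]
    return min(distances), max(distances)

for R in (2.0, 10.0, 100.0):
    d_min, d_max = extreme_distances(R)
    print(f"R = {R:<6} minimum = {d_min:.4f}  maximum = {d_max:.4f}")
```

In this sketch the minimum is found at the two vertices for every R, while the maximum is found at the ends of the restricted branches and grows without bound as R increases.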

4. Difficulties in Higher Cases

While the existence question is not at all serious in the elementary problems involving one, two, or any finite number of independent variables, it is quite different with Dirichlet’s principle or with even simpler problems of a similar type. The reason in these cases is either that the domain of the independent variable fails to be compact, or that the function fails to be continuous. In the first example of Article 2 we have a sequence of paths AO′B where O′ tends to the point A. Each path of the sequence satisfies the conditions of admissibility. But the paths AO′B tend to the straight segment AB and this limit is no longer in the admitted set. The set of admissible paths is in this respect like the interval 0 < x ≤ 1 for which Weierstrass’ theorem on extreme values does not hold (see p. 314). In the second example we find a similar situation: if the cones become thinner and thinner, then the sequence of the corresponding admissible surfaces will tend to the disk plus a vertical straight line reaching to S. This limiting geometrical entity, however, is not among the admissible surfaces, and again it is true that the set of admissible surfaces is not compact.

As an example of non-continuous dependence we may consider the length of a curve. This length is no longer a function of a finite number of numerical variables, since a whole curve cannot be characterized by a finite number of “coordinates,” and it is not a continuous function of the curve. To see this let us join two points A and B at a distance d by a zigzag polygon Pn which together with the segment AB forms n equilateral triangles. It is clear from Figure 225 that the total length of Pn will be exactly 2d for every value of n. Now consider the sequence of polygons P1, P2, .... The single waves of these polygons decrease in height as they increase in number, and it is clear that polygon Pn tends to the straight line AB, where, in the limit, the roughness has disappeared completely. The length of Pn is always 2d, regardless of the index n, while the length of the limiting curve, the straight segment, is only d. Hence the length does not depend continuously on the curve.

Fig. 225. Approximation to a segment by polygons of twice its length.
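The zigzag construction is easy to reproduce. The sketch below assumes A = (0, 0) and B = (d, 0), builds the vertices of Pn from n equilateral triangles of side d/n, and measures both the length of the polygon and its greatest height above AB. The function names are illustrative.

```python
import math

def zigzag_vertices(d: float, n: int):
    """Vertices of Pn: n equilateral triangles of side d/n standing on AB."""
    side = d / n
    height = side * math.sqrt(3) / 2
    vertices = [(0.0, 0.0)]
    for k in range(n):
        vertices.append(((k + 0.5) * side, height))  # apex of the k-th triangle
        vertices.append(((k + 1) * side, 0.0))       # back down to the segment AB
    return vertices

def polygon_length(vertices) -> float:
    """Sum of the lengths of consecutive edges."""
    return sum(math.dist(p, q) for p, q in zip(vertices, vertices[1:]))

def max_height(vertices) -> float:
    """Largest distance of a vertex from the line AB (the x-axis)."""
    return max(y for _, y in vertices)

d = 1.0
for n in (1, 10, 100, 1000):
    v = zigzag_vertices(d, n)
    print(f"n = {n:<5} length = {polygon_length(v):.6f}  max height = {max_height(v):.6f}")
```

The height tends to zero, so the polygons approach the segment, while the computed length stays at 2d up to rounding.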

All these examples confirm the fact that caution as to the existence of a solution is really necessary in minimum problems of a more complex structure.