TWO FUNDAMENTAL THEOREMS ON CONTINUOUS FUNCTIONS - FUNCTIONS AND LIMITS - What Is Mathematics? An Elementary Approach to Ideas and Methods, 2nd Edition (1996)

CHAPTER VI. FUNCTIONS AND LIMITS

§5. TWO FUNDAMENTAL THEOREMS ON CONTINUOUS FUNCTIONS

1. Bolzano’s Theorem

Bernard Bolzano (1781-1848), a Catholic priest trained in scholastic philosophy, was one of the first to introduce the modern concept of rigor into mathematical analysis. His important booklet, Paradoxien des Unendlichen, appeared in 1850. Here for the first time it was recognized that many apparently obvious statements concerning continuous functions can and must be proved if they are to be used in full generality. The following theorem on continuous functions of one variable is an example.

A continuous function of a variable x which is positive for some value of x and negative for some other value of x in a closed interval a ≤ x ≤ b of continuity must have the value zero for some intermediate value of x. Thus, if f(x) is continuous as x varies from a to b, while f(a) < 0 and f(b) > 0, then there will exist a value α of x such that a < α < b and f(α) = 0.

Bolzano’s theorem corresponds perfectly with our intuitive idea of a continuous curve, which, in order to get from a point below the x-axis to a point above, must somewhere cross the axis. That this need not be true of discontinuous functions is shown by Figure 157 on page 284.

*2. Proof of Bolzano’s Theorem

A rigorous proof of this theorem will be given. (Like Gauss and other great mathematicians, one may accept and use the fact without proof.) Our objective is to reduce the theorem to fundamental properties of the real number system, in particular to the Dedekind-Cantor postulate concerning nested intervals (p. 68). To this end we consider the interval I, a ≤ x ≤ b, in which the function f(x) is defined, and bisect it by marking the mid-point, x_1 = (a + b)/2. If at this mid-point we find that f(x_1) = 0, then there remains nothing further to prove. If, however, f(x_1) ≠ 0, then f(x_1) must be either greater than or less than zero. In either case one of the halves of I will again have the property that the sign of f(x) is different at its two extremes. Let us call this interval I_1. We continue the process by bisecting I_1; then either f(x) = 0 at the midpoint of I_1, or we can choose an interval I_2, half of I_1, with the property that the sign of f(x) is different at its two extremes. Repeating this procedure, either we shall find after a finite number of bisections a point for which f(x) = 0, or we shall obtain a sequence of nested intervals I_1, I_2, I_3, · · ·. In the latter case, the Dedekind-Cantor postulate assures the existence of a point α in I common to all these intervals. We assert that f(α) = 0, so that α is the point whose existence proves the theorem.

Fig. 172. Bolzano’s theorem.

So far the assumption of continuity has not been used. It now serves to clinch the argument by a bit of indirect reasoning. We shall prove that f(α) = 0 by assuming the contrary and deducing a contradiction. Suppose that f(α) ≠ 0, e.g. that f(α) = 2ε > 0. Since f(x) is continuous, we can find a (perhaps very small) interval J of length 2δ with α as midpoint, such that the value of f(x) everywhere in J differs from f(α) by less than ε. Hence, since f(α) = 2ε, we can be sure that f(x) > ε everywhere in J, so that f(x) > 0 in J. But the interval J is fixed, and if n is sufficiently large the little interval I_n must necessarily fall within J, since the lengths of the intervals I_n shrink to zero. This yields the contradiction; for it follows from the way I_n was chosen that the function f(x) has opposite signs at the two endpoints of every I_n, so that f(x) must have negative values somewhere in J. Thus the absurdity of f(α) > 0 and (in the same way) of f(α) < 0 proves that f(α) = 0.
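The bisection argument can be imitated numerically. The following Python sketch is an illustration only and not part of the proof: the name bisect_sign_change and the default number of steps are assumptions introduced here, and floating-point arithmetic stands in for the real numbers. It therefore returns a small interval on which f still changes sign, not the point α itself.

```python
from typing import Callable, Tuple


def bisect_sign_change(
    f: Callable[[float], float],
    a: float,
    b: float,
    steps: int = 50,
) -> Tuple[float, float]:
    """Return a sub-interval (lo, hi) of [a, b] on which f changes sign.

    Mirrors the proof: halve the interval, keep the half whose endpoints
    give f opposite signs, and stop early if a midpoint is an exact zero.
    (Hypothetical helper written for illustration.)
    """
    f_a, f_b = f(a), f(b)
    if f_a == 0:
        return a, a
    if f_b == 0:
        return b, b
    if (f_a > 0) == (f_b > 0):
        raise ValueError("f(a) and f(b) must have opposite signs")

    lo, hi, f_lo = a, b, f_a
    for _ in range(steps):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if f_mid == 0:
            return mid, mid          # exact zero found after finitely many steps
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid    # the sign change lies in the right half
        else:
            hi = mid                 # the sign change lies in the left half
    return lo, hi


# Example: f(x) = x^2 - 2 is negative at 1 and positive at 2.
lo, hi = bisect_sign_change(lambda x: x * x - 2, 1.0, 2.0)
print(lo, hi)
```

Each pass through the loop corresponds to the choice of the next nested interval; the sketch assumes that the sign of f can be evaluated reliably at every midpoint.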

3. Weierstrass’ Theorem on Extreme Values

Another important and intuitively plausible fact about continuous functions was formulated by Karl Weierstrass (1815-1897), who, perhaps more than anyone else, was responsible for the modern trend towards rigor in mathematical analysis. This theorem states: If a function f(x) is continuous in an interval I, a ≤ x ≤ b, including the end-points a and b of the interval, then there must exist at least one point in I where f(x) attains its largest value M, and another point where f(x) attains its least value m. Intuitively speaking, this means that the graph of the continuous function u = f(x) must have at least one highest and one lowest point.

It is important to observe that the statement need not be true if the function f(x) fails to be continuous at the endpoints of I. For example, the function f(x) = 1/x has no largest value in the interval 0 < x ≤ 1, although f(x) is continuous throughout the interior of this interval. Nor need a discontinuous function assume a greatest or a least value even if it is bounded. For example, consider the very discontinuous function f(x) defined by setting

f(x) = x for irrational x,
f(x) = 1/2 for rational x,

in the interval 0 ≤ x ≤ 1. This function always takes on values between 0 and 1, in fact values as near to 1 and 0 as we may wish, if x is chosen as an irrational number sufficiently near to 0 or 1. But f(x) can never be equal to 0 or 1, since for rational x we have f(x) = 1/2, and for irrational x we have f(x) = x. Hence 0 and 1 are never attained.

* Weierstrass’ theorem can be proved in much the same way as Bolzano’s theorem. We divide I into two closed half-intervals I′ and I″ and fix our attention on I′ as the interval in which the greatest value of f(x) must be sought, unless there is a point α in I″ such that f(α) exceeds all the values of f(x) in I′; in the latter case we select I″. The interval so selected we call I_1. Now we proceed with I_1 in the same way as we did with I, obtaining an interval I_2, and so on. This process will define a sequence I_1, I_2, · · ·, I_n, · · · of nested intervals all containing a point z. We shall prove that the value f(z) = M is the largest attained by f(x) in I, i.e. that there cannot be a point s in I for which f(s) > M. Suppose there were a point s with f(s) = M + 2ε, where ε is a (perhaps very small) positive number. Around z as center we can, because of the continuity of f(x), mark off a small interval K, leaving s outside, and such that in K the values of f(x) differ from f(z) = M by less than ε, so that we certainly have f(x) < M + ε in K. But for sufficiently large n the interval I_n lies inside K, and I_n was so defined that no value of f(x) for x outside I_n can exceed all the values of f(x) for x in I_n. Since s is outside I_n and f(s) > M + ε, while in K, and hence in I_n, we have f(x) < M + ε, we have arrived at a contradiction.

The existence of a least value m may be proved in the same way, or it follows directly from what has already been proved, since the least value of f(x) is the greatest value of g(x) = –f(x).
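The reduction of the least value to a greatest value can be checked on sampled data. The Python sketch below is only an illustration under stated assumptions: the helper names grid, sampled_max and sampled_min are introduced here, and a finite grid of sample points replaces the interval, so the results approximate M and m and do not locate them exactly.

```python
from typing import Callable, List


def grid(a: float, b: float, n: int = 1000) -> List[float]:
    """n + 1 equally spaced sample points of the closed interval [a, b]."""
    return [a + (b - a) * k / n for k in range(n + 1)]


def sampled_max(f: Callable[[float], float], a: float, b: float, n: int = 1000) -> float:
    """Largest value of f among the sample points (an approximation to M)."""
    return max(f(x) for x in grid(a, b, n))


def sampled_min(f: Callable[[float], float], a: float, b: float, n: int = 1000) -> float:
    """Least value of f, obtained as minus the greatest value of g(x) = -f(x)."""
    return -sampled_max(lambda x: -f(x), a, b, n)


def f(x: float) -> float:
    # Example function, chosen here for illustration.
    return (x - 1.0) ** 2


largest = sampled_max(f, 0.0, 3.0)
least = sampled_min(f, 0.0, 3.0)
print(largest, least)
```

Note that the grid includes both endpoints a and b, in keeping with the requirement that the interval be closed.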

Weierstrass’ theorem can be proved in a similar way for continuous functions of two or more variables x, y, · · ·. Instead of an interval with its endpoints we have to consider a closed domain, e.g. a rectangle in the x, y-plane which includes its boundary.

Exercise: Where, in the proofs of Bolzano’s and Weierstrass’ theorems, did we use the fact that f(x) was assumed to be defined and continuous in the whole closed interval a ≤ x ≤ b and not merely in the interval a < x ≤ b or a < x < b?

The proofs of Bolzano’s and Weierstrass’ theorems have a decidedly non-constructive character. They do not provide a method for actually finding the location of a zero or of the greatest or smallest value of a function with a prescribed degree of precision in a finite number of steps. Only the mere existence, or rather the absurdity of the non-existence, of the desired values is proved. This is another important instance where the “intuitionists” (see p. 86) have raised objections; some have even insisted that such theorems be eliminated from mathematics. The student of mathematics should take this no more seriously than did most of the critics.

*4. A Theorem on Sequences. Compact Sets

Let x_1, x_2, x_3, · · · be any infinite sequence of numbers, distinct or not, all contained in the closed interval I, a ≤ x ≤ b. The sequence may or may not tend to a limit. But in any case, it is always possible to extract from such a sequence, by omitting certain of its terms, an infinite subsequence, y_1, y_2, y_3, · · ·, which tends to some limit y contained in the interval I.

To prove this theorem we divide the interval I into two closed sub-intervals I′ and I″ by marking the midpoint (a + b)/2 of I:

I′: a ≤ x ≤ (a + b)/2,    I″: (a + b)/2 ≤ x ≤ b.

In at least one of these, which we may call I_1, there must lie infinitely many terms x_n of the original sequence. Choose any one of these terms, say x_{n_1}, and call it y_1. Now proceed in the same way with the interval I_1. Since there are infinitely many terms x_n in I_1, there must be infinitely many terms in at least one of the halves of I_1, which we may call I_2. Hence we can certainly find a term x_n in I_2 for which n > n_1. Choose some one of these, and call it y_2. Proceeding in this way, we can find a sequence I_1, I_2, I_3, · · · of nested intervals and a subsequence y_1, y_2, y_3, · · · of the original sequence, such that y_n lies in I_n for every n. This sequence of intervals closes down on a point y of I, and it is clear that the sequence y_1, y_2, y_3, · · · has the limit y, as was to be proved.
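The selection procedure can be imitated on a finite stretch of a sequence, although only heuristically. In the Python sketch below, the function extract_subsequence and the example sequence are assumptions introduced for illustration. Since no finite computation can decide which half contains infinitely many terms, the sketch substitutes the half that contains more of the remaining given terms; this is a stand-in for the argument and not the argument itself.

```python
from typing import List, Sequence, Tuple


def extract_subsequence(
    xs: Sequence[float], a: float, b: float, picks: int = 8
) -> List[Tuple[int, float]]:
    """Choose (index, value) pairs from xs by repeated bisection of [a, b].

    Assumption: "the half with infinitely many terms" is replaced by
    "the half holding more of the remaining terms of the finite list xs".
    """
    lo, hi = a, b
    last = -1                       # index of the term chosen most recently
    chosen: List[Tuple[int, float]] = []
    for _ in range(picks):
        mid = (lo + hi) / 2
        # Terms that come later in the sequence and lie in the current interval.
        later = [(i, x) for i, x in enumerate(xs) if i > last and lo <= x <= hi]
        left = [(i, x) for i, x in later if x <= mid]
        right = [(i, x) for i, x in later if x >= mid]
        if not left and not right:
            break                   # the finite list is exhausted
        if len(left) >= len(right):
            hi, pool = mid, left
        else:
            lo, pool = mid, right
        index, value = pool[0]      # first admissible term in the chosen half
        chosen.append((index, value))
        last = index
    return chosen


# Example sequence in [-1, 1]: it oscillates and has no limit itself.
xs = [(-1) ** n * n / (n + 1) for n in range(2000)]
chosen = extract_subsequence(xs, -1.0, 1.0)
print(chosen)
```

The indices of the chosen terms increase strictly, as the proof requires, and each chosen term lies in the interval selected at that stage.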

* These considerations are capable of the type of generalization that is typical of modern mathematics. Let us consider a variable X ranging over a general set S in which some notion of “distance” is defined. S may be a set of points in the plane or in space. But this is not necessary; for example, S might be the set of all triangles in the plane. If X and Y are two triangles, with vertices A, B, C and A′, B′, C′ respectively, then we can define the “distance” between the two triangles as the number

d(X, Y) = AA′ + BB′ + CC′,

where AA′, etc. denotes the ordinary distance between the points A and A′. Whenever there exists such a notion of “distance” in a set S we may define the concept of a sequence of elements X_1, X_2, X_3, · · · tending to a limit element X of S. By this we mean that d(X, X_n) → 0 as n → ∞. We shall now say that the set S is compact if from any sequence X_1, X_2, X_3, · · · of elements of S we can always extract a subsequence which tends to some limit element X of S. We have shown in the preceding paragraph that a closed interval a ≤ x ≤ b is compact in this sense. Hence the concept of a compact set may be regarded as a generalization of a closed interval of the number axis. Note that the number axis as a whole is not compact, since the sequence of integers 1, 2, 3, 4, 5, · · · neither tends to a limit nor contains any subsequence that does. Nor is an open interval such as 0 < x < 1, not including its endpoints, compact, since the sequence 1/2, 1/3, 1/4, · · · or any subsequence of it tends to the limit 0, which is not a point of the open interval. In the same way it may be shown that the region of the plane consisting of the points interior to a square or rectangle is not compact, but becomes compact if the boundary points are added. Furthermore, the set of all triangles whose vertices lie within or on the circumference of a given circle is compact.
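The “distance” between two triangles defined above is easy to compute. The following Python sketch is an illustration under stated assumptions: the type aliases Point and Triangle and the function triangle_distance are names introduced here, and a triangle is represented as an ordered triple of vertices so that A corresponds to A′, B to B′ and C to C′.

```python
import math
from typing import Tuple

Point = Tuple[float, float]
Triangle = Tuple[Point, Point, Point]


def triangle_distance(X: Triangle, Y: Triangle) -> float:
    """d(X, Y) = AA' + BB' + CC' for triangles with corresponding vertices."""
    return sum(math.dist(P, Q) for P, Q in zip(X, Y))


X: Triangle = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))


def shifted(T: Triangle, h: float) -> Triangle:
    """The triangle T moved a distance h to the right (illustrative helper)."""
    (ax, ay), (bx, by), (cx, cy) = T
    return ((ax + h, ay), (bx + h, by), (cx + h, cy))


# A sequence of triangles tending to X: the distances d(X, X_n) = 3/n tend to 0.
distances = [triangle_distance(X, shifted(X, 1.0 / n)) for n in range(1, 6)]
print(distances)
```

With this representation the distance depends on the labeling of the vertices, which agrees with the definition in the text, where the vertices are named.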

We may also extend the notion of continuity to the case where the variable X ranges over any set S in which the notion of limit is defined. The function u = F(X), where u is a real number, is said to be continuous at the element X if, for any sequence of elements X_1, X_2, X_3, · · · which tends to X as limit, the corresponding sequence of numbers F(X_1), F(X_2), · · · tends to the limit F(X). (An equivalent (ε, δ)-definition could also be given.) It is quite easy to show that Weierstrass’ theorem also holds in the general case of a continuous function defined over the elements of any compact set:

If u = F(X) is any continuous function defined on a compact set S, then there always exists an element of S for which F(X) attains its largest value, and also one for which it attains its smallest value.

The proof is simple once one has grasped the general concepts involved, but we shall not go further into this subject. It will appear in Chapter VII that the general theorem of Weierstrass is of great importance in the theory of maxima and minima.