What Is Mathematics? An Elementary Approach to Ideas and Methods, 2nd Edition (1996)

CHAPTER IX. RECENT DEVELOPMENTS

§12. NONSTANDARD ANALYSIS

On page 435 Courant and Robbins remark that “‘differentials’ as infinitely small quantities are now definitely and dishonorably discarded,” an accurate reflection of the consensus view when What Is Mathematics? was written. Despite Courant and Robbins’s verdict, there has always been something intuitive and appealing about the old-style arguments with infinitesimals. They are still embedded in our language, in ideas such as “instants” of time, “instantaneous” velocities, a curve as a series of infinitely small straight lines, the area bounded by a curve as an infinite sum of areas of infinitesimal rectangles. This kind of intuition turns out to be justified, for it has recently been discovered that the concept of infinitely small quantities is not dishonorable and need not be discarded at all. It is possible to set up a rigorous framework for analysis in which the Weierstrassian epsilon-delta definitions (see p. 305) are replaced by statements about infinitesimals that look astonishingly similar to the intuitive ideas of Leibniz, Newton, and Cauchy.

The way to make infinitesimals respectable is called nonstandard analysis. It is entirely viable as an alternative to the epsilon-delta approach, but for several reasons (only one being scientific conservatism) most mathematicians still prefer Weierstrass’s point of view. The big psychological problem is that setting up such a framework involves sophisticated ideas from modern mathematical logic. Between about 1920 and 1950 there was a great explosion of mathematical logic. One of the topics that emerged was model theory, which constructs and characterizes models of axiom systems, that is, mathematical structures that obey those axioms. Thus the coordinate plane is a model for the axioms of Euclidean geometry, Poincaré’s disk (p. 223) is a model for the axioms of hyperbolic geometry, and so on.

There is a standard axiom system for the real numbers, and it has long been known that there is a unique model, the standard real numbers R. This is one reason why different ways of constructing “the” real numbers (see pp. 68, 69) lead to number systems that are effectively identical. Moreover, R does not contain any infinitesimals or infinities. So how is it possible to apply model theory to construct a “nonstandard” real number system that does contain these strange objects? Logicians distinguish between “first order” and “second order” axiomatic systems. In a first order theory the axioms express properties required of all objects in the system, but not of all sets of objects. In a second order theory there is no such restriction. In ordinary arithmetic, a statement such as

(8) x + y = y + x for all x and y

is first order, and so are all the usual laws of algebra; but the “Archimedean axiom”

(9) if x < 1/n for all natural numbers n then x = 0

is second order. Most of the usual axioms for the real numbers are first order, but the list includes some that are second order. In fact the second order axiom (9) is the crucial one that rules out both infinitesimals and infinities in R. However, it turns out that if the axioms are weakened to comprise only the first order properties of R, then other models exist, including some that violate (9) above. Let R* be such a model and call it the system of hyperreal numbers. This idea, the basis of nonstandard analysis, was discovered by Abraham Robinson around 1960. We have already seen that there are non-Euclidean geometries and non-Cantorian set theories; now we find that there are non-Archimedean number systems.

The set R* contains several important subsets. There is a set of “standard” natural numbers N = {0, 1, 2, 3,...}, and there is also a larger system of “nonstandard” natural numbers N*. There are the standard integers Z and a corresponding extension to nonstandard integers Z*. There are the standard rationals Q, and a corresponding extension to nonstandard rationals Q*. And there are standard reals R and nonstandard reals (or hyperreals) R*.

Every first order property of R has a unique natural extension to R*. However, (9) expresses a second order property, and it is false in R*. The hyperreals contain actual infinities, actual infinitesimals. For example, x ∈ R* is infinitesimal if and only if x ≠ 0 and x < 1/n for all n ∈ N. The usual argument that “infinitesimals do not exist” actually proves that real infinitesimals do not exist; that is, that the infinitesimals in R* do not belong to R. But that is entirely reasonable, because R* is bigger than R. Incidentally, the “correct” analogue of (9) in R* is

(10) if x < 1/n for all n ∈ N* then x = 0,

and this is true. So changing (9) to refer to the nonstandard natural numbers instead of the standard ones makes a big difference.
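One way to see why (10) can hold while (9) fails in R* is sketched below. The sketch reads both statements for x ≥ 0 and n ≥ 1, which the text leaves implicit, and it assumes that the first order fact “every real number is exceeded by some natural number” carries over to R* and N*. The symbols ε and ω are introduced here only for illustration.

```latex
% Requires amsmath and amssymb.
% Assumptions: x >= 0 and n >= 1 throughout; \varepsilon and \omega are illustrative names.
\begin{align*}
&\text{Let } \varepsilon \in \mathbb{R}^{*} \text{ be a positive infinitesimal: }
  0 < \varepsilon < 1/n \text{ for every standard } n \in \mathbb{N},\ n \ge 1.\\
&\text{Then } \varepsilon \ne 0 \text{ satisfies the hypothesis of (9), so (9) fails in } \mathbb{R}^{*}.\\
&\text{For any } x > 0 \text{ in } \mathbb{R}^{*} \text{ there is } \omega \in \mathbb{N}^{*}
  \text{ with } \omega > 1/x, \text{ hence } 1/\omega < x.\\
&\text{So no } x > 0 \text{ satisfies the hypothesis of (10); only } x = 0 \text{ does, and (10) holds.}
\end{align*}
```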

The extension from reals to hyperreals is just one further example of the ancient game of extending the number system in order to secure a desirable property (see pp. 52–63). For example, the rational numbers were extended to the reals to allow 2 to have a square root; and the real numbers were extended to the complex numbers to allow −1 to have a square root. So why not extend from real numbers to hyperreal numbers to allow infinitesimals to exist?

We can use R* to prove theorems about R, because the number systems R and R* are indistinguishable as far as first order properties are concerned. However, R* has all sorts of new features, such as infinitesimals and infinities, which can be exploited in new ways. These new features are second order properties, which is why the new systems can have them even though the old ones cannot. Similar remarks apply to the subsystems N and N*, Z and Z*, and Q and Q*.

A few definitions will give the flavor of the approach. A hyperreal number is finite if it is smaller in absolute value than some standard real. It is infinitesimal if it is smaller in absolute value than all positive standard reals. Anything not finite is infinite, and anything not in R is nonstandard. If x is a nonzero infinitesimal then 1/x is infinite, and vice versa.
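These definitions can be written out symbolically. The block below is a sketch that takes “smaller” to mean “smaller in absolute value” (an assumption about the intended reading); note that under this formulation 0 counts as infinitesimal, whereas some authors exclude it.

```latex
% Requires amsmath and amssymb.
% Assumption: "smaller" is read as "smaller in absolute value".
\begin{align*}
x \text{ is finite} &\iff \exists\, r \in \mathbb{R}:\ |x| < r,\\
x \text{ is infinitesimal} &\iff \forall\, r \in \mathbb{R},\ r > 0:\ |x| < r,\\
x \text{ is infinite} &\iff \forall\, r \in \mathbb{R}:\ |x| > r.
\end{align*}
% Why a nonzero infinitesimal has an infinite reciprocal:
% for every standard r > 0 we have |x| < 1/r, hence |1/x| > r.
\[
x \ne 0,\quad |x| < \tfrac{1}{r}\ \text{ for all standard } r > 0
\;\Longrightarrow\; \left|\tfrac{1}{x}\right| > r\ \text{ for all standard } r > 0 .
\]
```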

None of this would be of any great importance if all that could be done was invent a new number system. But even though R and R* are different, they are intimately connected. In fact, every finite hyperreal x has a unique standard part std(x) which is infinitely close to x, that is, x − std(x) is infinitesimal. In other words, each finite hyperreal has a unique expression as “standard real plus infinitesimal.” It is as if each standard real is surrounded by a cloud of infinitely close hyperreals, often called its halo. And each such halo surrounds a single real, which for some obscure reason is usually called its shadow, although a word like “core” or “center” would convey the image better. By using the standard part we can transfer properties from R* to R, or vice versa.
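A short sketch of why the standard part is unique, followed by two small computations, may help fix the idea. The names a, b, ε, and δ are illustrative; the only facts assumed are that sums and finite multiples of infinitesimals are infinitesimal and that the only infinitesimal standard real is 0.

```latex
% Requires amsmath and amssymb.
% Uniqueness: two decompositions of the same finite hyperreal must agree.
\begin{align*}
x = a + \varepsilon = b + \delta,\quad a, b \in \mathbb{R},\ \varepsilon, \delta \text{ infinitesimal}
&\;\Longrightarrow\; a - b = \delta - \varepsilon \text{ is standard and infinitesimal}\\
&\;\Longrightarrow\; a - b = 0 .
\end{align*}
% Two illustrative computations with an infinitesimal \varepsilon:
\begin{align*}
\operatorname{std}(3 + \varepsilon) &= 3,\\
\operatorname{std}\bigl((2 + \varepsilon)^{2}\bigr)
  &= \operatorname{std}\bigl(4 + 4\varepsilon + \varepsilon^{2}\bigr) = 4 .
\end{align*}
```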

To see how proofs in nonstandard analysis differ from their standard counterparts, consider Leibniz’s calculation of the derivative of the function y = f(x) = x². What he does is take a small number Δx and form the ratio [f(x + Δx) − f(x)]/Δx. (Newton’s approach was basically the same, except that he used the symbol o in place of Δx.) Following Leibniz we calculate:

[f(x + Δx) − f(x)]/Δx = [(x + Δx)² − x²]/Δx = [2xΔx + (Δx)²]/Δx = 2x + Δx.

Leibniz then argued that since Δx is infinitesimal, it can be ignored, leaving 2x. However, Δx must be nonzero in order for [f(x + Δx) − f(x)]/Δx to make sense, in which case 2x + Δx is not equal to 2x. It was this difficulty that led Bishop Berkeley to write his famous critique The Analyst, Or a Discourse Addressed to an Infidel Mathematician, in which he pointed out some logical inconsistencies in the foundations of the calculus.

Weierstrass overcame Berkeley’s objections by adding one final step: take the limit as Δx tends to zero. (Both Leibniz and Newton had expressed similar ideas, but not with the same crystal clarity as Weierstrass’s ε and δ.) Because nonzero values of Δx can tend to zero, we may assume all values of Δx that are encountered during the calculation are nonzero, so that dividing by Δx is meaningful. Then we take the limit as Δx→0 to get rid of that awkward extra term Δx and leave the required answer 2x.

In nonstandard analysis there is a simpler way. Take x to be finite and standard (that is, let x ∈ R) and assume that Δx is a genuine infinitesimal. Instead of 2x + Δx take its standard part std(2x + Δx), which is 2x. In other words, define the derivative of f(x) to be

f′(x) = std([f(x + Δx) − f(x)]/Δx),

where x is a standard real and Δx is any infinitesimal. The innocent-looking idea of the standard part is exactly what is needed to make the derivative a real function of x instead of a hyperreal function of x and Δx. It is a perfectly rigorous way of removing the Δx term, because std(x) is a uniquely defined real. Instead of the extra Δx being swept under the carpet with much special pleading, it is neatly expunged.
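The same recipe, taking the standard part of the difference quotient for a nonzero infinitesimal Δx, can be tried on another function. The choice f(x) = x³ below is an illustration that does not appear in the text; x is assumed to be a standard real.

```latex
% Requires amsmath and amssymb.
% Illustrative example (not from the text): f(x) = x^3, x standard, \Delta x a nonzero infinitesimal.
\begin{align*}
\frac{f(x + \Delta x) - f(x)}{\Delta x}
  &= \frac{(x + \Delta x)^{3} - x^{3}}{\Delta x}\\
  &= \frac{3x^{2}\,\Delta x + 3x\,(\Delta x)^{2} + (\Delta x)^{3}}{\Delta x}\\
  &= 3x^{2} + 3x\,\Delta x + (\Delta x)^{2},\\[4pt]
\operatorname{std}\bigl(3x^{2} + 3x\,\Delta x + (\Delta x)^{2}\bigr) &= 3x^{2}.
\end{align*}
% The last step uses that 3x\,\Delta x + (\Delta x)^2 is infinitesimal,
% being a finite multiple of an infinitesimal plus a product of infinitesimals.
```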

A course in nonstandard analysis looks like an extended parade of exactly those errors that Courant and Robbins spend so many pages teaching us to avoid. For example:

1. A sequence s_n converges to a limit L if s_ω − L is infinitesimal for all infinite ω. (Compare with p. 291.)

2. A function f is continuous at x if f(x + ε) is infinitely close to f(x) (that is, f(x + ε) − f(x) is infinitesimal) for all infinitesimal ε. (Compare with p. 310.)

3. The function f has derivative d at x if and only if [f(x + Δx) − f(x)]/Δx is infinitely close to d for all nonzero infinitesimals Δx. (Compare with p. 417.)

4. The area of a curved region is an infinite sum of infinitesimal rectangles. (Compare with p. 405.)

However, within the framework of nonstandard analysis these statements can be given a rigorous meaning.
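As a small illustration of statement 1, consider the sequence s_n = (n + 1)/n, an example chosen here and not taken from the text. The sketch assumes that the formula for s_n extends unchanged to infinite indices ω in N*.

```latex
% Requires amsmath and amssymb.
% Illustrative example: s_n = (n + 1)/n converges to L = 1.
\begin{align*}
s_{\omega} - 1 &= \frac{\omega + 1}{\omega} - 1 = \frac{1}{\omega}
  && \text{for every infinite } \omega \in \mathbb{N}^{*},\\
\frac{1}{\omega} &\text{ is infinitesimal}
  && \text{because } \omega \text{ is infinite,}\\
\text{so } s_{n} &\to 1 .
\end{align*}
```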

In fact, nonstandard analysis does not lead to any conclusions about R that differ from standard analysis. It is easy to conclude from this that there is no point in using the nonstandard approach, because “it does not lead to anything new.” But this criticism is not conclusive: the question is not “does it give the same results?” so much as “is it a simpler or more natural way to derive those results?” As Newton showed in his Principia, anything that can be proved with calculus can also be proved by classical geometry. In no way does this imply that calculus is worthless, and the same goes for nonstandard analysis.

Experience suggests that proofs via nonstandard analysis are usually shorter and more direct than the classical epsilon-delta proofs. This is because they avoid complicated estimates of the sizes of things, which form the bulk of the classical proof. The main obstacle to the widespread adoption of nonstandard analysis is that its appreciation requires a background with an emphasis on mathematical logic—very different from traditional analysis.
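The contrast can be seen on a small case: continuity of f(x) = x² at a standard point a. The example and the particular choice of δ are illustrative assumptions, not taken from the text. Here ε and δ are the classical tolerances, and η denotes an infinitesimal.

```latex
% Requires amsmath and amssymb.
% Classical proof: an explicit estimate is needed.
% Given \varepsilon > 0, choose \delta = \min\{1,\ \varepsilon / (2|a| + 1)\}.
\begin{align*}
|x - a| < \delta
  &\;\Longrightarrow\; |x + a| \le |x - a| + 2|a| < 1 + 2|a|\\
  &\;\Longrightarrow\; |x^{2} - a^{2}| = |x - a|\,|x + a| < \delta\,(2|a| + 1) \le \varepsilon .
\end{align*}
% Nonstandard proof: no estimate, just algebra with an infinitesimal \eta.
\[
(a + \eta)^{2} - a^{2} = \eta\,(2a + \eta),
\]
% which is infinitesimal, being the product of an infinitesimal and a finite number.
```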