THE DERIVATIVE - THE CALCULUS - What Is Mathematics? An Elementary Approach to Ideas and Methods, 2nd Edition (1996)

CHAPTER VIII. THE CALCULUS

§2. THE DERIVATIVE

1. The Derivative as a Slope

While the concept of integral has its roots in antiquity, the other basic concept of the calculus, the derivative, was formulated only in the seventeenth century by Fermat and others. It was the discovery by Newton and Leibniz of the organic interrelation between these two seemingly quite diverse concepts that inaugurated an unparalleled development of mathematical science.

Fermat was interested in determining the maxima and minima of a function y = f(x). In a graph of the function, a maximum corresponds to a summit higher than all other neighboring points, while a minimum corresponds to the bottom of a valley lower than all neighboring points. In Figure 191 on page 342 the point B is a maximum and the point C a minimum. To characterize the points of maximum and minimum it is natural to use the notion of tangent of a curve. We assume that the graph has no sharp corners or other singularities, and that at every point it possesses a definite direction given by a tangent line. At maximum or minimum points the tangent of the graph y = f(x) must be parallel to the x-axis, since otherwise the curve would be rising or falling at these points. This remark suggests the idea of considering quite generally, at any point P of the graph y = f(x), the direction of the tangent to the curve.

To characterize the direction of a straight line in the x, y-plane it is customary to give its slope, which is the trigonometrical tangent of the angle α from the direction of the positive x-axis to the line. If P is any point of the line L, we proceed to the right to a point R and then up or down to the point Q on the line; then slope of L = RQ/PR = tan α. The length PR is taken as positive, while RQ is taken as positive or negative according as the direction from R to Q is up or down, so that the slope gives the rise or fall per unit length along the horizontal when we proceed along the line from left to right. In Figure 267 the slope of the first line is 2/3 while the slope of the second line is –1.

Fig. 267. Slopes of lines.

By the slope of a curve at a point P we mean the slope of the tangent to the curve at P. As long as we accept the tangent of a curve as an intuitively given mathematical concept there remains only the problem of finding a procedure for calculating the slope. For the moment we shall accept this point of view, postponing to the supplement a closer analysis of the problems involved.

2. The Derivative as a Limit

The slope of a curve y = f(x) at the point P(x, y) cannot be calculated by referring to the curve at the point P alone. Instead, one must resort to a limiting process much like that involved in the calculation of the area under a curve. This limiting process is the basis of the differential calculus. We consider on the curve another point P₁, near P, with coordinates x₁, y₁. The straight line joining P to P₁ we call t₁; it is a secant of the curve, which approximates to the tangent at P when P₁ is near P. The angle from the x-axis to t₁ we call α₁. Now if we let x₁ approach x, then P₁ will move along the curve toward P, and the secant t₁ will approach as a limiting position the tangent t to the curve at P. If α denotes the angle from the x-axis to t, then, as x₁ → x,

y₁ → y, P₁ → P, t₁ → t, and α₁ → α.

Fig. 268. The derivative as a limit.

The tangent is the limit of the secant, and the slope of the tangent is the limit of the slope of the secant.

Although we have no explicit expression for the slope of the tangent t itself, the slope of the secant t₁ is given by the formula

$$\text{slope of } t_1 = \frac{y_1 - y}{x_1 - x} = \frac{f(x_1) - f(x)}{x_1 - x},$$

or, if we again denote the operation of forming a difference by the symbol Δ,

$$\text{slope of } t_1 = \frac{\Delta y}{\Delta x} = \frac{\Delta f(x)}{\Delta x}.$$

The slope of the secant t₁ is a “difference quotient”: the difference Δy of the function values, divided by the difference Δx of the values of the independent variable. Moreover,

$$\text{slope of } t = \text{limit of slope of } t_1 = \lim \frac{f(x_1) - f(x)}{x_1 - x} = \lim \frac{\Delta y}{\Delta x},$$

where the limits are evaluated as x₁ → x, i.e. as Δx = x₁ – x → 0. The slope of the tangent t to the curve is the limit of the difference quotient Δy/Δx as Δx = x₁ – x approaches zero.

The original function f(x) gave the height of the curve y = f(x) for the value x. We may now consider the slope of the curve for a variable point P with the coördinates x and y [= f(x)] as a new function of x which we denote by f′(x) and call the derivative of the function f(x). The limiting process by which it is obtained is called differentiation of f(x). This process is an operation which attaches to a given function f(x) another function f′(x) according to a definite rule, just as the function f(x) is defined by a rule which attaches to any value of the variable x the value f(x):

f(x) = height of curve y = f(x) at the point x,
f′(x) = slope of curve y = f(x) at the point x.

The word “differentiation” comes from the fact that f′(x) is the limit of the difference f(x₁) – f(x) divided by the difference x₁ – x:

$$f'(x) = \lim \frac{f(x_1) - f(x)}{x_1 - x} \quad \text{as } x_1 \to x. \tag{1}$$

Another notation, often useful, is

f′(x) = Df(x),

the “D” simply abbreviating “derivative of”; still different is Leibniz’ notation for the derivative of y = f(x),

$$\frac{dy}{dx} \quad \text{or} \quad \frac{df(x)}{dx},$$

which we shall discuss in §4 and which indicates the character of the derivative as limit of the difference quotient Δy/Δx or Δf(x)/Δx.
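
The limiting process in definition (1) can be watched numerically. The following Python sketch is only an illustration and not part of the original text: the helper name `difference_quotient`, the sample function x² and the point x = 3 are assumptions chosen for the demonstration. It evaluates the slope of the secant for smaller and smaller values of Δx.

```python
def difference_quotient(f, x, x1):
    """Slope of the secant through (x, f(x)) and (x1, f(x1)).

    Only defined for x1 != x, exactly as in the limit definition.
    """
    return (f(x1) - f(x)) / (x1 - x)


def square(x):
    # Sample function (an assumption for this sketch): f(x) = x^2.
    return x * x


if __name__ == "__main__":
    x = 3.0
    for dx in (1.0, 0.1, 0.01, 0.001):
        slope = difference_quotient(square, x, x + dx)
        print(f"dx = {dx:<6} secant slope = {slope:.6f}")
```

The printed slopes settle toward 6 as Δx shrinks. The quotient is never evaluated at Δx = 0, where it would read 0/0.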

If we describe the curve y = f(x) in the direction of increasing values of x, then a positive derivative, f′(x) > 0, at a point means ascending curve (increasing values of y), a negative derivative, f′(x) < 0, means descending curve, while f′(x) = 0 means a horizontal direction of the curve for the value x. At a maximum or minimum, the slope must be zero (Fig. 269).

Fig. 269. The sign of the derivative.

Hence, by solving the equation

f′(x) = 0

for x we may find the positions of the maxima and minima, as was first done by Fermat.

3. Examples

The considerations leading to the definition (1) might seem to be without practical value. One problem has been replaced by another: instead of being asked to find the slope of the tangent to a curve y = f(x) at a point, we are asked to evaluate a limit, (1), which at first sight appears equally difficult. But as soon as we leave the domain of generalities and consider specific functions f(x) we shall obtain tangible results.

The simplest such function is f(x) = c, where c is a constant. The graph of the function y = f(x) = c is a horizontal line coinciding with all its tangents, and it is obvious that

f′(x) = 0

for all values of x. This also follows from the definition (1), for

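$$\frac{\Delta y}{\Delta x} = \frac{f(x_1) - f(x)}{x_1 - x} = \frac{c - c}{x_1 - x} = \frac{0}{x_1 - x} = 0,$$

so that, trivially,

$$\lim \frac{f(x_1) - f(x)}{x_1 - x} = 0 \quad \text{as } x_1 \to x.$$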

Next we consider the simple function y = f(x) = x, whose graph is a straight line through the origin bisecting the first quadrant. Geometrically it is clear that

f′(x) = 1

for all values of x, and the analytic definition (1) again yields

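$$\frac{f(x_1) - f(x)}{x_1 - x} = \frac{x_1 - x}{x_1 - x} = 1,$$

so that

$$\lim \frac{f(x_1) - f(x)}{x_1 - x} = 1 \quad \text{as } x_1 \to x.$$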

The simplest non-trivial example is the differentiation of the function

y = f(x) = x²,

which amounts to finding the slope of a parabola. This is the simplest case that teaches us how to carry out the passage to the limit when the result is not obvious from the outset. We have

$$\frac{f(x_1) - f(x)}{x_1 - x} = \frac{x_1^2 - x^2}{x_1 - x}.$$

If we should try to pass to the limit directly in numerator and denominator we should obtain the meaningless expression 0/0. But we can avoid this impasse by rewriting the difference quotient and cancelling, before passing to the limit, the disturbing factor x₁ – x. (In evaluating the limit of the difference quotient we consider only values x₁ ≠ x, so that this is permissible; see p. 307.) Thus we obtain the expression:

$$\frac{x_1^2 - x^2}{x_1 - x} = \frac{(x_1 - x)(x_1 + x)}{x_1 - x} = x_1 + x.$$

Now, after the cancellation, there is no longer any difficulty with the limit as x₁ → x. The limit is obtained “by substitution”; for the new form x₁ + x of the difference quotient is continuous and the limit of a continuous function as x₁ → x is simply the value of the function for x₁ = x, in our case x + x = 2x, so that

f′(x) = 2x for f(x) = x².

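In a similar way we can prove that for f(x) = x³ we have f′(x) = 3x². For the difference quotient,

$$\frac{\Delta y}{\Delta x} = \frac{f(x_1) - f(x)}{x_1 - x} = \frac{x_1^3 - x^3}{x_1 - x},$$

can be simplified by the formula $x_1^3 - x^3 = (x_1 - x)(x_1^2 + x_1 x + x^2)$; the denominator Δx = x₁ – x cancels out, and we obtain the continuous expression

$$\frac{\Delta y}{\Delta x} = x_1^2 + x_1 x + x^2.$$

Now if we let x₁ approach x, this expression simply approaches x² + x² + x², and we obtain as limit f′(x) = 3x².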

In general, for

f(x) = xⁿ,

where n is any positive integer, we obtain the derivative

f′(x) = nxⁿ⁻¹.

Exercise: Prove this result. (Use the algebraic formula

$$x_1^n - x^n = (x_1 - x)(x_1^{n-1} + x_1^{n-2} x + x_1^{n-3} x^2 + \cdots + x_1 x^{n-2} + x^{n-1}).)$$
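
A numerical spot check is no substitute for the proof asked for in the exercise, but it can make the cancellation idea concrete. The sketch below is an illustration only: the function name `factored_quotient` and the sample values x = 2, x₁ = 7 are assumptions. It evaluates the sum that remains after the factor x₁ – x has been cancelled from x₁ⁿ – xⁿ, and compares its value at x₁ = x with nxⁿ⁻¹.

```python
def factored_quotient(n, x, x1):
    """Value of x1^(n-1) + x1^(n-2) x + ... + x1 x^(n-2) + x^(n-1).

    For x1 != x this equals (x1^n - x^n) / (x1 - x); unlike that
    quotient, the sum still makes sense at x1 == x.
    """
    return sum(x1 ** (n - 1 - k) * x ** k for k in range(n))


if __name__ == "__main__":
    x = 2
    x1 = 7  # any integer different from x (arbitrary sample value)
    for n in range(1, 6):
        # Integer arithmetic, so the factorisation is checked exactly.
        assert x1 ** n - x ** n == (x1 - x) * factored_quotient(n, x, x1)
        # Setting x1 = x gives n equal terms x^(n-1).
        print(n, factored_quotient(n, x, x), n * x ** (n - 1))
```

Each printed row shows the same number twice, since at x₁ = x the sum consists of n equal terms xⁿ⁻¹.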

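As a further example of simple devices that permit explicit determination of the derivative we consider the function

$$y = f(x) = \frac{1}{x}.$$

We have

$$\frac{\Delta y}{\Delta x} = \frac{y_1 - y}{x_1 - x} = \left(\frac{1}{x_1} - \frac{1}{x}\right) \cdot \frac{1}{x_1 - x} = \frac{x - x_1}{x_1 x} \cdot \frac{1}{x_1 - x}.$$

Again we may cancel, and we find $\frac{\Delta y}{\Delta x} = -\frac{1}{x_1 x}$, which is continuous at x₁ = x; hence we have in the limit

$$f'(x) = -\frac{1}{x^2}.$$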

Of course, neither the derivative nor the function itself is defined for x = 0.

Exercises: Prove in a similar manner that for $f(x) = \frac{1}{x^2}$, $f'(x) = -\frac{2}{x^3}$; for $f(x) = \frac{1}{x^n}$, $f'(x) = -\frac{n}{x^{n+1}}$; for f(x) = (1 + x)ⁿ, f′(x) = n(1 + x)ⁿ⁻¹.

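We shall now carry out the differentiation of

$$y = f(x) = \sqrt{x}.$$

For the difference quotient we obtain

$$\frac{y_1 - y}{x_1 - x} = \frac{\sqrt{x_1} - \sqrt{x}}{x_1 - x}.$$

By the formula $x_1 - x = (\sqrt{x_1} - \sqrt{x})(\sqrt{x_1} + \sqrt{x})$ we can cancel a factor and get the continuous expression

$$\frac{y_1 - y}{x_1 - x} = \frac{1}{\sqrt{x_1} + \sqrt{x}}.$$

Passing to the limit yields

$$f'(x) = \frac{1}{2\sqrt{x}}.$$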

Exercises: Prove that for f(x) image, image; for image; for image.

4. Derivatives of Trigonometrical Functions

We now treat the very important question of the differentiation of trigonometrical functions. Here radian measure of angles will be used exclusively.

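To differentiate the function y = f(x) = sin x we set x₁ – x = h, so that x₁ = x + h and f(x₁) = sin x₁ = sin (x + h). By the trigonometrical formula for sin (A + B),

f(x₁) = sin (x + h) = sin x cos h + cos x sin h.

Hence

$$\frac{f(x_1) - f(x)}{x_1 - x} = \frac{\sin (x + h) - \sin x}{h} = \sin x \left(\frac{\cos h - 1}{h}\right) + \cos x \left(\frac{\sin h}{h}\right). \tag{2}$$

If now we let x₁ tend to x, then h tends to 0, sin h to 0, and cos h to 1. Moreover, by the results of page 308,

$$\lim_{h \to 0} \frac{\sin h}{h} = 1$$

and

$$\lim_{h \to 0} \frac{\cos h - 1}{h} = 0.$$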

Hence the right side of (2) approaches cos x, giving the result:

The function f(x) = sin x has the derivative f′(x) = cos x, or briefly,

D sin x = cos x.
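
The two limits used in this argument, and the result itself, can be checked numerically. The sketch below is only an illustration: the function names and the sample point x = 1 (in radians) are assumptions. It compares the difference quotient of sin x with its split form sin x · (cos h – 1)/h + cos x · (sin h)/h and with cos x.

```python
import math


def sine_difference_quotient(x, h):
    """(sin(x + h) - sin x) / h, the difference quotient of the sine."""
    return (math.sin(x + h) - math.sin(x)) / h


def split_form(x, h):
    """sin x * (cos h - 1)/h + cos x * (sin h)/h, after the addition formula."""
    return math.sin(x) * (math.cos(h) - 1) / h + math.cos(x) * math.sin(h) / h


if __name__ == "__main__":
    x = 1.0  # sample point in radians (an arbitrary choice)
    for h in (0.1, 0.01, 0.001):
        print(h, sine_difference_quotient(x, h), split_form(x, h), math.cos(x))
```

For each h the first two printed values agree up to rounding, and both move toward the last column, cos x, as h decreases.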

Exercise: Prove that D cos x = –sin x.

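To differentiate the function f(x) = tan x, we write $\tan x = \frac{\sin x}{\cos x}$, and obtain

$$\frac{f(x + h) - f(x)}{h} = \left(\frac{\sin (x + h)}{\cos (x + h)} - \frac{\sin x}{\cos x}\right) \frac{1}{h} = \frac{\sin (x + h) \cos x - \cos (x + h) \sin x}{h} \cdot \frac{1}{\cos (x + h) \cos x} = \frac{\sin h}{h} \cdot \frac{1}{\cos (x + h) \cos x}.$$

(The last equality follows from the formula sin (A – B) = sin A cos B – cos A sin B, with A = x + h and B = x.) If now we let h approach zero, (sin h)/h approaches 1, cos (x + h) approaches cos x, and we infer:

The derivative of the function f(x) = tan x is f′(x) = 1/cos² x, or

$$D \tan x = \frac{1}{\cos^2 x}.$$

Exercise: Prove that $D \cot x = -\frac{1}{\sin^2 x}$.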

*5. Differentiation and Continuity

The differentiability of a function implies its continuity. For, if the limit of Δy/Δx exists as Δx tends to zero, then it is easy to see that the change Δy of the function f(x) must become arbitrarily small as the difference Δx tends to zero: since Δy = (Δy/Δx) · Δx, the first factor tends to f′(x) and the second to zero, so the product tends to zero. Hence whenever we can differentiate a function, its continuity is automatically assured; we shall therefore dispense with explicitly mentioning or proving the continuity of the differentiable functions occurring in this chapter unless there is a particular reason for it.

6. Derivative and Velocity. Second Derivative and Acceleration

The preceding discussion of the derivative was carried out in connection with the geometrical concept of the graph of a function. But the significance of the derivative concept is by no means limited to the problem of finding the slope of the tangent to a curve. Even more important in the natural sciences is the problem of calculating the rate of change of some quantity f(t) which varies with the time t. It was from this angle that Newton made his approach to the differential calculus. Newton wished in particular to analyze the phenomenon of velocity, where the time and the position of a moving particle are considered as the variable elements, or, as Newton expressed it, as the “fluent quantities.”

If a particle moves along a straight line, the x-axis, its motion is completely described by giving the position x at any time t as a function x = f(t). A “uniform motion” with constant velocity b along the x-axis is defined by a linear function x = a + bt, where a is the coördinate of the particle at the time t = 0.

In a plane the motion of a particle is described by two functions,

x = f(t), y = g(t),

characterizing the two coordinates as functions of the time. In particular, a uniform motion corresponds to a pair of linear functions,

x = a + bt, y = c + dt,

where b and d are the two “components” of the constant velocity, and a and c the coordinates of the particle at the moment t = 0; the path of the particle is a straight line with the equation (x – a)d – (y – c)b = 0, obtained by eliminating the time t from the two relations above.

If a particle moves in the vertical x, y-plane under the influence of gravity alone, then, as shown in elementary physics, the motion is described by two equations,

x = a + bt, y = c + dt – ½gt²,

where a, b, c, d are constants depending on the initial state of the particle and g the acceleration due to gravity, approximately equal to 32 if time is measured in seconds and distance in feet. The trajectory of the particle, obtained by eliminating t from the two equations, is now a parabola,

$$y = c + \frac{d}{b}(x - a) - \frac{g}{2b^2}(x - a)^2,$$

if b ≠ 0; otherwise it is a part of the vertical axis.
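
Eliminating t here means solving x = a + bt for t = (x – a)/b and substituting into the equation for y. The following Python sketch checks this on sample numbers. It is an illustration only: the function names and the constants a = 1, b = 2, c = 3, d = 4 are assumptions, and g = 32 is the value in feet and seconds quoted above.

```python
def position(t, a, b, c, d, g=32.0):
    """Coordinates (x, y) of the particle at time t."""
    x = a + b * t
    y = c + d * t - 0.5 * g * t * t
    return x, y


def trajectory_height(x, a, b, c, d, g=32.0):
    """Height y as a function of x, after substituting t = (x - a) / b.

    Requires b != 0; for b == 0 the path is part of a vertical line.
    """
    u = x - a
    return c + (d / b) * u - g / (2 * b * b) * u * u


if __name__ == "__main__":
    a, b, c, d = 1.0, 2.0, 3.0, 4.0  # hypothetical initial data
    for t in (0.0, 0.5, 1.25):
        x, y = position(t, a, b, c, d)
        print(t, y, trajectory_height(x, a, b, c, d))
```

At every sampled time the height computed from t and the height computed from x alone coincide up to rounding, as they must if the elimination is correct.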

If a particle is confined to move along a given curve in the plane (like a train along a track), then its motion may be described by giving the arc length s, measured along the curve from a fixed initial point P₀ to the position P of the particle at the time t, as a function of t: s = f(t). For example, on the unit circle x² + y² = 1 the function s = ct describes a uniform rotation with the velocity c along the circle.

Exercises: *Draw the trajectories of the plane motion described by

1) x = sin t, y = cos t. 2) x = sin 2t, y = sin 3t. 3) x = sin 2t, y = 2 sin 3t.

4) In the parabolic motion described above, suppose the particle at the origin for t = 0, and b > 0, d > 0. Find the coördinates of the highest point of the trajectory. Find the time t and the value of x for the second intersection of the trajectory with the x-axis.

Newton’s first aim was to determine the velocity of a non-uniform motion. For simplicity let us consider the motion of a particle along a straight line given by a function x = f(t). If the motion were uniform, with constant velocity, then the velocity could be found by taking two values t and t₁ of the time, with corresponding values x = f(t) and x₁ = f(t₁) of the position, and forming the quotient

$$v = \text{velocity} = \frac{\text{distance}}{\text{time}} = \frac{x_1 - x}{t_1 - t}.$$

For example, if t is measured in hours and x in miles, then, for t₁ – t = 1, x₁ – x will be the number of miles covered in 1 hour and v will be the velocity in miles per hour. The statement that the velocity of the motion is constant simply means that the difference quotient

$$\frac{f(t_1) - f(t)}{t_1 - t} = \frac{x_1 - x}{t_1 - t} \tag{3}$$

is the same for all values of t and t₁. But when the motion is not uniform, as in the case of a freely falling body whose velocity increases as it falls, then the quotient (3) does not give the velocity at the instant t, but merely the average velocity during the time interval from t to t₁. To obtain the velocity at the exact instant t we must take the limit of the average velocity as t₁ approaches t. Thus we define with Newton

$$\text{velocity at the instant } t = \lim \frac{f(t_1) - f(t)}{t_1 - t} = f'(t) \quad \text{as } t_1 \to t. \tag{4}$$

In other words, the velocity is the derivative of the distance coördinate with respect to the time, or the “instantaneous rate of change” of the distance with respect to the time (as distinguished from the average rate of change given by (3)).

The rate of change of the velocity itself is called the acceleration. It is simply the derivative of the derivative, usually denoted by f″(t), and called the second derivative of f(t).

It was observed by Galileo that for a freely falling body the vertical distance x through which the body falls during the time t is given by the formula

$$x = f(t) = \tfrac{1}{2} g t^2, \tag{5}$$

where g is the gravitational constant. It follows by differentiating (5) that the velocity v of the body at the time t is given by

(6) v = f′(t) = gt,

and the acceleration α by

α = f″(t) = g,

which is constant.

Suppose it is required to find the velocity of the body 2 seconds after it has been released. The average velocity during the time interval from t = 2 to t = 2.1 is

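$$\frac{\tfrac{1}{2} \cdot 32 \cdot (2.1)^2 - \tfrac{1}{2} \cdot 32 \cdot 2^2}{2.1 - 2} = \frac{16(4.41) - 16(4)}{0.1} = 65.6 \quad \text{(feet per second)}.$$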

But substituting t = 2 in (6) we find the instantaneous velocity at the end of two seconds to be v = 64.
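
The contrast between average and instantaneous velocity can be tabulated. The sketch below is an illustration only: the names `fall_distance` and `average_velocity` and the sample end times are assumptions, and g = 32 is the value in feet and seconds used above. It computes the average velocity of the falling body over intervals that begin at t = 2.

```python
G = 32.0  # acceleration due to gravity in feet per second squared


def fall_distance(t):
    """Distance fallen after t seconds, x = (1/2) g t^2."""
    return 0.5 * G * t * t


def average_velocity(t, t1):
    """Average velocity over the interval from t to t1 (t1 != t)."""
    return (fall_distance(t1) - fall_distance(t)) / (t1 - t)


if __name__ == "__main__":
    for t1 in (3.0, 2.5, 2.1):
        print(f"from 2 to {t1}: {average_velocity(2.0, t1):.2f} feet per second")
```

As the interval shrinks, the averages decrease toward the instantaneous value 64 obtained from (6).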

Exercise: What is the average velocity of the body during the time interval from t = 2 to t = 2.01? from t = 2 to t = 2.001?

For motion in the plane the two derivatives f′(t) and g′(t) of the functions x = f(t) and y = g(t) define the components of the velocity. For motion along a fixed curve the velocity will be defined by the derivative of the function s = f(t), where s is the arc length.

7. Geometrical Meaning of the Second Derivative

The second derivative is also important in analysis and geometry, for f″(x), expressing the rate of change of the slope f′(x) of the curve y = f(x), gives an indication of the way the curve is bent. If f″(x) is positive in an interval then the rate of change of f′(x) is positive. A positive rate of change of a function means that the values of the function increase as x increases. Therefore f″(x) > 0 means that the slope f′(x) increases as x increases, so that the curve becomes steeper where it has a positive slope and less steep where it has a negative slope. We say that the curve is concave upward (Fig. 270).

Fig. 270.

Fig. 271.

Similarly, if f″(x) < 0, the curve y = f(x) is concave downward (Fig. 271).

The parabola y = f(x) = x² is concave upward everywhere because f″(x) = 2 is always positive. The curve y = f(x) = x³ is concave upward for x > 0 and concave downward for x < 0 (Fig. 153) because f″(x) = 6x, as the reader can easily prove. Incidentally, for x = 0 we have f′(x) = 3x² = 0 (but no maximum or minimum!); also f″(x) = 0 for x = 0. This point is called a point of inflection. At such a point the tangent, in this case the x-axis, crosses the curve.

If s denotes the arc-length along the curve, and α the slope-angle, then α = h(s) will be a function of s. As we travel along the curve α = h(s) will change. The rate of change h′(s) is called the curvature of the curve at the point where the arc length is s. We mention without proof that the curvature κ can be expressed in terms of the first and second derivatives of the function y = f(x) defining the curve:

$$\kappa = \frac{f''(x)}{\left(1 + (f'(x))^2\right)^{3/2}}.$$
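
As a small illustration of the curvature formula, the following Python sketch evaluates κ for the parabola y = x², whose derivatives f′(x) = 2x and f″(x) = 2 were found earlier. The function name `curvature` and the sample points are assumptions made for this sketch; the derivatives are supplied by hand rather than computed.

```python
def curvature(first_derivative, second_derivative, x):
    """kappa = f''(x) / (1 + f'(x)^2)^(3/2) at the point x."""
    slope = first_derivative(x)
    return second_derivative(x) / (1.0 + slope * slope) ** 1.5


def parabola_slope(x):
    # f'(x) = 2x for f(x) = x^2
    return 2.0 * x


def parabola_second(x):
    # f''(x) = 2 for f(x) = x^2
    return 2.0


if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0, 2.0):
        print(x, curvature(parabola_slope, parabola_second, x))
```

At the vertex x = 0 the slope vanishes and the formula gives κ = 2; farther out the denominator grows and the curvature falls off, in keeping with the parabola flattening into nearly straight arms.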

8. Maxima and Minima

We can find the maxima and minima of a given function f(x) by first forming f′(x), then finding the values for which this derivative vanishes, and finally investigating which of these values furnish maxima and which minima. The latter question can be decided if we form the second derivative, f″(x), whose sign indicates the convex or concave shape of the graph and whose vanishing usually indicates a point of inflection at which no extremum occurs. By observing the signs of f′(x) and f″(x) we can not only determine the extrema but also find the shape of the graph of the function. This method gives us the values of x for which extrema occur; to find the corresponding values of y = f(x) itself we have to substitute these values of x in f(x).

As an example we consider the polynomial

f(x) = 2x³ – 9x² + 12x + 1,

and obtain

f′(x) = 6x² – 18x + 12, f″(x) = 12x – 18.

The roots of the quadratic equation f′(x) = 0 are x₁ = 1, x₂ = 2, and we have f″(x₁) = –6 < 0, f″(x₂) = 6 > 0. Hence f(x) has a maximum, f(x₁) = 6, and a minimum, f(x₂) = 5.
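
The procedure just carried out (solve f′(x) = 0, then inspect the sign of f″ at each root) can be sketched in Python for this particular polynomial. The function names are assumptions made for illustration, and the roots are found with the ordinary quadratic formula, which works here only because f′ happens to be quadratic.

```python
import math


def f(x):
    return 2 * x**3 - 9 * x**2 + 12 * x + 1


def f_second(x):
    # Second derivative of f: 12x - 18.
    return 12 * x - 18


def critical_points():
    """Roots of f'(x) = 6x^2 - 18x + 12 = 0 via the quadratic formula."""
    a, b, c = 6.0, -18.0, 12.0
    root = math.sqrt(b * b - 4 * a * c)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])


def classify(x):
    """Decide the type of a critical point from the sign of f''(x)."""
    s = f_second(x)
    if s < 0:
        return "maximum"
    if s > 0:
        return "minimum"
    return "undecided"  # f''(x) = 0: the second derivative gives no verdict


if __name__ == "__main__":
    for x in critical_points():
        print(x, classify(x), f(x))
```

The "undecided" branch reflects the remark above that a vanishing second derivative usually, but not always, signals a point of inflection.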

Exercises: 1) Sketch the graph of the function considered above.

2) Discuss and sketch the graph of f(x) = (x² – 1)(x² – 4).

3) Find the minimum of x + 1/x, of x + a²/x, of px + q/x, where p and q are positive. Have these functions maxima?

4) Find the maxima and minima of sin x and sin (x²).