Methods of Mathematics Applied to Calculus, Probability, and Statistics (1985)
Part II. THE CALCULUS OF ALGEBRAIC FUNCTIONS
Chapter 9. Nongeometric Applications
9.1 SCALING GEOMETRY
In geometric applications the units on the two axes are the same; both will be lengths, or some other common unit. In such a situation a rotation of the coordinate system makes sense. Slope and angles have meaning, as has the distance between two points.
But in many situations the units on the coordinate axes are quite different. Thus you often see the number of dollars plotted against the year, distance versus time, height versus weight, and so on. There is no common unit between dollars and years; you merely pick convenient units along each axis. Furthermore, rotation, Section 6.9, which would involve adding dollars and years, makes no sense (the beginning algebra teacher said, “You can’t add apples to oranges”). Nor do the words “slope,” “angle,” and “distance between two points” mean anything. Effectively, when you have no common unit, you are free to choose the sizes of the units as you please. Only those things can have any real meaning that are invariant (unchanging) under scale transformations of the form (to avoid confusion with derivatives, we are subscripting the new variables rather than using primes).
$$x = k_1 x_1, \qquad y = k_2 y_1 \tag{9.1-1}$$
These equations transform the old (x, y) coordinates to the new coordinates (x1, y1). The particular transformation depends on the constants k1 and k2 and is equivalent to stretching the two axes independently (but each uniformly). Viewed differently, Equations (9.1-1) represent a change in the units of measurement; perhaps the change from x to x1 is from meters to feet.
When the k1 and k2 are greater than 1 in size, then it is a contraction, but we will always use the word stretching. The ki cannot be 0, of course, and we generally do not use negative values (which would correspond to both a stretch and a reflection).
The geometry of this situation is a special case of what is called affine geometry. Only things that are invariant under the above stretching transformation (9.1-1) are of interest in this scaling geometry. This is the geometry appropriate for most of the graphs you see. When k1 = k2, the transformation is merely an enlargement and is of slight interest.
The most general affine transformation allows translations and rotations of the coordinate system; thus it is of the form
$$x = a + b x_1 + c y_1, \qquad y = d + e x_1 + f y_1$$
If rotations are ruled out (and the axes are kept perpendicular), then c and e are both 0 in the above formulas for the general affine transformation. Often it is convenient to shift the origin of time to start at the first record, but it can be deceptive to subtract a fixed amount of dollars. We will restrict the class of transformations to those that only stretch the coordinate axes.
This will be called a scaling transformation.
Speaking in terms of algebra, rather than geometry, only those expressions are acceptable that transform reasonably when the scaling transformation (9.1-1) is applied. A formula must “scale” properly to be a valid formula. This is very useful to remember and, indeed, is the basis, in a sense, of “dimensional analysis.” Similar variables, such as lengths, will have the same scale factor ki while other variables like mass and time will have their corresponding scale factors. The translation we studied in Section 5.9 is appropriate to a geometry that uses all the same units, but it is rarely applicable to the stretch invariant situation (time is a conspicuous exception since often there is no natural origin of time).
Dimensional analysis is an example of scaling transformations. In dimensional analysis you have one or more variables (and constants) that use the same ki when you change the units of measurement. Given any equation in the variables, it must be that each additive term scales the same (there are at least two terms since it is an equation). If we represent symbolically the length by L, use V for velocity, and T for time, then, looking at the units, we see that symbolically
You can have terms in these units in the form
since velocity is the distance divided by the time. In much of elementary physics the units are mass M, length L, and time T.
Dimensional analysis is a very useful tool that is widely used by experts. But it has simple applications even in geometry. If you have an area, it must depend on the square of the length, and the volume must depend on the cube of the length. For example, for the circle,
circumference C = 2πr and area A = πr²
For a sphere,
$$S = 4\pi r^2, \qquad V = \tfrac{4}{3}\pi r^3$$
It is true that in finding the volume of a box of unit height the answer will appear to depend only on the square of a length, since one length has unit size. This is one of the reasons why it is often wise to do the general case (so you can get a check on the units).
In problems that involve several variables with different units for scaling, dimensional analysis provides even more power in checking that the equations scale properly; each unit must scale properly.
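The bookkeeping behind such a check can be sketched in a few lines. This is only an illustration: dimensions are stored as exponent tuples (length, time, mass), and the names `mul`, `div`, `power`, and `same_dimension` are hypothetical helpers, not anything defined in the text.

```python
# Dimensions as exponent tuples (length, time, mass); all names are illustrative.
LENGTH = (1, 0, 0)
TIME = (0, 1, 0)
DIMENSIONLESS = (0, 0, 0)

def mul(a, b):
    """Dimension of a product: exponents add."""
    return tuple(p + q for p, q in zip(a, b))

def div(a, b):
    """Dimension of a quotient: exponents subtract."""
    return tuple(p - q for p, q in zip(a, b))

def power(a, n):
    """Dimension of a quantity raised to the n-th power."""
    return tuple(p * n for p in a)

def same_dimension(*terms):
    """True when every additive term carries the same dimension."""
    return all(t == terms[0] for t in terms)

VELOCITY = div(LENGTH, TIME)  # velocity is distance divided by time

# distance = velocity * time: both sides are lengths, so the equation scales.
ok_distance = same_dimension(LENGTH, mul(VELOCITY, TIME))
# The area of a circle, pi * r**2, is a length squared (pi is dimensionless).
area = mul(DIMENSIONLESS, power(LENGTH, 2))
# Adding a length to a velocity does not scale properly.
bad_sum = same_dimension(LENGTH, VELOCITY)

print(ok_distance, area, bad_sum)
```

Each additive term of a proposed formula is reduced to its exponent tuple; the formula can scale properly only if all the tuples agree.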
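Example 9.1-1
Apply a scaling transformation to dy/dx. We have
$$\frac{dy}{dx} = \frac{d(k_2 y_1)}{d(k_1 x_1)} = \frac{k_2}{k_1}\,\frac{dy_1}{dx_1}$$
where as usual we pass over any constants in the differentiation process.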
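The way a derivative picks up the factor k2/k1 under the scaling x = k1 x1, y = k2 y1 can be checked numerically. The following is a minimal sketch; the curve y = x³, the stretch factors, and the helper name `derivative` are illustrative assumptions, not taken from the text.

```python
def derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Illustrative stretch factors (assumptions, not values from the text).
k1, k2 = 2.0, 5.0

def y_old(x):
    """A sample curve in the old coordinates."""
    return x ** 3

def y_new(x1):
    """The same curve in the new coordinates: y1 = y / k2 at x = k1 * x1."""
    return y_old(k1 * x1) / k2

x1 = 0.7
x = k1 * x1
slope_old = derivative(y_old, x)    # dy/dx
slope_new = derivative(y_new, x1)   # dy1/dx1
ratio = slope_old / slope_new       # should be close to k2 / k1

print(slope_old, slope_new, ratio)
```

The two slopes differ in size, but their ratio is the constant k2/k1, and with positive stretch factors their signs agree.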
1. If x = 3x1 and y = 0.5y1, find dy1/dx1 and d²y1/dx1² in terms of dy/dx and d²y/dx².
2. Find the first derivative in terms of the general affine transformation.
3. Show that one affine transformation followed by another is equivalent to a single affine transformation. Show that for any affine transformation there is one that undoes it (except in degenerate cases), in short that there is generally an inverse transformation.
4. For the total surface of a cone, show that each term scales properly.
5. Show that the derivative with respect to r of the volume of a sphere is the surface, and the derivative of the area of a circle is the circumference. Apply to a cube centered at the origin. Show that the rule that the surface is the derivative of the volume is not generally applicable.
9.2 EQUIVALENT IDEAS
What becomes of the ideas of familiar Euclidean geometry in this scaling geometry? Since slope no longer has an absolute meaning, the interpretation of the derivative must become the rate of change of y with respect to x.
Actually, we should say “the limiting rate of change,” but the shorter version is used for convenience. The interpretation of the derivative as a rate of change is an extension of the original interpretation as a slope.
The tangent line is still tangent to the curve after a scaling transformation and is still found the same way as before, but the angle between two lines has no fixed value since a stretch of the coordinates will alter it (unless k1 = k2).
The numerical value of the slope is no longer the same, but the sign remains (provided we use only positive stretch factors ki). Interestingly enough, maxima and minima remain unchanged since the stretch cannot affect the slope of the horizontal line (nor a vertical line). You “see” the truth of these statements, but let us look at the mathematical details for a straight line to see how we might prove them.
If we had the general straight line
Ax + By + C = 0
then after the transformation we would have
Ak1x1 + Bk2y1 + C = 0
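whose slope is
$$\frac{dy_1}{dx_1} = -\frac{A k_1}{B k_2} = \left(-\frac{A}{B}\right)\frac{k_1}{k_2}$$
which is the slope of the original line (−A/B) with the multiplier k1/k2,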
and we see that the above statements are true. This is, of course, the same as we found in Example 9.1-1.
In general, the tangent circle will change its shape under a scaling transformation (circles will go into ellipses), but the sign of the second derivative will not change. Thus curvature no longer has meaning, and we must go back to the use of the second derivative and recall the soup-bowl rule: “concave up” for y″ > 0 and “concave down” for y″ < 0. The test for maximum or minimum remains the same. To see all these things intuitively, merely picture a curve on a sheet of rubber and imagine stretching it uniformly in the x direction. The general features of a curve, or curves, will remain, but the specific values of slope and curvature will alter. Remember (see Figure 9.2-1), a curve that is concave up is convex down, and conversely.
1. Prove that a scaling transformation takes a circle into an ellipse.
2. Show that if one line has positive slope and one has negative slope, then there is a scaling transformation that makes the lines perpendicular. But if both slopes have the same sign, then there is a maximum angle that can be achieved. Find it.
3. Find the expression for the new y″ after a scaling transformation in terms of the old y″.
4.* Apply scaling transformations to the analysis of the general conic and show that you need only one circle, one equilateral hyperbola, one size of parabola, and straight lines.
A very common situation in this geometry is to be given the position s (space) as a function of time t. Thus you are given the formula
s = f(t)
For example, you may be given the height in meters of a vertically thrown projectile (rock) as
height = y = 20 + 49t − 4.9t²
where t is measured in seconds and y in meters. In this case it is convenient to use y in place of s since it is the usual vertical coordinate. At time t = 0, it was clearly already 20 meters high; the motion of the projectile began on top of some building. And of course the given formula is to apply only until the projectile hits the ground, y = 0. The curve (with time as the independent variable) is a parabola opening downward (see Figure 9.3-1).
Figure 9.3-1 Projectile height versus time
The derivative is
$$\frac{dy}{dt} = 49 - 9.8t$$
and is the rate of change of position with respect to time. This is normally called velocity. You are going fast when your position is rapidly changing and slow when it changes slowly. Clearly, at time t = 0, the start, the instantaneous velocity upward was 49 meters per second and is decreasing as time goes on. Negative velocities mean that the projectile is going downward. The maximum height occurs when the velocity is 0, that is, at time (in seconds)
$$t = \frac{49}{9.8} = 5$$
The position corresponding to this time of maximum height is
$$y = 20 + 49(5) - 4.9(5)^2 = 142.5 \text{ meters}$$
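These numbers are easy to confirm by direct computation. The sketch below uses the height formula given above and its derivative, 49 − 9.8t; the function names and the extra calculation of when the projectile reaches the ground are illustrative additions.

```python
import math

def height(t):
    """Height in meters at time t seconds (formula from the text)."""
    return 20 + 49 * t - 4.9 * t ** 2

def velocity(t):
    """Derivative of the height: meters per second."""
    return 49 - 9.8 * t

# Maximum height: the velocity is zero.
t_top = 49 / 9.8
y_top = height(t_top)

# Ground impact: positive root of 4.9 t^2 - 49 t - 20 = 0.
t_ground = (49 + math.sqrt(49 ** 2 + 4 * 4.9 * 20)) / (2 * 4.9)

print(t_top, y_top, t_ground, velocity(t_ground))
```

The velocity at `t_ground` comes out negative, in agreement with the remark that negative velocities mean the projectile is going downward.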
How shall we interpret the derivative in physical situations? It is distance per unit of time, which is velocity; of course, the derivative is the limiting value of the ratio, it is the instantaneous velocity. If we have two nearby values of position, then their difference divided by the time interval is the average velocity during the time interval, and in the limit we speak of the instantaneous velocity of the projectile. We do not need to get into the argument of how this would be measured in practice; it is sufficient to note that in the physical interpretation, as the interval gets shorter and shorter, we would get more and more accurate estimates of the instantaneous velocity, except for the simple fact that, however accurately we were measuring the position and time, we would ultimately find that we were getting such small differences as to remove all sense of accuracy. The instantaneous velocity of the mathematically defined projectile along a path approximates physical reality in some senses, but not in all. The instantaneous velocity is a mathematical concept that tends to become a physical concept in most people’s minds.
There is always a gap between the mathematics and reality. Most of us believe that the world is made out of molecules, and when you try to make very, very accurate measurements, the random movement of the molecules will defeat your attempts at ultimate precision. In the modern theory of quantum mechanics, it is widely believed that you cannot, even in principle, precisely measure both the position and momentum (velocity times mass) of a particle at the same time; thus in this interpretation of quantum mechanics it is impossible, even theoretically, to get arbitrarily accurate measurements at the same time on certain properties of a particle. In practice, from the physical world we abstract a mathematical idealization of what is going on, and then we operate on the mathematical model. Finally, we try to interpret the mathematical results back into physical reality (see Figure 9.3-2). Surprisingly often we get useful results, but now and then we get nonsense. You need to develop your intuition about the reality of the mathematical models you see.
Figure 9.3-2 Mathematical modeling
What about the physical interpretation of higher derivatives? The rate of change of velocity is normally called acceleration. In the above example we find that the second derivative
$$\frac{d^2y}{dt^2} = -9.80$$
meaning a downward acceleration of 9.80 meters per second per second. Each second the velocity changes by 9.80 meters per second. The constant in this example is merely the constant of gravity, which is conveniently taken as 980 centimeters per second per second, or in shorter form
−980 cm/s²
The minus sign means downward in the chosen coordinate system where the positive y was chosen to be upward.
Newton’s second law of motion asserts that accelerations are caused by forces; indeed the law says that
force = (mass)(acceleration)
(when the mass is a constant). You feel the force due to the acceleration (or deceleration) whenever you speed up (or slow down) in an automobile. Thus, we need to think a bit about forces.
In the late Middle Ages, scientists (called natural philosophers in those days) found that two forces could be combined into one equivalent force. When the forces are in exactly opposite directions, it is easy to see that the resultant force is the difference between the two oppositely directed forces. Conversely, any force can be regarded as the difference between two forces in many different ways (just as any integer can be regarded as the difference between two positive integers in many ways).
More importantly, they found the parallelogram of forces (see Figure 9.4-1), which shows that the two forces along the two edges (represented by lines with arrowheads) are equivalent to a single force along the diagonal of the parallelogram.
Figure 9.4-1 Parallelogram of forces
The converse, that given any force you can break it up into two forces in almost any pair of directions you choose, is a bit more startling.
Locally, there is a constant acceleration of gravity pulling things downward; this acceleration is commonly represented by the letter g = 9.80 m/s², although for convenience 980 cm/s² is often used in place of 9.80 m/s². Actually, the value differs from place to place on the surface of the earth, as well as depending on the height above the surface. This acceleration occurred in the above example of a falling body.
If we imagine firing a cannon horizontally from the top of a mountain (Figure 9.4-2), then we have the initial velocity v0 in the horizontal direction. Neglecting air resistance (and the rotation and curvature of the earth), the law of inertia says that in the absence of other forces an object will continue with this velocity indefinitely.
But the force on the mass due to gravity produces a constant acceleration in the downward direction. This gives an increasing downward velocity, which continues until the projectile hits the ground. The x position, starting at x = 0, is given by the formula
x = v0t
while the y position is (h0 is the height of the mountain and −g is the acceleration)
$$y = h_0 - \tfrac{1}{2} g t^2$$
If, instead of firing horizontally, we fired at some angle θ, then we could decompose the initial velocity into two components, one horizontal and the other vertical (see Figure 9.4-3). The horizontal component, v0 cos θ, replaces v0 in the x motion, and the vertical component, v0 sin θ, must be added to the y direction motion. Therefore,
$$x = (v_0\cos\theta)\,t, \qquad y = h_0 + (v_0\sin\theta)\,t - \tfrac{1}{2} g t^2 \tag{9.4-2}$$
are the equations describing the motion in time. These are called parametric equations, where t is the parameter. For each t there is a pair of numbers x and y. You can plot the x and y numbers as a point in two dimensions and mark small ticks along the trajectory to indicate where the projectile is at a given time. It is clearly a parabola, as you can see if you eliminate the parameter t between the two equations.
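The claim that the trajectory is a parabola can be tested numerically. The sketch below assumes the usual parametric form x = (v0 cos θ)t, y = h0 + (v0 sin θ)t − gt²/2, and compares it with the curve obtained by eliminating t, namely y = h0 + x tan θ − g x²/(2 v0² cos² θ). The values of v0, θ, and h0 are illustrative, not from the text.

```python
import math

g = 9.80                       # m/s^2, the value used in the text
v0 = 50.0                      # illustrative muzzle velocity, m/s
theta = math.radians(30)       # illustrative firing angle
h0 = 100.0                     # illustrative height of the mountain, m

def position(t):
    """Parametric position (x, y) at time t."""
    x = v0 * math.cos(theta) * t
    y = h0 + v0 * math.sin(theta) * t - 0.5 * g * t ** 2
    return x, y

def parabola(x):
    """Height as a function of x after eliminating the parameter t."""
    vx = v0 * math.cos(theta)
    return h0 + x * math.tan(theta) - g * x ** 2 / (2 * vx ** 2)

# Tick marks every half second along the trajectory.
times = [0.5 * i for i in range(21)]
max_gap = max(abs(position(t)[1] - parabola(position(t)[0])) for t in times)

print(max_gap)
```

Every tick mark falls on the parabola to within rounding error, which is what eliminating t between the two equations predicts.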
This general field is called exterior ballistics (exterior as contrasted with interior ballistics, which is what happens inside the gun barrel; the word ballistics comes from an old weapon, the ballista). Of course, in practice, you must take into consideration the effects of air resistance, which we have been neglecting as it is a messy topic. When we send vehicles into outer space, we have similar problems except that once launched and out of the earth’s atmosphere there is very little air resistance; the world is almost round, so the gravitational force is directed toward the center of the earth (assuming that the earth is homogeneous, which for delicate work it is not); and there are rotational effects due to the rotation of the earth, both at takeoff and on landing. Evidently, simple mathematics will not suffice, but the general ideas will; the problems are much more complex in their details, but the mathematics to solve problems in mechanics is still the same calculus. The simulation of a space flight is a dramatic application of these same mathematical ideas.
We are forced to take simple problems as examples in a first course, but do not be deceived by them; the methods have wide applicability.
1. Find the velocity and acceleration of the motion s = t² − t.
2. Using (9.4-2) with h0 = 0, find the angle of maximum range.
3. Find the maximum velocity of the motion s = 12 − t⁴, t ≥ 0.
4. Find the maximum acceleration for the motion s = 20t² − t⁴.
9.5 SIMPLE RATE PROBLEMS
Consider the following idealized problem. The water is running at a constant rate out of a conical tank whose shape is given in Figure 9.5-1. How fast is the surface falling when the depth is 20 units (think of meters or feet if you wish)?
Figure 9.5-1 Conical tank
This is a word problem; let us analyze how to approach such problems. First, what is the meaning of the constant rate? It means that the volume is decreasing at a constant rate; if V = V(t) is the volume at any instant t, then letting the rate be R we have
$$\frac{dV}{dt} = R$$
Next we are asked for the rate at which the surface falls. If h is the height, then we are asked to find
$$\frac{dh}{dt}$$
at a certain time (when h = 20).
Can we find a relationship between V and h so that after differentiating it implicitly with respect to time, t, we would have some connection with the two derivatives dV/dt and dh/dt? We need to draw a figure (Figure 9.5-2) as soon as we have any idea at all of what is going on—always draw figures when you can!
The volume of the cone of water is
$$V = \tfrac{1}{3}\pi r^2 h$$
where r is the radius of the surface of the liquid and h is the height. The new letter r is a nuisance to say the least. Can we get rid of it? Yes! By looking at the figure we see the similar triangles, and we write the proportion as
Figure 9.5-2 Schematic figure
Thus we can use r = h/5 and eliminate the r. We have, therefore,
$$V = \frac{1}{3}\pi\left(\frac{h}{5}\right)^2 h = \frac{\pi h^3}{75}$$
Does this seem to be right? Does the volume depend on the cube of the depth of the liquid? It certainly has the right “dimension” for a volume. If it seems right, then go ahead; and if not, then review things until you are satisfied that the equation is correct.
We are ready to differentiate with respect to time, which will generate the derivatives that were given and asked for. We have
$$\frac{dV}{dt} = \frac{\pi h^2}{25}\,\frac{dh}{dt}$$
Solve for the requested quantity,
$$\frac{dh}{dt} = \frac{25}{\pi h^2}\,\frac{dV}{dt}$$
Now we put in the given quantities, dV/dt = R and h = 20, to get
$$\frac{dh}{dt} = \frac{25R}{400\pi} = \frac{R}{16\pi}$$
Remember that R is a negative number if the fluid is running out (it would be positive if the tank were being filled up).
There is indeed some method in how to solve word problems! You look at what you are given and what you are asked to find. You then give them names, and try to find relationships between the quantities. If there are rates, then you take derivatives of the general equation with respect to time. After you take the derivatives, you assign the fixed data. Whenever possible, you draw a picture, even if it is only symbolic.
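The method can be checked on the conical tank with a short numerical sketch. It uses only r = h/5 and the cone volume V = πr²h/3 from the text; the outflow rate R = −2 and the helper names are illustrative assumptions. The derivative dV/dh is estimated by differences rather than by formula, so the check is independent of the algebra.

```python
import math

def volume(h):
    """Volume of water of depth h in the conical tank, using r = h/5."""
    r = h / 5
    return math.pi * r ** 2 * h / 3

def dV_dh(h, step=1e-5):
    """Central-difference estimate of dV/dh."""
    return (volume(h + step) - volume(h - step)) / (2 * step)

R = -2.0      # illustrative rate dV/dt (negative: the tank is emptying)
h = 20.0      # depth at which the rate is wanted

# Chain rule: dV/dt = (dV/dh)(dh/dt), so dh/dt = R / (dV/dh).
dh_dt = R / dV_dh(h)
closed_form = R / (16 * math.pi)

print(dh_dt, closed_form)
```

With R negative the computed dh/dt is negative, meaning the surface is falling, and it agrees with R/(16π).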
Example 9.5-2
The volume of a spherical soap bubble is increasing at a rate of 3 cm³/s. When the radius is 2 cm, how fast is the surface increasing?
We begin with the two formulas
$$V = \tfrac{4}{3}\pi r^3, \qquad S = 4\pi r^2$$
We have given dV/dt = +3 for all r, and we want dS/dt at r = 2. We could eliminate r from between the two equations, thus getting S in terms of V. There would be a lot of fractional powers to handle. An alternative is to break the problem up into two problems; first, using the volume rate, find the rate dr/dt, and then use dr/dt in the derivative of the surface equation to find dS/dt. This looks easier, so we proceed on that approach.
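$$\frac{dV}{dt} = 4\pi r^2\,\frac{dr}{dt} = 3, \qquad \frac{dr}{dt} = \frac{3}{4\pi r^2}$$
$$\frac{dS}{dt} = 8\pi r\,\frac{dr}{dt} = \frac{6}{r}$$
At r = 2 this gives dS/dt = 3, and we have the rate 3 cm²/s.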
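The two-step approach translates directly into a computation. The sketch below first finds dr/dt from the volume rate and then dS/dt from the surface formula, and as an independent check lets the bubble grow for a very short time and measures the change in surface. The function names and the time step are illustrative.

```python
import math

def sphere_volume(r):
    return 4 * math.pi * r ** 3 / 3

def sphere_surface(r):
    return 4 * math.pi * r ** 2

dV_dt = 3.0   # cm^3/s, given
r = 2.0       # cm, radius at the moment of interest

# Step 1: dV/dt = 4 pi r^2 dr/dt gives dr/dt.
dr_dt = dV_dt / (4 * math.pi * r ** 2)
# Step 2: dS/dt = 8 pi r dr/dt.
dS_dt = 8 * math.pi * r * dr_dt

# Independent check: grow the volume for a short time dt and re-measure.
dt = 1e-6
V1 = sphere_volume(r) + dV_dt * dt
r1 = (3 * V1 / (4 * math.pi)) ** (1 / 3)
dS_dt_numeric = (sphere_surface(r1) - sphere_surface(r)) / dt

print(dS_dt, dS_dt_numeric)
```

Both routes give a rate close to 3 cm²/s at r = 2.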
Sand is poured on top of a sand pile at a rate R units per second. At one moment we observe the ratio k = height/diameter of the pile (which measures the shape and depends on the “angle of repose” of the sand). How fast is the radius changing when the height is h1? We are deliberately being abstract to give you practice in handling general symbols. If it causes you trouble, you can assign particular numbers, work the problem, and then go over the solution and substitute the abstract values as needed.
First draw Figure 9.5-3. We have that the volume of the cone of sand is
$$V = \tfrac{1}{3}\pi r^2 h$$
But the diameter d and the height h are connected by
$$h = kd = 2kr$$
so the volume is
This has the right dimension, so we go ahead. We differentiate to get the needed derivatives:
We must now get to the units that the question asked for. We have r = h/2k. Therefore,
In this equation we set h = h1 and we have the answer.
Figure 9.5-3 Sand pile
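As suggested above, the abstract sand pile problem can be checked by assigning particular numbers. The sketch below assumes the pile is a cone with V = πr²h/3 and h = kd = 2kr, so that V = (2/3)πk r³; the values R = 0.5, k = 0.3, and h1 = 4 are illustrative only. It compares a difference estimate of dr/dt with the closed form 2kR/(πh1²) that follows from these assumptions.

```python
import math

def radius_from_volume(V, k):
    """Invert V = (2/3) pi k r^3 (cone with h = 2 k r) for the radius r."""
    return (3 * V / (2 * math.pi * k)) ** (1 / 3)

R = 0.5     # illustrative pouring rate dV/dt
k = 0.3     # illustrative shape ratio height/diameter
h1 = 4.0    # illustrative height at the moment of interest

r1 = h1 / (2 * k)                      # radius when the height is h1
V1 = 2 * math.pi * k * r1 ** 3 / 3     # volume at that moment

# Chain rule: dr/dt = (dr/dV)(dV/dt), with dr/dV estimated by differences.
dV = 1e-4
dr_dV = (radius_from_volume(V1 + dV, k) - radius_from_volume(V1 - dV, k)) / (2 * dV)
dr_dt = R * dr_dV

closed_form = 2 * k * R / (math.pi * h1 ** 2)

print(dr_dt, closed_form)
```

Once the numbers agree, the particular values can be replaced by the general symbols again.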
Fermat’s Principle. Fermat’s principle in physics asserts that light travels from point A to point B in the minimum time (actually the extremal time). If the velocity of light in the first medium, which contains the point A, is c1, and the velocity in the second medium, which contains the point B, is c2, find the path from A to B. We first draw Figure 9.5-4. Let x be the point where the light ray crosses from the first to the second medium. We set up the formula for the total time:
To find the extreme time, we differentiate with respect to the variable x and note that the conditions at the extreme give
In terms of the angles, this is
Is this a minimum? At x = 0 the derivative dT/dx is negative, while at x = d it is positive. Yes, we have the minimum. Thus it is a simple minimum problem, but cast in the rate form of velocity.
This formula (9.5-1) is known as Snell’s law and was found originally quite independently of Fermat’s principle. Thus you see one principle derived from another, a common thing in a well-developed field like physics.
Figure 9.5-4 Fermat’s principle
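Fermat's principle can also be explored numerically. The sketch below is built on assumed geometry, since the labels of Figure 9.5-4 are not reproduced here: A lies a distance a above the interface, B lies a distance b below it, and their horizontal separation is d. The travel time is minimized directly by a ternary search (no derivative is used), and the crossing point found is then tested against Snell's law in the form sin θ1/c1 = sin θ2/c2, with the angles measured from the normal to the interface. All numerical values are illustrative.

```python
import math

# Hypothetical geometry and speeds (assumptions, not from the text).
a, b, d = 1.0, 2.0, 3.0
c1, c2 = 1.0, 0.75

def travel_time(x):
    """Total time from A to B when the ray crosses the interface at x."""
    return math.hypot(a, x) / c1 + math.hypot(b, d - x) / c2

# Ternary search for the minimum of the (convex) travel time on [0, d].
lo, hi = 0.0, d
for _ in range(60):
    m1 = lo + (hi - lo) / 3
    m2 = hi - (hi - lo) / 3
    if travel_time(m1) < travel_time(m2):
        hi = m2
    else:
        lo = m1
x_best = (lo + hi) / 2

# Sines of the angles the two ray segments make with the normal.
sin1 = x_best / math.hypot(a, x_best)
sin2 = (d - x_best) / math.hypot(b, d - x_best)
snell_gap = sin1 / c1 - sin2 / c2

print(x_best, snell_gap)
```

The gap is zero to within the accuracy of the search, so the minimum-time path found without calculus satisfies the same law that the derivative condition produces.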
1. Do Example 9.5-2 by the first method indicated.
2. How fast does the volume of a cylinder of height h and radius r change when the rate of change of the radius is 3? When the rate of change in the height is 3?
3. A 10-meter ladder is leaning against a wall. If the bottom is pushed in at a rate of 1/10 m/s, how fast does the top move when the bottom is 6 m from the wall?
4. How fast does the surface drop for a hemispherical tank of diameter 10 meters when the volume is decreasing at a rate of 5 m³/h? The formula for the volume of part of a sphere is V = (πh²/3)(3r − h).
5. Two stones are dropped in a well, one following the other. Show that the distance apart increases at a constant rate proportional to the time difference between when they were dropped.
6. Find the critical angle such that the ray of light does not emerge from the denser (slower velocity) medium.
7. If you have a mirror, then c1 = c2. Show from Fermat’s principle that the angle of incidence is the angle of reflection.
9.6 MORE RATE PROBLEMS
A boat moving 20 meters per minute along a path parallel to the end of a pier is being moored as shown in Figure 9.6-1. The path of the boat is 10 meters from the end of the pier. How fast is the rope coming in when the boat is 10 meters from the front of the pier?
Figure 9.6-1 Mooring boat
Draw the figure and assign letters to the various quantities. The length of the rope is the hypotenuse of the triangle, and once we have that we can differentiate and get the rate. So we set out to find the length of the rope as a function of time. One side of the right triangle is 10 m; the other is the distance x from the end of the pier. Calling the length of the rope L, we have, therefore,
$$L^2 = 10^2 + x^2$$
If t = 0 at the time the boat is opposite the end of the pier, then x = −20t and the rate is to be given at t = −1/2. Now we have a choice. We could differentiate this equation with respect to t as it is, or we could take square roots and differentiate. To avoid the square roots as long as possible, we work with the above form. We have, on differentiating and dividing out the factor 2,
$$L\,\frac{dL}{dt} = x\,\frac{dx}{dt}$$
We have dx/dt, which is the velocity of the boat (= −20), and we need L at the moment (t = −1/2) for which we are computing the rate of change of the rope length. It is given by
$$L^2 = 100 + 100 = 200, \qquad L = 10\sqrt{2}$$
Putting in the value x = 10 that holds at that moment (t = −1/2), and dividing by the L at that moment, we get
$$\frac{dL}{dt} = \frac{(10)(-20)}{10\sqrt{2}} = -10\sqrt{2} \approx -14.1 \text{ meters per minute}$$
The minus sign means that the distance is decreasing.
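A quick numerical check of the mooring problem, written as a sketch: the rope length is the hypotenuse of a right triangle with one side 10 m and the other x = −20t (time in minutes, t = 0 when the boat is opposite the end of the pier), and its rate of change at t = −1/2 is estimated by differences. The function name and step size are illustrative.

```python
import math

def rope_length(t):
    """Rope length in meters at time t minutes; x = -20 t is the distance along the path."""
    x = -20 * t
    return math.hypot(10, x)

t0 = -0.5      # the moment when x = 10
h = 1e-6
rate = (rope_length(t0 + h) - rope_length(t0 - h)) / (2 * h)

# Value from L dL/dt = x dx/dt with x = 10, dx/dt = -20, L = 10 sqrt(2).
exact = 10 * (-20) / (10 * math.sqrt(2))

print(rate, exact)
```

Both values are about −14.1 meters per minute.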
For a more complex problem consider a 6-foot-tall man strolling at a rate of 3 ft/s down a path that has a light pole 20 ft high set off from the path by a distance of 5 ft. When the man is 30 ft from the position exactly opposite the pole, how fast is the tip of his shadow moving?
We need pictures to get the problem straight in our mind. A top view of the situation is given in Figure 9.6-2. It shows the path, the man, and the light pole. It is natural to take the origin of the x distance along the path as the point opposite the light pole. Remember that dx/dt = 3 ft/s.
This picture does not show the heights, so we need another picture, one that shows the instantaneous heights, meaning a picture in the vertical plane (which is constantly changing as the man moves along the path) through the pole and the man. We draw Figure 9.6-2, where s is the distance of the man from the pole. Notice that this figure shows that the tip of the shadow moves along a line parallel to the path of the man. Label the total length to the end of the shadow as y (had we been asked how fast the shadow is lengthening, it would be a different matter). We want to know dy/dt.
From Figure 9.6-2, we can get ds/dt as follows. Since x = 3t, the right triangle in the top view gives
$$s^2 = 5^2 + x^2 = 25 + 9t^2$$
or, differentiating and dividing by the factor 2 again,
$$s\,\frac{ds}{dt} = 9t$$
What is the value of t? Oh yes, t is the instant when the man is 30 feet from the point opposite the pole. That must occur when t = 10 (we are starting time at the moment he is opposite the pole for convenience). What is s? By the right triangle, we have s² = 925. It is time to go to the other figure.
Somehow we must extract some more information connecting things together. From the similar triangles we pick, after a little thought, the ratio of similar parts as
Now we see a possible approach to the answer. Differentiate this equation with respect to time:
And we have only to substitute using the quantities we have found until we are done (we hope).
Is this the answer? What is y? What is asked for? The quantity y is the distance from the fixed light pole, and the rate of change is the velocity of the tip of the shadow. Yes, we have the answer this time.
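The shadow problem can be checked numerically as well. The sketch below assumes one way of writing the similar-triangle ratio, (y − s)/6 = y/20, which gives y = (10/7)s; the ratio the text actually picks may be arranged differently but must be equivalent. The constant names are illustrative.

```python
import math

POLE = 20.0     # height of the light, ft
MAN = 6.0       # height of the man, ft
OFFSET = 5.0    # distance of the pole from the path, ft
WALK = 3.0      # walking speed dx/dt, ft/s

def man_distance(t):
    """s: horizontal distance from the pole to the man at time t."""
    return math.hypot(WALK * t, OFFSET)

def tip_distance(t):
    """y: distance from the pole to the tip of the shadow, y = 20 s / (20 - 6)."""
    return POLE / (POLE - MAN) * man_distance(t)

t0 = 10.0       # the man is 30 ft past the point opposite the pole
h = 1e-6
dy_dt = (tip_distance(t0 + h) - tip_distance(t0 - h)) / (2 * h)

# Closed form: dy/dt = (10/7) ds/dt with ds/dt = x (dx/dt) / s = 90 / sqrt(925).
exact = (10 / 7) * 90 / math.sqrt(925)

# For comparison: speed of the tip along its straight track parallel to the path.
track_speed = POLE / (POLE - MAN) * WALK

print(dy_dt, exact, track_speed)
```

Here dy/dt, about 4.23 ft/s, is the rate at which the distance y from the pole to the tip is growing. Because the tip runs along a line parallel to the path, its speed along that line is the constant 30/7, about 4.29 ft/s; the two figures approach each other as the man gets far from the pole.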
Do not get confused with the details of the geometry and lose the simplicity of the calculus. The ability to use the calculus often involves the ability to draw complex situations and extract the essential features you need. This is a difficult art, especially since there is a strong tendency in elementary courses in mathematics to avoid “word problems” as being too hard. But you must master them because that is the form (or even less clearly stated) in which real problems come to you for solution.
If you have trouble with a particular word problem, then try a simpler one (have the man pass directly beneath the light in the above case); it might get you started. Once this is solved, then it is a matter of generalizing the solution to the original problem (the displacement of 5 feet from the light pole). If you cannot do the given problem, can you do one that has a part of it, and then elaborate the solution to cover the given problem? This is a very common attack. To sit and stare at a problem that you cannot understand is a waste of time. To imaginatively strip it down to a simpler problem that you think you can solve is a worthwhile use of your time and effort. It is the method that experts use when attacking a difficult new problem (Hilbert’s principle).
Two ships have paths that cross at right angles. The first ship passes the point where the paths cross at 2 P.M. and goes at a rate of 20 knots (nautical miles per hour). The second ship passes the point at 4 P.M. and goes at a rate of 15 knots. How fast are the ships separating when the time is 6 P.M.? See Figure 9.6-3 for the tracks of the ships.
Figure 9.6-3 Two ships
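We begin with the distance z between them, measuring time t in hours beginning at 4 P.M. At time t the first ship is 20(t + 2) = 40 + 20t nautical miles past the crossing point and the second is 15t. We have
$$z^2 = (40 + 20t)^2 + (15t)^2$$
for the distance between the two ships. Differentiation with respect to time gives, at time t,
$$2z\,\frac{dz}{dt} = 2(40 + 20t)(20) + 2(15t)(15)$$
and at t = 2 (6 P.M.) we get (using the original equation to get the needed z and dividing the rate equation by 2)
$$z = \sqrt{80^2 + 30^2} = \sqrt{7300}, \qquad \frac{dz}{dt} = \frac{(80)(20) + (30)(15)}{\sqrt{7300}} = \frac{2050}{\sqrt{7300}} \approx 24.0 \text{ knots}$$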
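The separation rate of the two ships can be confirmed with a short sketch. Time is measured in hours from 4 P.M., so the first ship (20 knots, at the crossing at 2 P.M.) is 20(t + 2) nautical miles along its track and the second (15 knots) is 15t along the perpendicular track. The function name and step size are illustrative.

```python
import math

def separation(t):
    """Distance in nautical miles between the ships, t hours after 4 P.M."""
    first = 20 * (t + 2)    # passed the crossing at 2 P.M.
    second = 15 * t         # passed the crossing at 4 P.M.
    return math.hypot(first, second)

t0 = 2.0    # 6 P.M.
h = 1e-6
rate = (separation(t0 + h) - separation(t0 - h)) / (2 * h)

# From z dz/dt = (80)(20) + (30)(15) with z = sqrt(7300).
exact = 2050 / math.sqrt(7300)

print(rate, exact)
```

Both come out close to 24 knots.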
Case Study 9.6-4
Van der Waals’ equation. It is worth looking at a practical problem that arises in physics. The equation of state of a perfect gas is
$$PV = RT$$
where P is the pressure of a perfect gas, V is the volume, T is the absolute temperature, and R is the gas constant. Van der Waals (1837–1923) introduced a modified equation for a more realistic model of a gas where the gas molecules are supposed to have a finite volume. The equation is
$$\left(P + \frac{a}{V^2}\right)(V - b) = RT$$
where a and b are suitable constants whose values depend on the particular gas being studied. The term involving a comes from the slight increase in the pressure due to the fact that the finite size of the molecules causes them to go slightly less far between collisions, and the term involving b from the slight decrease in the volume available for the motion between collisions. See Figure 9.6-4 for typical curves for different values of the temperature T. Where does the ripple in the lower curves vanish?
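This is a typical nongeometric problem, and therefore it is natural to consider scaling transformations of the variables. Set
$$P = c_1 p, \qquad V = c_2 v, \qquad T = c_3 t$$
Using these, you get the original van der Waals’ equation in the new (lowercase) variables p, v, and t:
$$\left(c_1 p + \frac{a}{c_2^2 v^2}\right)(c_2 v - b) = R c_3 t$$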
Figure 9.6-4 Van der Waals equation of state
The stretch factors ci (i = 1, 2, 3) can be chosen for convenience. An examination suggests first picking
$$c_2 = b$$
because this allows the b to be factored out of the left-hand side of the equation. Next pick c1 (so that it too will factor out). This requires
$$c_1 = \frac{a}{b^2}$$
Finally, pick
$$c_3 = \frac{a}{bR}$$
to remove the last constant. Thus the equation is reduced to the nice form
$$\left(p + \frac{1}{v^2}\right)(v - 1) = t$$
and you see that there is essentially only one standard van der Waals’ equation to be studied, not whole families depending on the particular gases. You also see that you are interested only in v ≥ 1. This scaling was early recognized and indeed was used to predict the critical temperature and pressure where the phenomenon of the vanishing of the ripple in the curve occurs. This point is called the critical point of the gas and plays an important role in the physics of the gas, especially for low temperatures. How can you find it on the scaled curve?
We begin by noting that for fixed t you are looking for maxima and minima, so you will need to compute the derivative of p with respect to v and set it equal to zero. First solve for p:
$$p = \frac{t}{v - 1} - \frac{1}{v^2}$$
Differentiate this and set the result equal to zero to find the equation for the maxima and minima.
$$\frac{dp}{dv} = -\frac{t}{(v - 1)^2} + \frac{2}{v^3} = 0$$
or, clearing out the fractions,
$$2(v - 1)^2 = t v^3$$
What you want to know is for which values of t this cubic in v has three real solutions and for which it has one real solution (see Figure 9.6-4). One of them is v < 1, and this you are not interested in. For v ≥ 1, the left side rises from the value 0 as a quadratic power, and the right-hand side starts at t and rises like a cubic, t being positive. Since for very large values of v the cubic must be larger than the quadratic, either there are two crossings or there are none (for v ≥ 1). The point you are looking for is the biggest t for which the equation has a solution, the maximum t. To find the maximum, differentiate the equation in t and set equal to zero. You have
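$$\frac{dt}{dv} = \frac{d}{dv}\left[\frac{2(v - 1)^2}{v^3}\right] = \frac{4(v - 1)v^3 - 6(v - 1)^2 v^2}{v^6} = \frac{2(v - 1)(3 - v)}{v^4} = 0$$
whose zeros are, after factoring the numerator,
$$v = 1 \quad \text{and} \quad v = 3$$
It is the v = 3 that you want, so you have
$$t = \frac{2(3 - 1)^2}{3^3} = \frac{8}{27}$$
as the critical temperature at which the two real roots merge into one double root and then become complex. From this you can get the critical pressure,
$$p = \frac{t}{v - 1} - \frac{1}{v^2} = \frac{4}{27} - \frac{1}{9} = \frac{1}{27}$$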
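The critical point can be located numerically as a cross-check. The sketch below assumes the scaled equation in the form (p + 1/v²)(v − 1) = t, so that the maxima and minima of p for fixed t satisfy t = 2(v − 1)²/v³; it then searches a grid of v > 1 for the largest such t. The function names and the grid are illustrative choices.

```python
def t_at_extremum(v):
    """The t for which dp/dv = 0 at this v, from 2 (v - 1)^2 = t v^3."""
    return 2 * (v - 1) ** 2 / v ** 3

def pressure(v, t):
    """Scaled pressure from (p + 1/v^2)(v - 1) = t."""
    return t / (v - 1) - 1 / v ** 2

# Grid of scaled volumes from 1.001 to 10.0 in steps of 0.001.
grid = [1 + i / 1000 for i in range(1, 9001)]

v_c = max(grid, key=t_at_extremum)   # volume giving the biggest t
t_c = t_at_extremum(v_c)             # critical temperature
p_c = pressure(v_c, t_c)             # critical pressure

print(v_c, t_c, p_c)
```

The search lands on v = 3 with t close to 8/27 and p close to 1/27. For any t above this value the curve has no maxima or minima with v ≥ 1, which is the disappearance of the ripple.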
Notice in how many ways the simple ideas of the calculus were used to get the result. This is typical of many realistic problems; they require a number of small, individually simple steps. The simple steps should be learned so well that you are later able to put them together into larger organizations (chunks) of importance.
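The arithmetic of the critical point can be checked with exact fractions. The following is only a sketch: it assumes the reduced equation has the form (p + 1/v²)(v − 1) = t, which is consistent with the restriction v ≥ 1 and with the cubic described above, and the function names are illustrative choices, not notation from the text.

```python
from fractions import Fraction

def t_on_extrema(v):
    """t for which dp/dv = 0 at volume v, from 2(v - 1)^2 = t v^3 (assumed reduced form)."""
    v = Fraction(v)
    return 2 * (v - 1) ** 2 / v ** 3

def pressure(v, t):
    """Reduced pressure p = t/(v - 1) - 1/v^2 (same assumption)."""
    v, t = Fraction(v), Fraction(t)
    return t / (v - 1) - 1 / v ** 2

# dt/dv is proportional to (v - 1)(3 - v), so for v >= 1 the largest t occurs at v = 3.
v_c = Fraction(3)
t_c = t_on_extrema(v_c)
p_c = pressure(v_c, t_c)
print(v_c, t_c, p_c)  # 3 8/27 1/27
```

Evaluating `t_on_extrema` a little to either side of v = 3 gives smaller values, which is the numerical counterpart of the statement that v = 3 is the maximum t.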
For nongeometric equations, the scaling transformation that removes as many of the arbitrary physical constants as possible leads you to the least amount of work. The transformations are the basis for any scaling of the problem that may be there. Sometimes physical reasons arising from the problem suggest one way of removing the constants rather than another. Not to use the scaling that is available from the scaling transformations is usually foolish, since scaling often leads you to the heart of the problem and away from the details of the particular case as it is first posed to you.
1. There are two lights above a path, the first of height h1 of intensity I1 and the second of height h2 and of intensity I2. They are c units apart. If the light intensity falls off as the square of the distance, give the formula for the rate of change of the illumination if the measuring instrument moves at a rate of d units per second.
2. If a car going 60 kilometers per hour has a plane going 300 km/h fly directly over it on a bearing of 45° to the one the car is driving, how fast do they separate? (It is implied by the 45° that they are going in the same general direction.)
3. Heat radiates inversely as the square of the distance away. If two equally intense sources of heat are c units apart, how fast does the intensity change on a meter moving 5 m/min from one to the other when it is in the middle between them?
4. A light post has a light on top 6 ft high. A 4-ft child walks at a rate of 12 ft/min past it on a path 7 ft from the post and parallel to a house wall 3 ft on the other side of the child. How fast does the shadow move when the child is opposite the pole?
5. How fast does the circumference of a circle change when the area of a circle is changing at a rate R?
6. If the surface of a sphere changes at a rate of R square units per second, how fast does the volume change for a given radius?
9.7 NEWTON’S METHOD FOR FINDING ZEROS
A very common problem is to find the zeros of a function. We have carefully arranged the earlier problems so that the zeros for maximum, minimum, and inflection points are easily found. But in practice, situations often arise where this is not so. Newton devised a simple method of finding the real zeros of a function arbitrarily accurately.
The basic idea of Newton’s method for finding zeros is analytic substitution of the tangent line for the function (see Figure 9.7-1). The argument is that the corresponding zero of the tangent line (an approximation to the curve, Section 8.4, end) would give a better local approximation to the zero than the approximation you started with. Then, using this improved approximation, we use the new tangent line and find a still better approximation. We need to translate this iterative process into a systematic method.
Figure 9.7-1 Newton’s method
Given the curve $y = y(x)$, with some first approximation to the zero, call it $x_0$, we fit the tangent line to the curve at that point. The tangent line is, from the point-slope formula,
$$y - y(x_0) = y'(x_0)(x - x_0)$$
The zero of this line occurs when y = 0. We solve for the corresponding x value, which we label $x_1$:
$$x_1 = x_0 - \frac{y(x_0)}{y'(x_0)}$$
where, of course, the derivative is evaluated at $x = x_0$.
In view of our plan to iterate the formula to get a sequence of better and better estimates of the zero, it is wise to write Newton’s formula in the form
$$x_{n+1} = x_n - \frac{y(x_n)}{y'(x_n)}$$
In words, the next estimate of the root is the present one minus the function value divided by the derivative at that point.
Let us apply this method first to a case whose solution you know. We try the function
one of whose solutions is . Suppose you start with the guess x = 1. Then you have
The next guess is (Figure 9.7-1)
The next guess
It is a matter of iterating the process until you get the accuracy you want. At each step you will almost double the number of digits that are correct.
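When finding the square root of a number N, Newton’s method can be arranged in a particularly neat form. The equation $y = x^2 - N$ has a zero, which is $\sqrt{N}$. Therefore, using (9.7-1),
$$x_{n+1} = x_n - \frac{x_n^2 - N}{2x_n} = \frac{1}{2}\left(x_n + \frac{N}{x_n}\right)$$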
In this form you see another view of why the method works; if the guess xn is too small, then the quotient N/xn is too large, and the average of the two numbers is a good new guess. Similarly, if the original guess is too large, then the quotient is too small, and again the average of the two numbers is a good guess.
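The averaging form is easy to try on a machine. The following is a minimal sketch; the function name `newton_sqrt`, the starting guess, and the number of steps are illustrative choices, not part of the text.

```python
def newton_sqrt(N, x0=1.0, steps=6):
    """Iterate x <- (x + N/x)/2 and return every estimate, starting with x0."""
    estimates = [x0]
    x = x0
    for _ in range(steps):
        x = 0.5 * (x + N / x)  # average of the guess and the quotient N/x
        estimates.append(x)
    return estimates

for x in newton_sqrt(2.0):
    print(f"{x:.12f}")
```

Printing the successive estimates lets you watch the number of correct digits roughly double at each step.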
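Suppose you want to find the solution of
$$x = \frac{1}{\sqrt{1 + x^2}}$$
Set up the function
$$y = x - \frac{1}{\sqrt{1 + x^2}}$$
whose zero is the number you want. Differentiate to get
$$\frac{dy}{dx} = 1 + \frac{x}{(1 + x^2)^{3/2}}$$
Thinking of the two graphs, $y_1 = x$ and $y_2 = 1/\sqrt{1 + x^2}$ (Figure 9.7-2) (remember you are looking for their intersection), you probably would guess at x = 1. Then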
Figure 9.7-2 Finding a zero
$y = 1 - 1/\sqrt{2}$ and $dy/dx = 1 + 1/(2\sqrt{2})$. The next estimate is, therefore, 0.7836 1162. Indeed, you easily get the table.
If, instead of starting with x = 1, you started with the guess of x = 0, then the next guess is x = 1, and you then get the above table. Finally, if you begin with you get
with no further changes.
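The iteration of this example can be sketched in a few lines. The code assumes the equation being solved is x = 1/√(1 + x²); that form is inferred from the values y = 1 − 1/√2 and dy/dx = 1 + 1/(2√2) at x = 1 and from the quoted estimate 0.7836 1162, so treat it as an assumption. The names `newton`, `f`, and `fprime` are illustrative.

```python
import math

def newton(f, fprime, x0, steps):
    """Return the list of Newton estimates x0, x1, ..., one per step."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / fprime(x))  # zero of the tangent line at x
    return xs

def f(x):
    # Assumed function of the example: y = x - 1/sqrt(1 + x^2)
    return x - 1.0 / math.sqrt(1.0 + x * x)

def fprime(x):
    # Its derivative: dy/dx = 1 + x/(1 + x^2)^(3/2)
    return 1.0 + x / (1.0 + x * x) ** 1.5

for x in newton(f, fprime, 1.0, 5):
    print(f"{x:.8f}")
```

Starting instead with `newton(f, fprime, 0.0, 1)` gives 1 as the next guess, in agreement with the remark about starting at x = 0.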
There is no unique function to use to find the zero of a given equation. In Example 9.7-2, we could have used
Again, since x > 0, we could have squared the equation before trying to solve it; we could choose to use
but would have to watch for the root that is introduced by the squaring and later eliminate it.
These examples show that Newton’s method usually converges rapidly. However, there are circumstances in which the method gives trouble. Figure 9.7-3 shows three such situations. The first arises when there is an inflection point between the first guess and the answer. The second shows how a local minimum can cause trouble. The third shows the slow approach when it is a multiple zero.
Let us review the method. The idea is that, given a first guess, we fit a local tangent line at that point and use the zero of the line as the next guess. Then we iterate the process, each stage getting approximately twice the number of digits correct as we have at the start of the step. Insofar as the tangent line represents the curve locally, the method is effective; but if there are serious differences between the curve and its tangent line, or if y′ approaches zero, there can be trouble.
Figure 9.7-3 Failures of Newton’s method
1. Find the cube root of 3.
2. Find the general formula for finding the cube root of N.
3. Apply Newton’s method to the suggested forms after Example 9.7-2.
4. Generalize Exercise 2 to the nth root of N.
5. Find the real root of $x^2 = 1/(1 + x)$.
6. Discuss the problem of finding all the real roots of a polynomial using Newton’s method.
7.* Compare the bisection method, Exercise 3.5-9, with Newton’s method. List the advantages and disadvantages of both.
8.* Extend Newton’s method and fit a quadratic locally. Note there will be the question of which root of the quadratic to use.
9.8 MULTIPLE ZEROS
The factor theorem that we derived in Section 2.7 shows the equivalence of a factor of a polynomial and a zero of it; if you have one, you have the other. But sometimes a polynomial may have the same factor repeated, for example,
$$P(x) = x^3 - 5x^2 + 7x - 3 = (x - 3)(x - 1)^2$$
Abusing the language, we say that the equation has a multiple zero, that the root x = 1 is a double root. We wish to preserve the equivalence of zeros and factors, and are therefore forced to this kind of talk. Similar remarks apply for higher multiplicities of roots.
The finding of multiple roots plays an important role in much of mathematics. We see that multiple roots can cause trouble for Newton’s method for finding zeros because the derivative, which appears in the denominator, will also vanish at a multiple zero. For example, in the above equation
$$P'(x) = 3x^2 - 10x + 7 = (3x - 7)(x - 1)$$
This shows that the first derivative also vanishes at the same point, x = 1. Indeed, if there is a repeated factor of order k,
$$P_n(x) = (x - a)^k\, p_{n-k}(x)$$
where $p_{n-k}(x)$ is all the rest of the factors, then the first derivative will have the form
$$P_n'(x) = (x - a)^{k-1}\left[k\, p_{n-k}(x) + (x - a)\, p_{n-k}'(x)\right]$$
and shows that the first derivative has a zero at the point x = a of exactly order k − 1. By induction, we see that a factor of order k at x = a means that the function and its first, second, …, (k − 1)th derivatives will all be zero at x = a. It also follows by some thinking that the next derivative will not vanish at that point.
This suggests that Euclid’s algorithm for common factors of a polynomial, Section 4.6, can be applied to a polynomial and its first derivative. The greatest common factor that the algorithm produces has all the repeated factors, each to one lower power than it was in the original polynomial. Note that Euclid’s algorithm is a rational process (involving only the rational operations of add, subtract, multiply, and divide), so finding the polynomial containing all the repeated factors (to their appropriate degrees) of a given polynomial is a simple, although at times tedious, process.
And, of course, we can apply the same step to the greatest common factor to find the repeated factors of it. And so on. Thus, if a polynomial has a single repeated factor of highest degree, then we can find it by a rational process. If there are two different factors both of the same highest degree, we can find the corresponding quadratic equation. Thus, in a very real sense, for a polynomial of degree n, multiple roots are easier to find than many different roots of first degree.
Find the multiple factors of
$$y = x^3 + 3x^2 - 4$$
The derivative is
$$y' = 3x^2 + 6x$$
The coefficient 3 can be ignored, which makes the arithmetic easier to carry out. We divide the function by the derivative (using detached coefficients, and remembering to supply zero coefficients for the missing terms, just as you do when there are missing powers of 10 in a number such as four hundred and three, which is written as 403):
The remainder is −2x − 4, but can be conveniently taken as x + 2. We therefore divide the derivative by this
Hence the common factor is x + 2. It must occur doubly in the original function. We have only to divide this factor out of the given polynomial twice to get the remaining factor, x − 1. We have, therefore,
$$x^3 + 3x^2 - 4 = (x + 2)^2 (x - 1)$$
as the completely factored form.
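Euclid’s algorithm applied to a polynomial and its derivative is a purely rational process, so it can be sketched with exact fractions. The code below is one possible arrangement, not the text’s layout: polynomials are lists of detached coefficients from the highest power down, the helper names are illustrative, and the test polynomial x³ + 3x² − 4 is the one inferred for this example from the remainder −2x − 4 and the factors found.

```python
from fractions import Fraction

def derivative(p):
    """Derivative of a polynomial given as coefficients from the highest power down."""
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]

def remainder(a, b):
    """Remainder when polynomial a is divided by polynomial b (b has a nonzero leading term)."""
    a = [Fraction(c) for c in a]
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)  # the leading coefficient has just been cancelled
    while a and a[0] == 0:
        a.pop(0)  # strip leading zeros so the next divisor is well formed
    return a

def poly_gcd(a, b):
    """Greatest common factor by Euclid's algorithm, scaled to leading coefficient 1."""
    while b:
        a, b = b, remainder(a, b)
    lead = a[0]
    return [Fraction(c) / lead for c in a]

p = [1, 3, 0, -4]                  # x^3 + 3x^2 + 0x - 4
print(remainder(p, derivative(p)))  # coefficients of -2x - 4
print(poly_gcd(p, derivative(p)))   # coefficients of x + 2
```

The same call on the coefficients of x³ − 5x² + 7x − 3 returns the coefficients of x − 1, the repeated factor of Section 9.8’s first illustration.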
1. Find the zeros of $x^4 - 5x^3 + 9x^2 - 7x + 2$.
2. Find the zeros of $x^6 - 8x^5 + 24x^4 - 34x^3 + 23x - 6$.
9.9 THE SUMMATION NOTATION
We make a digression here to develop a useful notation. We want to write the sum of a number of similar terms and do not want to write them out every time (even when using the ellipsis method of three dots). The Greek capital sigma is normally used,
This is the sum of all the terms of the form indicated, beginning at k = 1 and going through k = n. In general, the notation (a and b are integers and a < b)
means the sum
Note that both ends of the sum are included; there are (b − a + 1) terms in the sum.
To get no terms in the sum, you write (extending the notation in a somewhat unnatural way)
Another example that sometimes causes the student trouble is
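It is easy to see that
$$\sum_{k=a}^{b} c\, f(k) = c \sum_{k=a}^{b} f(k)$$
since the constant c will factor out of each term in the sum. It is also easy to see that
$$\sum_{k=a}^{b} \left[f(k) + g(k)\right] = \sum_{k=a}^{b} f(k) + \sum_{k=a}^{b} g(k)$$
and therefore Σ is a linear operator.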
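For readers who compute, the notation corresponds closely to a loop over an inclusive range. The following sketch uses a hypothetical helper named `sigma`; it illustrates the count of b − a + 1 terms and the two linearity properties, with test functions chosen arbitrarily.

```python
def sigma(f, a, b):
    """Sum of f(k) for k = a, a + 1, ..., b, with both ends included."""
    return sum(f(k) for k in range(a, b + 1))

# Summing 1 over the range counts the terms: b - a + 1 of them.
print(sigma(lambda k: 1, 3, 7))

# A constant factors out of the sum.
print(sigma(lambda k: 3 * k * k, 1, 10), 3 * sigma(lambda k: k * k, 1, 10))

# The sum of a sum is the sum of the sums.
print(sigma(lambda k: k + k * k, 1, 10),
      sigma(lambda k: k, 1, 10) + sigma(lambda k: k * k, 1, 10))
```

Note that Python’s `range(a, b + 1)` needs the `+ 1` precisely because the mathematical notation includes the upper end and `range` does not.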
Using this notation, the sums that occurred earlier in Section 2.2 can be written
Similarly, the binomial coefficients can be written in the form
while the sum of a geometric progression for n terms takes the form
It is easy to see that
$$\sum_{k=a}^{b} f(k) = \sum_{j=a}^{b} f(j)$$
and therefore the dummy index of summation has no real meaning beyond its immediate equation. It is often very convenient to change the index of summation in the middle of a derivation or other work, so the above fact should be clearly understood; the letter used as the index of summation is a dummy quantity and has only local significance.
Finally, it is often convenient to drop the first appearance of the dummy index and write
and we write at times, when the values of a and b are clearly evident,
9.10 GENERATING IDENTITIES
An application of differentiation that seems to have very little to do with either the slope or the rate of change is the use of differentiation to generate new identities from old identities.
Section 2.4 developed some of the theory of the binomial coefficients. In particular, the function (we have shifted from the earlier letter x to the letter t)
$$(1 + t)^n = \sum_{k=0}^{n} C(n, k)\, t^k$$
was used to generate the binomial coefficients. Thus the function
$$(1 + t)^n$$
is called the generating function of the binomial coefficients. The letter t is usually used as the variable in the generating function, but when convenient other letters, such as x, are used.
From this you can get other identities by various operations, such as setting t = 1. The identity (9.10-1) becomes
$$2^n = \sum_{k=0}^{n} C(n, k)$$
You can also differentiate a generating function with respect to t to get new identities. Using identity (9.10-1), you get
$$n(1 + t)^{n-1} = \sum_{k=0}^{n} k\, C(n, k)\, t^{k-1}$$
When you put t = 1, you get
$$n\, 2^{n-1} = \sum_{k=0}^{n} k\, C(n, k)$$
This is a useful identity involving the binomial coefficients.
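Identities of this kind are easy to spot-check numerically before you trust them. The sketch below checks the two identities obtained from (1 + t)^n by setting t = 1 before and after one differentiation, namely that the binomial coefficients sum to 2^n and that the coefficients weighted by k sum to n·2^(n−1). The function name is illustrative.

```python
from math import comb

def check_binomial_identities(n):
    """Return (sum of C(n, k) equals 2^n, sum of k C(n, k) equals n 2^(n-1))."""
    row = [comb(n, k) for k in range(n + 1)]
    sum_plain = sum(row)                                  # t = 1 in (1 + t)^n
    sum_weighted = sum(k * c for k, c in enumerate(row))  # t = 1 after differentiating
    return sum_plain == 2 ** n, sum_weighted == n * 2 ** (n - 1)

for n in range(1, 11):
    print(n, check_binomial_identities(n))
```

A check of this sort proves nothing, but a single failure would show at once that an error had been made in the derivation.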
You can get further identities by further differentiation, and you can also, for example, multiply by t before doing the next differentiation. What is required is the imagination to see what to do to generate the identity you want.
Find the sum
We remember, after some thought, a similar identity for the geometric progression (assuming that |x| < 1):
$$\sum_{k=0}^{\infty} x^k = \frac{1}{1 - x}$$
If we differentiate formally (without regard to whether an infinite sum can be differentiated term by term or not) with respect to x, we get
$$\sum_{k=1}^{\infty} k\, x^{k-1} = \frac{1}{(1 - x)^2}$$
which is what is required.
If you are a bit worried about the differentiation of the infinite sum, then you can proceed as follows. Take the finite sum
and differentiate it (which is a finite sum and you know that both sides will give the same result):
We now see that for fixed |x| < 1 and increasing n, the second term on the left approaches zero; hence the series approaches the first term, which is what we had above (9.10-4). Hence this time we get the same answer in both cases.
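The approach of the finite sums to the limit can be watched numerically. This sketch assumes the series in question is the one obtained by differentiating the geometric progression term by term, that is, the sum of k·x^(k−1), and compares its partial sums with 1/(1 − x)² for one fixed |x| < 1; the function name and the value x = 0.5 are illustrative.

```python
def partial_sum(x, n):
    """Sum of k * x**(k - 1) for k = 1, ..., n."""
    return sum(k * x ** (k - 1) for k in range(1, n + 1))

x = 0.5
limit = 1.0 / (1.0 - x) ** 2
for n in (5, 10, 20, 40):
    print(n, partial_sum(x, n), limit)
```

Trying a value of x closer to 1 shows the same convergence, only slower, since the neglected term dies out less quickly.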
The purpose of this example is to show the power of the generating function approach. Consider the representation of
where we are supposing that x is so small that the right-hand side “converges” to a definite value for the value of x. Let us formally differentiate both sides:
Now multiply both sides by −2(1 − x). We get
Since we proved that the powers of x are linearly independent (Section 4.7) for any finite number of terms, it is reasonable to suspect that the independence also holds for infinite sums. We are led, therefore, to equate the separate powers of x:
Now, if x = 0, then clearly c0 = ±1. Taking the + for the square root, we can solve for the successive coefficients of the expansion, one at a time.
If we had taken the minus sign for the square root, then each of the coefficients would change sign; hence
The rule for computing the binomial coefficients given in Section 2.4 still applies for all fractional exponents. By Section 4.5, we assume that it will also apply for irrational numbers.
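The coefficient-by-coefficient solution can be mechanized. The sketch below rests on an assumption about the example: that the function being expanded is √(1 − x), which fits the multiplier −2(1 − x) and the value c0 = ±1. Under that assumption, equating powers of x gives 2(k + 1)c_{k+1} = (2k − 1)c_k, and the code compares the result with the binomial-coefficient rule for the exponent 1/2. All names are illustrative.

```python
from fractions import Fraction
import math

def coefficients_from_recurrence(count):
    """c_0 = 1 and 2(k + 1) c_{k+1} = (2k - 1) c_k (assumed expansion of sqrt(1 - x))."""
    c = [Fraction(1)]
    for k in range(count - 1):
        c.append(c[k] * Fraction(2 * k - 1, 2 * (k + 1)))
    return c

def coefficients_from_binomial_rule(count, alpha=Fraction(1, 2)):
    """Coefficients of (1 - x)^alpha via C(alpha, k) = C(alpha, k - 1)(alpha - k + 1)/k."""
    c = [Fraction(1)]
    for k in range(1, count):
        c.append(-c[k - 1] * (alpha - k + 1) / k)
    return c

c = coefficients_from_recurrence(30)
x = 0.2
series_value = sum(float(ck) * x ** k for k, ck in enumerate(c))
print(c[:5])
print(series_value, math.sqrt(1 - x))
```

Taking c0 = −1 instead would simply change the sign of every coefficient, as the text observes.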
Thus the differential calculus has many uses beyond the obvious simple max-min problems and rate problems. Differentiation can be used as a formal process for obtaining new results from old, known results.
1. From Example 9.10-1 multiply through by t, differentiate again, and so on, to get .
2. Find . Hint: See Example 9.10-2 and repeat the trick of the previous exercise.
3. Find $\sum k/2^k$ and $\sum k^2/2^k$. Generalize.
4. Use the generating function 1/(1 − xt) to get the result (9.10-2).
5. Find the .
6. Find the .
7. Find the .
9.11 GENERATING FUNCTIONS—PLACE HOLDERS
The main purpose of this section is to explore the meaning of the dummy variable t that we used in the generating function, such as that for the binomial coefficients
We first look at a similar, familiar situation, plain old multiplication. Consider
In a slightly less familiar form, we can begin the multiplication by the first digit on the left, and arrange the products in the other order
What is the role of the 0? It is a place holder to keep the lineup of the powers of 10, which is implicit in the whole arrangement of the multiplication; we keep the same powers of 10 in the same column.
Now suppose we consider the product
We do not write the powers of t (just as we ignore the powers of 10 in arithmetic). We have
which we immediately recognize as the binomial coefficients of order 6, as they should be. Notice how closely this resembles multiplication in arithmetic except that there are no carries from one column to another.
We immediately generalize to the nth case:
Let us pick out the middle column of the product, the coefficient of $t^n$. This is the sum of all the terms whose exponent of t in the product is n. From the top line, we get C(n, n) C(n, 0); from the next line, C(n, n − 1) C(n, 1); from the next, C(n, n − 2) C(n, 2); and so on down to the last line, C(n, 0) C(n, n). The sum of all these terms is the corresponding coefficient of $t^n$ from the expansion of order 2n, that is, C(2n, n). In the summation notation, we have
$$\sum_{k=0}^{n} C(n, n - k)\, C(n, k) = C(2n, n)$$
Recalling from Section 2.4 that the binomial coefficients are symmetric in the second index,
$$C(n, n - k) = C(n, k)$$
we can write the above equation as
$$\sum_{k=0}^{n} C^2(n, k) = C(2n, n)$$
In words, the sum of the squares of the binomial coefficients of order n is the single midcoefficient of the expansion of order 2n.
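The column-by-column multiplication without carries is exactly what a program does when it multiplies two lists of coefficients. In this sketch the powers of t are never written; the position in the list is the place holder. The function name `multiply` and the choice n = 3 are illustrative.

```python
from math import comb

def multiply(a, b):
    """Multiply polynomials given as coefficient lists: long multiplication with no carries."""
    product = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            product[i + j] += ai * bj  # terms with the same power of t share a column
    return product

n = 3
row = [comb(n, k) for k in range(n + 1)]  # coefficients of (1 + t)^3
square = multiply(row, row)               # coefficients of (1 + t)^6
print(square)
print(square[n], sum(c * c for c in row), comb(2 * n, n))
```

The middle entry of the product, the sum of the squares of the row, and C(2n, n) are printed side by side so the identity can be read off directly.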
In all this derivation, what is the role of the dummy letter t? It is a place holder enabling us to keep track of the various terms; it does not play the role of a number. True, the expression
indicates that the quantity represented by t must have some properties of a number (we cannot add apples to oranges), yet it really is not a number, nor does it necessarily become one later in the above derivation. It is some kind of a generalized number having no particular size. Sometimes we do assign a numerical value to it, but often we do not.
1. Find the $\sum (-1)^k C^2(n, k)$.
2. Find the sum of the binomial coefficients in the mth column.
3. Generalize Example 9.11-1 to the case of $(1 + t)^n (1 + t)^m$.
When the student has done enough four-step derivations, the terms that are going to matter are soon recognized, and the others that will drop out are ignored. For example, suppose you have
At the third step of the four-step process, you get something like
You simply are not interested in the terms represented by the three dots; you know that they will disappear in the limit as Δx approaches zero.
We would like to eliminate early in the derivation all the terms that will not matter. In Figure 9.12-1, you see what the derivation looks like. P is the original point (x, y), and Q is the incremented point. The point S represents the position of the point when all the terms that you want to neglect are omitted; you are on the tangent line. We cannot call this change in the y direction Δy any more because of the terms that we dropped. A psychologically good, but logically bad, choice is conventionally made to call the quantities
Since the point S is on the tangent line, we have
where the symbol on the right is the conventional derivative. Thus on the left we think of the dx and dy as small changes running along the tangent line, while on the right the derivative is evaluated at the point on the curve. After having insisted that the symbol dy/dx is a single symbol, we now write it as a quotient of two differentials dy and dx.
What are these differentials dy and dx? They are symbols that are chosen to keep track of the terms that matter and to ignore those that do not; they are really place holders, much as the t was in the generating function method. They are not numbers per se (per se means by, of, or in themselves), but they have some of the attributes of numbers since we find them being added to numbers. They are not of any definite size, since they represent the terms that will still be present after the limit is taken.
The users of the calculus constantly think in terms of differentials when they set up problems from the real world. They say they think of the differentials as being “infinitely small,” except that they generally do not because they also believe in the discrete nature of the physical universe to which they are applying the argument. They are trying to get rid of the terms that will not matter when they take the limit. There is a logical difficulty in the way they talk about things, but when you understand that the differentials are being used to eliminate the terms that will not matter and to concentrate on the terms that will, then what is going on becomes clear. Of course, it is implied that the person doing the derivation knows what terms will matter and those that will not. If an error is made and a needed term is omitted, then the result will, of course, be false. But experience is a very good guide, and the use of differentials saves a lot of very tedious details in the usual derivation; differentials eliminate early those terms that will vanish in the limit.
9.13 DIFFERENTIALS ARE SMALL
Having insisted that differentials have no size, that they are merely place holders for the terms that will matter in taking the limit, we now turn around (again!) and let them be small and thereby use the tangent line as an approximation to the curve. There is in all this no definition of how small “small” really is. There is again an intuitive feeling that if we use (small) differentials then the terms we neglect will (near the limit) be very small; hence we will make relatively small errors when we use differentials.
Suppose we have a rectangular shape with an area A:
$$A = xy$$
and suppose that x = 5 and y = 7. The area A = 35. But next we suppose that the x and y values are slightly changed (Figure 9.13-1). Let the changes be dx = 0.01 and dy = 0.02. Then the change in the area is
$$\Delta A = (x + dx)(y + dy) - xy$$
Writing this in differential form, we have
$$\Delta A = x\, dy + y\, dx + dx\, dy$$
In this equation, supposing that the differentials are small, the last term is very much smaller than the others in the equation, and we simply ignore it to get
$$dA = x\, dy + y\, dx = 5(0.02) + 7(0.01) = 0.17$$
and the new area is
$$A + dA = 35.17$$
In this approximation we see from the figure that we have taken the contributions from the two sides (rectangles) and neglected the cross-product term dx dy = 0.01 × 0.02 = 0.0002, which in this case is a very small error indeed when compared to the actual area. The formula we used is simply the derivative of a product formula in the differential form.
There is no reason for the changes to be positive; they can be either positive or negative as needed. What is required is that you realize that the use of differentials puts you on the local tangent line and not on the curve itself.
Suppose you have a circle of radius r and wish to make a small change in the radius. How much does the area change? We simply write
$$A = \pi r^2, \qquad dA = 2\pi r\, dr$$
and we have the formula telling us (approximately for small changes) the change in area for a change in the radius.
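The size of the neglected terms in both examples can be computed directly. This is a small sketch with illustrative function names; each function returns the exact change together with the differential (tangent-line) estimate, and the circle’s radius and increment are arbitrary choices.

```python
import math

def rectangle_change(x, y, dx, dy):
    """Exact change in the area xy and the differential estimate x dy + y dx."""
    exact = (x + dx) * (y + dy) - x * y
    differential = x * dy + y * dx
    return exact, differential

def circle_change(r, dr):
    """Exact change in the area pi r^2 and the differential estimate 2 pi r dr."""
    exact = math.pi * ((r + dr) ** 2 - r ** 2)
    differential = 2 * math.pi * r * dr
    return exact, differential

exact, estimate = rectangle_change(5, 7, 0.01, 0.02)
print(exact, estimate, exact - estimate)  # the difference is the cross product dx dy

exact, estimate = circle_change(2.0, 0.01)
print(exact, estimate, exact - estimate)  # the difference is pi dr^2
```

Making the increments ten times smaller makes the neglected term a hundred times smaller, which is why the tangent line serves so well for small changes.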
Let us write the old derivative formulas in the new differential form. We have
These forms are only a small change from the earlier forms, where we used derivatives instead of differentials, but they are the form we will often need in the future. Hence these formulas should be learned for immediate recall at any moment.
If you want to use the idea of differentials, and not be bound to very small quantities, then a glance at Newton’s method (Section 9.7) for finding zeros of a function shows that large changes can sometimes be accommodated by the application of an iteration process.
Differentiate the following in differential form:
1. $y = 3x^4$
3. $y = (1 - x)/(1 + x)$
5. $x^2 + y^2 = a^2$
6. $3xy^2 + 3x^2y = b^3$
8. $(x^2 + 4)(x^2 - 6)^2 = y$
9. $y = (1 + x^2)^4$
In this chapter we saw that there is an essential difference between geometric and nongeometric applications of the calculus, but that there are some equivalences between them. In particular, we saw the central role of scaling transformations in nongeometric applications.
We also saw applications to velocity and acceleration in simple and slightly complex problems, usually called rate problems because they involve rates (derivatives with respect to time).
Newton’s method for finding the real zeros of a function is based on the idea of the analytic substitution of the local tangent line for the original curve, and then taking the zero of the tangent line as the new approximation to the zero of the original function. When it works, its convergence, once well started, is very rapid.
Next, we saw that the calculus is also useful in generating new identities from old ones. We can take any identity in x (or t), multiply by an arbitrary function of x, and differentiate to get a new identity. The problem is not so much that of generating identities as it is of finding how to generate the particular one you want. This takes a little imagination to decide where to start and how to get to where you want. This use of the calculus rests only on the simple fact that differentiation is a formal linear mapping of functions onto functions, and seems to have very little to do with the original concept of the limit of the ratio Δy/Δx. This is a typical generalization of an idea in mathematics; something begun for one purpose is later found to have many, apparently unrelated, applications. It is an example of the universality of mathematical ideas (Section 1.3).
Finally, we looked at the delicate topic of differentials as a means of handling a sea of unnecessary details by simply eliminating them from the derivation. The concept was extended to small, local approximations along the tangent line, rather than on the curve; and insofar as the tangent line is close to the curve, their use in this manner is reasonable. The logical difficulties remain, however. The differential form is used in later parts of this book, so the rules must be learned carefully.