
Burn Math Class: And Reinvent Mathematics for Yourself (2016)

Act III

6. Two in One

6.3. Forging the Anti-Hammers

What should we do next to build our understanding of this new idea? If we wanted to, we could just continue trying to anti-differentiate a bunch of specific machines, essentially by guessing, which is what we’ve been doing so far. However, in the past when we were inventing the idea of the infinite magnifying glass and playing with the concept of derivatives for the first time, we got the most bang for our buck not from differentiating specific machines but by building hammers that work for any machine. I’ll go back to Chapter 3 for a moment and steal one of our boxes where we described all of our hammers. Wait here. I promise I won’t take long.

(Time passes.)

Okay, I’m back. Here it is:

Hammer for Addition

(f + g)′ = f′ + g′

Hammer for Multiplication

(fg)′ = f′g + fg′

Hammer for Reabbreviation

dm/dx = (dm/ds)(ds/dx)

Since it appears that the ∫ thing is kind of the opposite of the derivative, it would be super nice if there were three analogous “anti-hammers,” one for each of the originals! Then we’d be able to do all kinds of things with curvy areas. Let’s see if we can forge some anti-hammers.

6.3.1 Anti-Hammering Addition

Earlier, we discovered a nice thing about derivatives, called the “hammer for addition.” Essentially it said that “the derivative of a sum is the sum of the derivatives,” or to put it another way: (f + g)′ = f′ + g′. It would be nice if a similar fact were true for this new “integration” idea we’ve come up with, because then we’d have a tool that would let us shatter certain hard problems into easier pieces, just like the original hammers allowed us to do. Since the derivative of a sum is the sum of the derivatives, we might guess that the integral of a sum is the sum of the integrals.1 That is, we want to see whether it is true that

∫_a^b [f(x) + g(x)] dx = ∫_a^b f(x) dx + ∫_a^b g(x) dx

Recall that the “integral” of m is what textbooks call ∫_a^b m(x) dx. Actually, they tend to call this a “definite integral” of m. The term “indefinite integral” is often used to refer to an anti-derivative of m, but this terminology can be slightly misleading, in that it only makes sense after the discovery of the fundamental hammer. Before the discovery of the fundamental hammer, it’s not obvious that anti-derivatives have anything to do with “integrals” (i.e., possibly curvy areas). We’ll prefer to simply call ∫_a^b m(x) dx the integral of m from a to b. Only after we’ve defined this terminology does the fundamental hammer tell us that integrals and anti-derivatives are related concepts.

We don’t know if this is true, but it would be nice, in that it would be the opposite of the hammer for addition. It would be an “anti-hammer for addition,” if you will. Let’s try to see if it’s true.

To begin, notice that we’re welcome to think of f(x) + g(x) as a single machine in its own right, and we could even give it a name, like h(x) ≡ f(x) + g(x). Then h(x) would be the machine that spits out f(x) plus g(x) whenever we feed it a specific number x. Imagine looking at the graph of h(x), which might be some crazy squiggly thing, and imagine putting an infinitely thin rectangle at an arbitrary point x, stretching vertically from the horizontal axis up (or down) to the graph of h (see Figures 6.4 and 6.5).

The width of this rectangle will be dx, and its height will be h(x) ≡ f(x) + g(x), so its area will be [f(x) + g(x)]dx. But then by the obvious law of tearing things, this tall thin rectangle can be torn into two smaller thin rectangles to give f(x)dx + g(x)dx. Now we have two rectangles, one with height f(x), and one with height g(x). So we can think of an infinitely small rectangle at each point x in two ways: as a tall thin rectangle or as two shorter but equally thin rectangles.

Intuitively, it seems clear that we can then get the whole area under h in two ways: we could add up all the tall thin rectangles to get ∫_a^b [f(x) + g(x)] dx, or we could add up all the torn thin rectangles to get ∫_a^b f(x) dx + ∫_a^b g(x) dx. These are two descriptions of the same thing: the total area under f + g. So we can slap an equals sign between the two descriptions and write

∫_a^b [f(x) + g(x)] dx = ∫_a^b f(x) dx + ∫_a^b g(x) dx

Figure 6.4: Two machines, chosen randomly. On the left, the tiny rectangle has area f(4)dx. On the right, the tiny rectangle has area g(4)dx.

That’s exactly what we wanted to show. Let’s write it in a box to make it official.

Anti-Hammer for Addition

We’ve discovered another fact about our new ∫ idea,

The ∫ of a sum is the sum of the ∫’s.

To put it another way: for any machines f and g, we have

∫_a^b [f(x) + g(x)] dx = ∫_a^b f(x) dx + ∫_a^b g(x) dx

Now that we’ve invented the anti-hammer for addition, Figures 6.4 and 6.5 let us visualize what it is saying in another way. The letters f and g stand for any two machines, possibly with very curvy graphs. Figure 6.4 lets us picture the individual machines f and g, as well as the area under them. Figure 6.5 lets us picture h(x) ≡ f(x) + g(x), and gives us another way of picturing what the anti-hammer for addition is saying.
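(If you happen to have a computer nearby, here’s a quick way to convince yourself numerically that nothing fishy is going on. This little Python sketch is not from the book — the particular machines f and g and the crude rectangle-adding routine are arbitrary choices of mine — but it approximates both sides of the anti-hammer for addition with lots of thin rectangles and checks that they come out the same:)

import math

def integral(m, a, b, n=100_000):
    # Add up the areas of n thin rectangles of width dx between a and b.
    dx = (b - a) / n
    return sum(m(a + i * dx) * dx for i in range(n))

def f(x): return math.sin(x)   # any machines will do; these are arbitrary
def g(x): return x ** 2

a, b = 0.0, 2.0
left = integral(lambda x: f(x) + g(x), a, b)       # area under f + g
right = integral(f, a, b) + integral(g, a, b)      # sum of the two separate areas
print(left, right)   # the two numbers agree up to tiny rounding error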

6.3.2 Anti-Hammering Multiplication

The Simple Anti-Hammer for Multiplication

Figure 6.5: This is the graph of f(x) + g(x). The tiny rectangle has an area of (f(4)+g(4))dx = f(4)dx+g(4)dx. That is, the area of this tiny rectangle is the sum of those on the left and right sides of Figure 6.4. Since this fact is true for each point x, adding up the areas of the tiny rectangles at each such point does not change the principle.

We actually had two hammers for multiplication, although one of them was a special case of the other. Recall that in Chapter 2 we showed (#f(x))′ = # f′(x), where # is some fixed number like 7 or 59. So we can “pull constants out of derivatives.” I wonder if we can “pull constants out of ∫’s,” so that ∫ #·(stuff) = # ∫ (stuff). The idea makes sense intuitively, if we think about what the two expressions mean. Remember that in an expression like ∫ f(x) dx, we’re adding up the areas of a bunch of infinitely small rectangles. If we double the height of each rectangle, keeping its (infinitely thin) width the same, then each rectangle has twice its original area, so the total area should double. There’s nothing special about the word “double” in this argument, and it should be true for any amount of magnification #. That is, it should be true that

∫_a^b # f(x) dx = # ∫_a^b f(x) dx

for any number # and any machine f. Hooray! Again, we’ve found an “anti-hammer” for integration corresponding to one of our old hammers for differentiation.
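(The same kind of numerical sanity check works for pulling constants out. Again, this is just a sketch of mine, not the book’s, with an arbitrary machine and an arbitrary constant standing in for #:)

import math

def integral(m, a, b, n=100_000):
    # Same crude thin-rectangle sum as before.
    dx = (b - a) / n
    return sum(m(a + i * dx) * dx for i in range(n))

def f(x): return math.exp(x)   # an arbitrary machine
c = 7.0                        # an arbitrary constant, playing the role of "#"

left = integral(lambda x: c * f(x), 0.0, 1.0)   # integral of # times f
right = c * integral(f, 0.0, 1.0)               # # times the integral of f
print(left, right)   # same number both ways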

The Less Simple Anti-Hammer for Multiplication

Above, we found that we can pull constants outside of integrals, much like we could pull constants outside of derivatives. However, the real hammer for multiplication was a bit more complicated than that. It said

(fg)′ = f′g + fg′

or, to say the exact same sentence with different abbreviations,

[f(x)g(x)]′ = f′(x)g(x) + f(x)g′(x)

Let’s try to use this fact to build an analogous anti-hammer for multiplication. Well, if we “integrate” both sides of the above equation — that is, if we feed both sides to the ∫ thing — then we would get

∫_a^b [f(x)g(x)]′ dx = ∫_a^b [f′(x)g(x) + f(x)g′(x)] dx

The left side of the above equation is an integral of a derivative, so we can smack it with the fundamental hammer to get

f(b)g(b) − f(a)g(a) = ∫_a^b [f′(x)g(x) + f(x)g′(x)] dx

The left side is a bit ugly, but it’s a simple idea, so let’s abbreviate it. We’ll rewrite the above equation this way:

f(x)g(x) |_a^b = ∫_a^b [f′(x)g(x) + f(x)g′(x)] dx    (6.10)

where the left side is shorthand for “plug in b everywhere, then plug in a everywhere, then take the difference,” so f(x)g(x) |_a^b is just an abbreviation for f(b)g(b) − f(a)g(a).

Now, although equation 6.10 is true, it’s not particularly helpful, because we can only use it when we happen to encounter things that look exactly like ∫_a^b [f′(x)g(x) + f(x)g′(x)] dx, and not many things look like that. If we do happen to encounter something that looks like that, then we can instantly get a number out and be done. As such, we could simply stop here and call this the anti-hammer for multiplication, since it “undoes” the hammer for multiplication. However, we can get a much more useful anti-hammer simply by thinking of the above idea in a slightly different way. Let’s rewrite equation 6.10 by using the anti-hammer for addition to break up the big integral into two pieces, like this,

f(x)g(x) |_a^b = ∫_a^b f′(x)g(x) dx + ∫_a^b f(x)g′(x) dx

and throwing one of the two integrals over to the other side of the equation. It may not be clear at first why we would want to do that, but we’ll discuss why in a few seconds. For now, let’s write our invention in a box, to make it official.

Anti-Hammer for Multiplication

We’ve discovered another fact about our new ∫ idea, though we don’t know how to use it yet:

∫_a^b f′(x)g(x) dx = f(x)g(x) |_a^b − ∫_a^b f(x)g′(x) dx    (6.12)

How can we use this crazy sentence? Well, remember that the hammer for multiplication (like all the other hammers) was not a rule that said what we had to do, but rather a tool that lets us make progress by choosing to interpret things in a specific way. For example, consider the machine m(x) ≡ xe^x. We don’t have to think of this machine as two different machines multiplied together, but we are free to think of it that way if it helps us. We could choose to think of xe^x as f(x)g(x), where f(x) ≡ x and g(x) ≡ e^x. Then the hammer for multiplication (abbreviated HM in the following equations) would tell us that

m′(x) = [f(x)g(x)]′ =(HM)= f′(x)g(x) + f(x)g′(x) = e^x + xe^x

All our original hammers carried this same “only if it helps us” interpretation, and so do the anti-hammers. But after all that hammer yammering, why is equation 6.12 in the box above “more useful” than equation 6.9, even though they’re exactly the same sentence wearing slightly different hats? Good question! Equation 6.12 tends to be more useful than equation 6.9 not because they’re saying different things but because of the limits of the human imagination. To see what I mean, imagine we’re stuck trying to compute something that looks like ∫_a^b m(x) dx. Now, whenever we’re stuck trying to compute something like this, it tends to be easier for most people’s minds (including mine) to dream up two machines f and g for which m can be interpreted like this

m(x) = f′(x)g(x)

than it is to dream up two machines f and g for which m can be interpreted like this

m(x) = f′(x)g(x) + f(x)g′(x)

This is an important point that we’ve encountered several times before: two methods, ideas, equations, etc., may be logically identical, but that certainly doesn’t mean that they’re psychologically identical. That is, two ways of saying exactly the same thing may be very different in terms of how easy they are to understand. The anti-hammer for multiplication gives us a way to translate one problem into another. Will the translated problem always be easier for us? Well, no. But it might, if we choose a clever translation. Here’s an example of how we can translate a problem. Suppose we want to calculate this:

∫_0^1 x e^x dx

Because of the fundamental hammer, if only we could magically think of some machine M(x) whose derivative was xe^x, then we could say, “Aha! The answer is M(1) − M(0).” As easy as that might sound, it’s not! It is far from clear how to think of a machine whose derivative just happens to be xe^x. So it would appear that we’re stuck. However, the anti-hammer for multiplication suggests a path forward. If we can dream up two machines f and g for which f′(x)g(x) = xe^x, then this new anti-hammer lets us transform the problem into a slightly different one. Let’s start by choosing f′(x) ≡ e^x and g(x) ≡ x, and we’ll write AHM above an equals sign when we’re using the anti-hammer for multiplication. Then we could rewrite the problem like this:

∫_0^1 x e^x dx = ∫_0^1 f′(x)g(x) dx =(AHM)= f(x)g(x) |_0^1 − ∫_0^1 f(x)g′(x) dx

So, we decided on f′ and g ourselves, but now this anti-hammer has spat out a sentence involving f and g′. We don’t know those yet, so we need to figure them out to make sense of that sentence. Getting f is pretty simple. We defined f′(x) ≡ e^x, and this is the extremely special machine that is its own derivative, so it’s a perfectly good anti-derivative for itself: f(x) = e^x. What about g′(x)? That’s simple too. We defined g(x) ≡ x, so g′(x) = 1. Now we can throw all this information into the above equation to get

∫_0^1 x e^x dx = xe^x |_0^1 − ∫_0^1 e^x dx

The first piece is just an abbreviation for 1·e^1 − 0·e^0, or e. What about the second piece? Well, we know that e^x is an anti-derivative of e^x, so by the fundamental hammer, we have ∫_0^1 e^x dx = e^1 − e^0 = e − 1. Putting it all together, we have

∫_0^1 x e^x dx = e − (e − 1) = 1

Nice! By transforming a problem we couldn’t solve into an equivalent but different-looking problem using the anti-hammer for multiplication, we were able to solve the problem easily, and we found that the answer was simply 1. Things didn’t have to work out so nicely, though. What if we had made an equally correct but slightly different choice for f and g? Let’s see. We could just as well have chosen f′(x) ≡ x and g(x) ≡ e^x. That would have led us to rewrite the problem like this:

∫_0^1 x e^x dx =(AHM)= (x^2/2)e^x |_0^1 − ∫_0^1 (x^2/2)e^x dx

This is even scarier-looking than what we started with. It’s important to emphasize, however, that we didn’t do anything wrong. The above answer is completely correct. We simply made a choice of f′ and g that didn’t make the problem look simpler to us. As we discussed above, this is a general principle about the hammers and anti-hammers. We’re always free to use them, but there’s no guarantee that they’ll transform the problem into something we think is “simpler.” That’s not their fault. It’s a constraint of the human imagination.
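(If you’d rather not take the symbol-pushing on faith, here is a small numerical check — my own sketch, not the book’s — that adds up thin rectangles under xe^x between 0 and 1 and compares the result with the answer 1 that the anti-hammer for multiplication gave us:)

import math

def integral(m, a, b, n=100_000):
    # Crude thin-rectangle sum between a and b.
    dx = (b - a) / n
    return sum(m(a + i * dx) * dx for i in range(n))

direct = integral(lambda x: x * math.exp(x), 0.0, 1.0)   # add up thin rectangles under x*e^x
via_ahm = (1.0 * math.e - 0.0 * 1.0) - (math.e - 1.0)    # e - (e - 1), as derived above
print(direct, via_ahm)   # both are (essentially) 1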

6.3.3 Anti-Hammering Reabbreviation

Let’s first remind ourselves about the original hammer for reabbreviation, and how it was used. The hammer for reabbreviation (called the “chain rule” in textbooks) was a helpful tool that we invented by lying and correcting for the lie. For example, suppose we were stuck trying to figure out the derivative of the scary-looking machine m(x) ≡ [V(x)]^795, thinking of x as the variable.2 We need to calculate dm/dx. As we’ve seen before, reabbreviation helps. First, notice that [V(x)]^795 is just some stuff to a power, so using the abbreviation s ≡ V(x), we can write m(x) ≡ s^795. Now that we’ve chosen this abbreviation, the hammer for reabbreviation helps us in the following way. We can write:

dm/dx = (dm/ds)(ds/dx) = 795 s^794 (ds/dx) = 795 [V(x)]^794 V′(x)

Following the convention we’ve been using since Chapter 4, the abbreviation V(x) stands for what the textbooks call sin(x).

Okay, we’ve refreshed our memory about the hammer for reabbreviation, but we still haven’t invented the corresponding anti-hammer. Here’s the idea. As before, we always have the right to reabbreviate, but we’re not guaranteed that it will help. Let’s look at a specific example.

Reabbreviation Play

Imagine we’ve got a problem that looks like this, and we’re stuck:

∫_a^b x e^(x^2) dx

Because of the fundamental hammer, if only we could magically think of some machine M(x) whose derivative was xe^(x^2), then we would know that the above confusing bag of symbols is equal to M(b) − M(a). It’s not at all clear how to think of such a machine, but we’re free to play around by reabbreviating things. Here’s one strategy: we’ve never dealt with e^(x^2) before, but we have dealt with e^x. However, e^(x^2) is just e^s, where s ≡ x^2. So we can write:

∫_a^b x e^s dx    (6.14)

We haven’t really done anything. We haven’t even lied. We just reabbreviated. However, we’ve got two letters floating around, s and x. This isn’t illegal, but it makes things a bit more confusing, and it’s not clear whether or not we can use the fundamental hammer, because the fundamental hammer involved only one variable. So maybe if we get rid of all the x’s and talk about the entire problem in s language, life might be a bit easier. Well, if s ≡ x^2, then x = s^(1/2), so we might be tempted to write:

∫_a^b s^(1/2) e^s dx

That’s perfectly correct, but it’s scary-looking, so forget that, and instead let’s go back to the less scary equation 6.14 and stare at it for a moment. We want to turn the dx piece into something in s language, so it might help to try to relate dx and ds to each other. Derivatives do that, so it might help to compute a derivative. Maybe not, but let’s try. Since s ≡ x^2, we have

ds/dx = 2x    or, rearranging,    dx = ds/(2x)

By a huge stroke of luck, everything collapses into simplicity when we substitute this into equation 6.14. The x pieces kill each other, and we get

∫_a^b x e^s (ds/(2x)) = (1/2) ∫_a^b e^s ds

But wait — the a and b were secretly abbreviations for x = a and x = b. We have to remind ourselves of that so that we don’t confuse the sentence x = a with the sentence s = a. Let’s just remind ourselves of that by writing

(1/2) ∫_{x=a}^{x=b} e^s ds

Okay, since e^s is its own derivative (thinking of s as the variable) it’s also its own anti-derivative, so we can use the fundamental hammer and write

(1/2) ∫_{x=a}^{x=b} e^s ds = (1/2) e^s |_{x=a}^{x=b} = (1/2) (e^(b^2) − e^(a^2))

And we’re done. We just showed that

∫_a^b x e^(x^2) dx = (1/2) (e^(b^2) − e^(a^2))

This may not look like a very simple answer, but it tells us all sorts of crazy things that are far from obvious. For example, when a = 0 and b = 1 it says

∫_0^1 x e^(x^2) dx = (1/2) (e − 1)
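(Here’s the same kind of numerical sanity check for this claim — again a sketch of my own, not the book’s — comparing a crude thin-rectangle sum for the integral of xe^(x^2) from 0 to 1 against (1/2)(e − 1):)

import math

def integral(m, a, b, n=100_000):
    # Crude thin-rectangle sum between a and b.
    dx = (b - a) / n
    return sum(m(a + i * dx) * dx for i in range(n))

direct = integral(lambda x: x * math.exp(x ** 2), 0.0, 1.0)
claimed = 0.5 * (math.e - 1.0)
print(direct, claimed)   # both about 0.859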

How Is This an Anti-Hammer?

The specific example above was. . . well. . . a specific example, but there’s a much more general principle hiding inside it. Trying to extract this general principle will hopefully make more clear how this style of reasoning is, in a sense, the “opposite” of the hammer for reabbreviation. Suppose we’re stuck on a problem that looks like ∫_a^b m(x) dx. If we can somehow think of a machine M whose derivative is m, then we can use the fundamental hammer and be done with the problem. But what if we get stuck trying to think of such a machine M? As always, we can try to reabbreviate and see if it helps. Suppose we reabbreviate a big scary chunk of symbols inside the integral as s. Let’s call whatever is now inside the integral m̂(s), even though m(x) and m̂(s) are really just two different abbreviations for the same thing. I’m writing the “hat” on m̂ instead of just writing m(s) to emphasize that m̂(s) is the machine m(x) written in s language. It’s not simply m(x) with s plugged in in place of x! For example, if m(x) ≡ [V(x) + 7x − 2]^795, and if we abbreviate s ≡ V(x) + 7x − 2, then m̂(s) would be s^795. Since we’re throwing two letters around, I’ll change the a and b into x = a and x = b to remind us what we’re saying. Using these ideas lets us write:

∫_{x=a}^{x=b} m(x) dx = ∫_{x=a}^{x=b} m̂(s) dx

It’s not clear how we can use the fundamental hammer while we have two letters floating around. Let’s try to eliminate more x’s by writing them in s language. We need to have a ds instead of a dx on the far right, so we can lie and correct for the lie to write dx as (dx/ds) ds, and try to re-express all the remaining x’s (the ones in dx/ds, and the ones in the sentences x = a and x = b) in s language.

Will this help? We can’t tell. We’re not even looking at a specific problem! This is just an abstract description of what we did in the example above with xe^(x^2). Let’s summarize the idea in a box.

Anti-Hammer for Reabbreviation

If we’re stuck on something like

∫_{x=a}^{x=b} m(x) dx

then we’re always free to come up with an abbreviation s for a bunch of scary stuff in m(x), and rewrite m(x) as m̂(s), where m̂(s) is some hopefully less-scary-looking way of writing m(x). Then we can lie and correct for the lie to rewrite the problem like this:

∫_{x=a}^{x=b} m(x) dx = ∫_{x=a}^{x=b} m̂(s) (dx/ds) ds

If we can rewrite all the x’s in s language, we’ll have translated the original problem into a different one. This won’t necessarily make the problem simpler, but it might if we reabbreviate cleverly. For some reason, textbooks call this process “u-substitution,” and use the letter u where we used s. But it’s all just reabbreviation.

As simple as this idea is, it’s hard to come up with abbreviations that describe exactly how simple it is. Here’s another way to think about it. Suppose we’re going through this whole annoying process of (i) reabbreviating scary-looking chunks by s, (ii) lying and correcting for the lie to get everything in s language, and then (iii) staring at the rewritten version of the original problem to see if it looks any easier. It turns out that this process will automatically recognize when the machine m(x) can be thought of as arising from the original hammer for reabbreviation. To see what I mean, let’s look at a specific example. Suppose we’re stuck trying to calculate something like this:

∫_a^b 51 (5x^4 + 17) (x^5 + 17x − 3)^999 dx    (6.21)

Now, it turns out that this machine m can be thought of as arising from taking the derivative of this big ugly machine

M(x) ≡ (51/1000) (x^5 + 17x − 3)^1000

by using the hammer for reabbreviation, but let’s imagine that we don’t notice that fact. We’re just hopelessly stuck trying to calculate the stuff in equation 6.21. Let’s see where the process of reabbreviation will get us, if we were to try it. Suppose that by luck or insight or anything else, we choose to use the abbreviation s ≡ x^5 + 17x − 3. That’s one of the uglier pieces in equation 6.21, so this strategy makes a certain amount of sense. This changes the problem to

∫_{x=a}^{x=b} 51 (5x^4 + 17) s^999 dx    (6.22)

Now, this process won’t help us unless we can get everything in s language, so let’s start by trying to translate the dx piece into s language. Since s ≡ x^5 + 17x − 3, we know that

ds/dx = 5x^4 + 17    or, rearranging,    dx = ds/(5x^4 + 17)

Substituting this expression for dx into equation 6.22 collapses everything very nicely, to give us

∫_{x=a}^{x=b} 51 s^999 ds

Now the problem is slightly less crazy-looking! Can we think of a machine whose derivative (thinking of s as the variable) is 51 s^999? Well, it had better look like # · s^1000 so that the power turns into 999 when we differentiate it. But then we have to make sure the # is such that 1000 · # = 51, which means that # = 51/1000. Putting it all together, we figured out that an “anti-derivative” of 51 s^999 is

M(x) = (51/1000) s^1000 = (51/1000) (x^5 + 17x − 3)^1000

So the answer to the original problem, which seemed so impossible to begin with, is just M(b) − M(a), where M is the ugly machine in the above equation. Notice that we didn’t have to recognize that m(x) in equation 6.21 arises from using the hammer for reabbreviation on M. We didn’t even have to know what M was! Rather, just by defining s to be an abbreviation for the ugliest piece in equation 6.21 and then translating all the x stuff into s language, we found that we had transformed the scary-looking problem we started with into the much simpler problem of computing ∫ 51 s^999 ds. The net effect of this process was that we ended up doing a mathematical dance that, in the end, told us the anti-derivative M(x), even though we ourselves couldn’t just magically think of M(x) all in one step. This reabbreviation process — as difficult as it can be to explain in symbols — effectively bootstraps us up past our own ignorance to a place where we can solve problems that we couldn’t solve without reabbreviating.
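(To watch this bootstrapping work numerically, here is a sketch of my own — not from the book — that uses a scaled-down cousin of the example above: the same structure, but with the power 999 replaced by 5 and the 51 replaced by 6, so the numbers stay small enough for a computer’s floating-point arithmetic. It checks that adding up thin rectangles under m agrees with M(b) − M(a), where M is the anti-derivative that the reabbreviation dance hands us:)

def integral(m, a, b, n=400_000):
    # Crude thin-rectangle sum between a and b.
    dx = (b - a) / n
    return sum(m(a + i * dx) * dx for i in range(n))

def m(x):
    # scaled-down stand-in for 51(5x^4 + 17)(x^5 + 17x - 3)^999
    return 6 * (5 * x ** 4 + 17) * (x ** 5 + 17 * x - 3) ** 5

def M(x):
    # its anti-derivative, found by exactly the reabbreviation dance described above
    return (x ** 5 + 17 * x - 3) ** 6

a, b = 0.0, 1.0
print(integral(m, a, b))   # close to 11,389,896 (slightly off, since the rectangles are only finitely thin)
print(M(b) - M(a))         # exactly 15**6 - (-3)**6 = 11,389,896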

6.3.4 Collecting the Anti-Hammers

Having forged three anti-hammers, one for each of the originals, let’s summarize all of them in abbreviated form.

Anti-Hammer for Addition (AHA)

Suppose we’re stuck on something that looks like

∫_a^b m(x) dx

If we can think of machines f and g for which m(x) = f(x) + g(x), we’re free to break the problem apart like this, if it helps:

∫_a^b m(x) dx = ∫_a^b f(x) dx + ∫_a^b g(x) dx

Anti-Hammer for Multiplication (AHM)

Suppose we’re stuck on something that looks like

∫_a^b m(x) dx

If we can think of machines f and g for which m(x) = f′(x)g(x), we’re free to transform the problem like this, if it helps:

∫_a^b m(x) dx = ∫_a^b f′(x)g(x) dx = f(x)g(x) |_a^b − ∫_a^b f(x)g′(x) dx

Anti-Hammer for Reabbreviation (AHR)

Suppose we’re stuck on something that looks like

∫_{x=a}^{x=b} m(x) dx

If we can think of an abbreviation s for which m(x) can be written in a simpler-looking form m̂(s), then we’re free to transform the problem like this, if it helps:

∫_{x=a}^{x=b} m(x) dx = ∫_{x=a}^{x=b} m̂(s) (dx/ds) ds

6.3.5 The Other Fundamental Hammer

At the beginning of this chapter, we discovered the fundamental hammer, and discussed how its basic message was that integrals and derivatives were opposites. However, what we really established is that integrals and derivatives are opposites if the derivative shows up inside the integral. It’s easy to remember the general idea that integrals and derivatives are opposites, but the story would be less elegant if they were only opposites in a given order — that is, if we do the derivative first, and then the integral. This desire for a more elegant narrative motivates us to see if there is any sense in which derivatives and integrals are opposites in the other order. What we might think to try first is to see if we can calculate

(d/dx) ∫_a^b m(x) dx

but this expression turns out to be a bit misleading. That is, the x in the expression is not really a “variable” in the same sense as x is a variable in an expression like f(x) ≡ x^2. In technical jargon, the x in an integral is what is called a “bound variable,” which is to say that it is not something that we can plug a number into, but simply a placeholder. It serves the same purpose as the letter i does in the expression

∑_{i=1}^{3} i^2    (6.25)

This is just a fancy way of writing the number 14 (because 1+4+9 = 14), so it doesn’t make sense to plug something like i = 17 into equation 6.25. We could change i to some other letter like j or k in equation 6.25 and the expression would still just be a fancy way of writing the number 14. For the same reasons, the expression ∫_a^b m(x) dx doesn’t depend on x, and it’s no different from the expressions ∫_a^b m(s) ds and ∫_a^b m(q) dq. It might then appear that our problem is solved. Since ∫_a^b m(x) dx is just a number, independent of x, we can write

(d/dx) ∫_a^b m(x) dx = 0

Hmm. . . This is hardly another version of the fundamental hammer. If derivatives always kill integrals from the outside, then it would appear that the two concepts are not opposites after all. However, such a conclusion would be hasty. What we really need is a different way of thinking about the problem. The derivative (with respect to x) of the integral was equal to zero because the integral didn’t depend on x. Maybe if we ask a slightly different question, we’ll get something more interesting. Let’s try to take the derivative with respect to the number on the top of the integral:

(d/dx) ∫_a^x m(s) ds

where we’re using s instead of x as a label for the “bound variable” to avoid any confusions that might result from using x for both. There are two ways we might unravel this weird expression. First, if we write M for the anti-derivative of m, we can simply use the version of the fundamental hammer that we already discovered to obtain

(d/dx) ∫_a^x m(s) ds = (d/dx) [M(x) − M(a)] = M′(x) = m(x)

Another way to show the same thing is to make a tricky, informal argument using the definition of the derivative, like this:

(d/dx) ∫_a^x m(s) ds = (1/dx) [ ∫_a^(x+dx) m(s) ds − ∫_a^x m(s) ds ]

= (1/dx) ∫_x^(x+dx) m(s) ds

where in passing from the first line to the second we basically just imagined tearing the full area into two pieces, letting us say that [the area from a to (x+tiny)] minus [the area from a to x] is just [the area from x to (x+tiny)]. Well, it might seem like we’re stuck, but if we remember what everything means, then it isn’t too hard to get unstuck. The funny term ∫_x^(x+dx) m(s) ds refers to the area under m’s graph between two points that are infinitely close to each other: x and x+dx. This is therefore the area of a rectangle of width dx and height m(x), which is of course m(x)dx. The final line above is just 1/dx times that, so the final line must be m(x). Or, tying it all together,

(d/dx) ∫_a^x m(s) ds = m(x)

and we therefore obtained the same answer as before: integrals and derivatives undo each other in either direction. To summarize what we’ve shown, I’ll list both versions of the fundamental hammer here, and I’ll write the old version in a slightly different way, to illustrate its relationship to the new version:

The Fundamental Hammer (Old Version)

∫_a^b (dM(s)/ds) ds = M(b) − M(a)

The Fundamental Hammer (New Version)

(d/dx) ∫_a^x m(s) ds = m(x)
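(And here is one last numerical sanity check — my own sketch, with an arbitrarily chosen machine, not anything from the book — for the new version: approximate the area from a up to x with thin rectangles, nudge the top of the integral by a tiny amount, and check that the resulting rate of change is just m(x):)

import math

def integral(m, a, b, n=100_000):
    # Crude thin-rectangle sum between a and b.
    dx = (b - a) / n
    return sum(m(a + i * dx) * dx for i in range(n))

def m(s):
    return math.cos(s) + s ** 2    # an arbitrary machine

a, x, tiny = 0.0, 1.3, 1e-4

def area_up_to(t):
    return integral(m, a, t)

rate = (area_up_to(x + tiny) - area_up_to(x)) / tiny   # derivative of the area with respect to x
print(rate, m(x))   # both about 1.96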