
Burn Math Class: And Reinvent Mathematics for Yourself (2016)

Act III

The Infinite Beauty of the Infinite Wilderness

5. The Infinite Jackpot: Putting Our Ideas to Work for Us

.5.1 Testing Our Inventions by Reinventing the Known

This shuddering before the beautiful, this incredible fact that a discovery motivated by a search after the beautiful in mathematics, should find its exact replica in Nature, persuades me to say that beauty is that to which the human mind responds at its deepest and most profound.

—Subrahmanyan Chandrasekhar, Truth and Beauty: Aesthetics and Motivations in Science

With all the talk to this point about the analogy between single- and multivariable calculus on the one hand, and cannibal calculus on the other, a question remains unanswered. Sure, we may have chosen our definitions such that the operations by which derivatives are computed in cannibal calculus are the familiar ones. Sure, we forced derivatives in cannibal calculus to behave sufficiently similarly to those in single- and multivariable calculus that we didn’t really have to learn anything new about how to compute derivatives. We just change notation to δ instead of ∂ or d, and we can write all sorts of intelligent-looking equations like δF[f(x)]/δf(x̄) = 2f(x̄), which is really just a generalized and disguised version of the equation (d/dx)x² = 2x.

However, it’s not at all clear at this point whether simply defining derivatives to behave this way will in any sense preserve their meaning in other senses. Just how seriously can we take the analogy between the new and the old? For example, in single- and multivariable calculus, we could find flat points4 of a machine by forcing the machine’s derivative to be zero and then figuring out where that occurs. But now all of our equations are balancing precariously between two different interpretations: one in which we think of machines as the familiar curvy lines that we can graph in two dimensions, and another interpretation in which we think of machines as “vectors with infinitely many slots.” As such, if we simply start with a cannibalistic machine, like

F[f(x)] ≡ ∫ₐᵇ [f(x)]² dx

and then force its functional derivative to be zero for all slots of the “vector” f(x) (that is, for all x̄), then it is far from clear whether the end product of this process will in any sense be the place where the functional is maximized or minimized. Just because we can calculate functional derivatives using all the techniques we’ve always been able to does not necessarily mean that “derivative equals zero” still means “flat point.”

4. That is, local maxima, local minima, and saddle points, in textbook jargon.

As has been the case throughout our journey, we can’t simply look in a textbook to see if “derivative equals zero” still means “flat point.” And I cannot simply say “Yep, it does. Let’s use that fact.” Therefore, if we want to figure out whether there is a useful sense in which “derivative equals zero” still means “flat point,” then we should do what we’ve always done: look at some simple cases and see if our new idea reproduces what we expect.

To begin with, let’s look at the familiar cannibal machine above, namely, F[f(x)] ≡ ∫ₐᵇ [f(x)]² dx. Since [f(x)]² is never negative, it seems intuitively clear that no machine f can make F[f(x)] be negative. Moreover, the only machine f for which F[f(x)] will be exactly zero is the machine f(x) ≡ 0. Thinking of the integral graphically, if f(x) is ever some nonzero number, negative or positive, at all of the points within even a small interval, then [f(x)]² will be positive, we’ll get more than zero area, and that will make F[f(x)] bigger than zero. So intuitively, we know that in the space of all possible machines (whatever that means) the machine f(x) ≡ 0 is the one for which F[f(x)] is smallest. Therefore, if “derivative equals zero” still means “flat point” in our new cannibal calculus, then it had better be the case that the mathematics spits out f(x) ≡ 0 as the answer when we do old-fashioned optimization.5 Let’s do that. As we already know, the functional derivative of F is

δF[f(x)]/δf(x̄) = 2f(x̄)

5. The term “optimization” refers to our familiar process of finding flat points, where we force the derivative to be zero and then determine which points make that condition true.

Forcing this to be true for all slots gives

2f(x̄) = 0

which is saying the same thing as f(x̄) = 0 for all x̄. So f is always zero, which is exactly what we predicted in advance. Hooray!
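If symbols alone don’t convince you, here’s a rough numerical sketch of the same check (mine, not part of the derivation): chop the interval into finitely many slots, so that F becomes an ordinary function of finitely many variables, and slide downhill along the functional derivative 2f(x). The particular starting machine, step size, and number of slots below are arbitrary illustrative choices.

```python
import numpy as np

# Discretize: a "machine" f is now just N numbers, one per slot.
a, b, N = 0.0, 1.0, 200
x = np.linspace(a, b, N)
dx = x[1] - x[0]

def F(f):
    """Discretized version of F[f(x)] = integral from a to b of f(x)^2 dx."""
    return np.sum(f**2) * dx

# Start from some arbitrary machine.
f = np.sin(2 * np.pi * x) + 0.5
print("F before:", F(f))

# Gradient descent: nudge every slot a little bit against the
# functional derivative, which is 2*f(x) at each slot.
for _ in range(2000):
    f -= 0.01 * (2 * f)

print("F after:", F(f))                            # essentially zero
print("largest |f(x)| left:", np.max(np.abs(f)))   # essentially zero
```

Sliding downhill along the functional derivative really does drag any starting machine toward f(x) ≡ 0, exactly the flat point the symbols predicted.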

Let’s try one more simple example to check the validity of our new ideas. At the end of Chapter 6, we showed that the arclength, or length of a machine’s graph between two points a and b, can be written

L[f(x)] ≡ ∫ₐᵇ √(1 + [f′(x)]²) dx

We demonstrated this by zooming in on the machine’s graph, applying the formula for shortcut distances, and then zooming back out and adding up the tiny lengths.
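That “zoom in, add up the tiny lengths” picture is easy to act out numerically, too. Here’s a small sketch (again mine, with arbitrary example machines) that literally adds up the tiny shortcut distances √(dx² + df²):

```python
import numpy as np

a, b, N = 0.0, 1.0, 100_000
x = np.linspace(a, b, N)

def arclength(f_values):
    """Add up the tiny shortcut distances sqrt(dx^2 + df^2) along the graph."""
    dx = np.diff(x)
    df = np.diff(f_values)
    return np.sum(np.sqrt(dx**2 + df**2))

# A straight line from (0, 0) to (1, 2): its length should be sqrt(1^2 + 2^2).
line = 2 * x
print(arclength(line), np.sqrt(5))   # both about 2.2360679...

# A wigglier machine between the same two endpoints comes out longer.
wiggly = 2 * x + 0.1 * np.sin(6 * np.pi * x)
print(arclength(wiggly))             # bigger than sqrt(5)
```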

Now, we all know intuitively that the shortest distance between two points is a straight line, and there’s no point in using our new high-powered machinery to demonstrate this fact. However, we can use this fact to check the validity of our cannibal calculus methods. The reasoning behind this is the same as it was above: if our cannibal calculus methods are indeed working the way we expect them to, then it had better be the case that going through the whole “derivative equals zero” rigmarole will spit out the answer that L[f(x)] is minimized for straight lines. If it doesn’t spit out that answer, then we’ll know our definitions don’t quite behave how we wanted them to.

On the other hand, if this process does spit out the sentence “f is a straight line,” then that would give us more confidence that our cannibal calculus methods are on the right track, and that they may continue to work in cases where we don’t know what to expect. Let’s give it a shot. We start by taking the functional derivative of the machine L[f(x)] defined above, like this:

δL[f(x)]/δf(x̄) = ∫ₐᵇ ( f′(x) / √(1 + [f′(x)]²) ) · ( δf′(x)/δf(x̄) ) dx

What now?!

.5.2 Mathematically Enforced Digression

Without a constant misuse of language there cannot be any discovery, any progress.

—Paul Feyerabend, Against Method

At this point, it would appear that we’ve reached an impasse. By “impasse,” I mean:

We have no idea what δf′(x)/δf(x̄) is. As always, when our symbol gymnastics have resulted in an expression we don’t understand, it’s helpful to go back to the drawing board and ask what everything meant to begin with. Recall that whenever we’re looking at a derivative of any kind, we’re always looking at something like

dM/d(food) ≡ ( M[food + d(food)] − M[food] ) / d(food)

That is, we begin with a machine M that eats some food. In this case, the food is an entire machine f(x). Then we make a tiny change to the food, changing it from food to food + d(food), and we look at the difference in the machine’s response between the two cases, which is to say dM ≡ M[food + d(food)] − M[food], and the derivative is just the change in output dM divided by the change in input d(food). How can we use this idea to figure out what on earth to do with this:

δf′(x)/δf(x̄)
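Before wrestling with that particular expression, it’s worth watching the general “change the food, divide by the change” recipe happen numerically. Here’s a rough sketch (mine, not part of the argument), using the familiar machine F[f(x)] = ∫ₐᵇ f(x)² dx as the guinea pig; the bookkeeping choice of dividing by ε·dx, the “size” of a bump confined to one slot of width dx, is the convention I’m assuming here.

```python
import numpy as np

a, b, N = 0.0, 1.0, 1000
x = np.linspace(a, b, N)
dx = x[1] - x[0]

def F(f):
    """Discretized F[f(x)] = integral from a to b of f(x)^2 dx."""
    return np.sum(f**2) * dx

f = np.sin(x)        # some particular food
i = 400              # the slot we poke: x_bar = x[i]
eps = 1e-6           # a tiny bump

bumped = f.copy()
bumped[i] += eps     # food + d(food): same machine, one slot nudged

numerical = (F(bumped) - F(f)) / (eps * dx)   # change in output / size of bump
predicted = 2 * f[i]                          # the functional derivative 2*f(x_bar)
print(numerical, predicted)                   # agree closely
```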

Well, using the same interpretation as always, the bottom part, δf(x̄), is just the change in food: the δf is an infinitely small function that we add to the original function f in order to determine how L[f(x)]’s response changes. The x̄ lets us know where we’re making the change, the f is the name of the function whose value we’re changing at some point, and the δ is just goofy notation that lets us know that the change is infinitely small.

So we’ve specified what we’re changing: we’re making a tiny change to the function f. That takes care of the bottom part of δf′(x)/δf(x̄). What about the top part, δf′(x)? Well, again using the same interpretation we’ve been using since the beginning, this is just the tiny change in f′(x) that results from the tiny change we made to f(x). Here’s the important idea: of course when we change a function f(x) a little bit, then its derivative will change a little bit. But! We’re not making two independent changes to f(x) and f′(x). Any changes that occur in f′(x) result from whatever we did to f(x). As such, since δf′(x) is just an abbreviation for “the change in f′(x) that results from whatever tiny change we made to f(x),” we can write this:

δf′(x) = [δf(x)]′

where δf(x) is any “tiny function.” If that doesn’t make sense, maybe this will help: the strange symbol δf′(x) should really be written δ[f′(x)] to remind us that it stands for the change in f′(x) that results from whatever we did to f(x). Because of that, we can write

δ[f′(x)] = δ[ (d/dx) f(x) ] = (d/dx) [δf(x)]

which is just a long way of saying

δ[f′(x)] = [δf(x)]′

That is, we can pull primes outside of functional derivatives. Since we’re doing all this to figure out what on earth to do with δf′(x)/δf(x̄), we can use the above equation to write

δf′(x)/δf(x̄) = [δf(x)]′ / δf(x̄)

And remember, the prime here meant “derivative with respect to x,” but δf(x̄) is constant with respect to x, since x̄ refers to some particular slot, so we can go one step further and write:

δf′(x)/δf(x̄) = [ δf(x)/δf(x̄) ]′

Finally, remember that δf(x)/δf(x̄) was just the “Dirac delta function” we introduced earlier, the function that is zero everywhere except for an infinitely tall spike when x = x̄. So we’ve now arrived at the completely bizarre equation

δf′(x)/δf(x̄) = [δ(x − x̄)]′ ≡ δ′(x − x̄)

. . .What do we do now?!
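Before moving on, here’s a rough numerical peek (mine, with arbitrary choices of machine and slot) at what this bizarre object looks like when we work with finitely many slots: bump f at a single slot x̄, look at how the ordinary derivative f′ changes, and divide by the size of the bump. Everything is zero except right next to x̄, where a huge up-spike and a huge down-spike sit side by side, which is just the sort of thing the “derivative of an infinitely tall, infinitely thin spike” ought to be.

```python
import numpy as np

N = 2001
x = np.linspace(0.0, 1.0, N)
dx = x[1] - x[0]
i = 1000               # the slot x_bar = x[i]
eps = 1e-6             # a tiny bump

f = np.sin(2 * np.pi * x)
bumped = f.copy()
bumped[i] += eps

# (change in f') / (size of the bump), slot by slot
change = (np.gradient(bumped, dx) - np.gradient(f, dx)) / eps
print(change[i - 2 : i + 3])   # roughly [0, +1/(2*dx), 0, -1/(2*dx), 0]
```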

.5.3 Past the Impasse

Rigor is just another word for nothing left to do.

—From Me and Bourbaki McGee by Generalized Janis Joplin6

6. Okay, that’s not a real song. However, one should attempt to provide citations whenever possible, so if you’re looking for a citation of the nonexistent lyrics quoted above, please refer to the primary source: pg 367 of this book.

In the process of stumbling deeper and deeper into the infinite wilderness, we’ve suddenly bumped up against the equation

δf′(x)/δf(x̄) = δ′(x − x̄)

It’s far from clear that this makes sense. What on earth is the derivative of the Dirac delta function? The delta function itself, δ(x), was defined to be equal to zero everywhere except at x = 0. So the term δ(x − x̄) is zero everywhere except when x = x̄. However, the “number” δ(0) wasn’t really a normal number. When we defined the delta function, we found that δ(0) can be thought of as 1/dx, where dx is infinitely small, the result of which is that δ(x) “kills integrals,” in the sense that ∫ₐᵇ f(x) δ(x − x̄) dx = f(x̄), provided of course that x̄ is somewhere between a and b. That’s all well and good, but how do we determine the derivative of an infinitely tall, infinitely thin spike?! It would seem as if the expression we’ve bumped into above has no meaning. However, we didn’t come this far into the wilderness just to give up now. Mathematics is ours. We’re creating it ourselves. So let’s confidently forge ahead, by declaring:

We have absolutely no idea what δ′(x) is, so we’ll simply choose to define it to be whatever it has to be in order to obey all of the stuff we already know about derivatives, in particular the hammers for differentiation and the anti-hammers for integration.

If we choose to do this, then we wouldn’t know what the derivative of δ(x) is, but we would know how it behaves: a familiar situation! Let’s do this and see where we get. This confused exploration started back when we got stuck at the end of equation .19. We got stuck because we didn’t know what to do with δf′(x)/δf(x̄). However, we’ve shown that whatever this is, it can be thought of as the derivative of the delta function, so

δf′(x)/δf(x̄) = δ′(x − x̄)

This lets us pick up where we left off, in equation .19, and write

δL[f(x)]/δf(x̄) = ∫ₐᵇ ( f′(x) / √(1 + [f′(x)]²) ) δ′(x − x̄) dx

The stuff on the right looks very complicated, but it has the form

∫ₐᵇ M(x) δ′(x − x̄) dx

If we can determine what to do with anything of this general scary form, then we can proceed. What can we do? Well, we just decided that! We defined the derivative of the delta function to be whatever it has to be in order to obey all our hammers and anti-hammers, so by definition we can use one of those. More concretely, it would be nice if we could somehow move the derivative in equation .21 from the δ over to the M, because we know what to do with the δ function: it just kills whatever integral it’s inside. We want to move the derivative, and fortunately we have a tool that lets us do something like that: our anti-hammer for multiplication from Chapter 6. Applying that to equation .21 gives us the following big pile of fun:

∫ₐᵇ M(x) δ′(x − x̄) dx = [ M(x) δ(x − x̄) ]ₐᵇ − ∫ₐᵇ M′(x) δ(x − x̄) dx

The first term on the right is just zero, unless x̄ = a or x̄ = b, so we’ll just imagine that x̄ isn’t one of the endpoints a or b, so that we can keep moving. Having gotten rid of that term, we have

∫ₐᵇ M(x) δ′(x − x̄) dx = −∫ₐᵇ M′(x) δ(x − x̄) dx = −M′(x̄)

Beautiful! This tells us what the derivative of the delta function does to an arbitrary function inside an integral, namely

∫ₐᵇ M(x) δ′(x − x̄) dx = −M′(x̄)

Notice how similar this is to the defining behavior of the original delta function itself, which killed integrals in a slightly different way:

∫ₐᵇ M(x) δ(x − x̄) dx = M(x̄)
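Neither of these “integral killing” rules is hard to spot-check numerically. Here’s a rough sketch (mine, with an arbitrary machine M and an arbitrary spot x̄) that uses a very narrow but finite spike as a stand-in for δ(x − x̄), and that spike’s ordinary derivative as a stand-in for δ′(x − x̄):

```python
import numpy as np

x = np.linspace(-5.0, 5.0, 200_001)
dx = x[1] - x[0]
x_bar, width = 0.7, 0.001

# A very narrow spike of total area 1 (a stand-in for delta(x - x_bar)),
# and its ordinary derivative (a stand-in for delta'(x - x_bar)).
spike = np.exp(-((x - x_bar) ** 2) / (2 * width**2)) / (width * np.sqrt(2 * np.pi))
spike_prime = np.gradient(spike, dx)

M = np.cos(x) + x**2                          # an arbitrary machine M(x)
M_at_xbar = np.cos(x_bar) + x_bar**2
M_prime_at_xbar = -np.sin(x_bar) + 2 * x_bar

print(np.sum(M * spike) * dx, M_at_xbar)                 # roughly  M(x_bar)
print(np.sum(M * spike_prime) * dx, -M_prime_at_xbar)    # roughly -M'(x_bar)
```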

Having discovered how the derivative of the delta function behaves when it shows up under integrals, we can pick up where we got stuck yet again, back in equation .20. This lets us write:

δL[f(x)]/δf(x̄) = ∫ₐᵇ ( f′(x) / √(1 + [f′(x)]²) ) δ′(x − x̄) dx = − d/dx̄ [ f′(x̄) / √(1 + [f′(x̄)]²) ]

Fortunately, we don’t have to actually compute this awful derivative if we remember our original goal. We’re trying to see whether setting the derivative of the arclength functional equal to zero (and then figuring out which functions are the “flat points” in function space) reproduces the result we all know by intuition: that the shortest path between two points is a straight line. As such, we’re setting all of the above equal to zero, so what we really have at this point is

d/dx̄ [ f′(x̄) / √(1 + [f′(x̄)]²) ] = 0

So, we’re forcing the above equation to be true for all possible points x̄. But the above equation is just the derivative of some stuff, and if the derivative of some stuff is zero at every point x̄, then that stuff has to be a constant. Therefore, we can write

f′(x) / √(1 + [f′(x)]²) = c

where c is some constant number we don’t know.

Now, it’s not clear what to do, but maybe if we do some symbol gymnastics, we can isolate f′(x). Squaring both sides of the above equation and throwing the bottom over to the right side gives

[f′(x)]² = c²(1 + [f′(x)]²) = c² + c²[f′(x)]²

which tells us that

[f′(x)]²(1 − c²) = c²

and therefore,

f′(x) = c / √(1 − c²)

But if c is just a number we don’t know, then c / √(1 − c²) is also just a number we don’t know, so we might as well reabbreviate and call the whole thing a. This lets us write

f′(x) = a

Aaah! The derivative of f is constant! That means f is a line. Or equivalently (we get to say a really fancy thing now), the points in our infinite-dimensional function space that minimize the arclength functional are just the straight lines. To celebrate, let’s write this out as professionally as we can, given our current level of excitement:

In the infinite-dimensional space of all machines, the flat points of the arclength functional L[f(x)] are exactly the straight lines: the machines of the form f(x) = ax + b.

Just to remind ourselves why we care so much, this result is exciting not because we derived the fact that the shortest distance between two points is a straight line. We know that already without any mathematics. The exciting thing about this result is that it gives us a much greater degree of confidence that the cannibal calculus we invented in this chapter is on the right track, and moreover, that it’s actually useful. Simply by performing a few simple operations, virtually identical to the operations of single-variable calculus except for a few minor changes of notation, we were effectively able to search an entire infinite-dimensional space of functions for the ones with some particular property, in this case the property of minimizing the arclength functional.

In a sense, we managed to symbolically “consider” an unimaginably large space of possible paths between two points, and find the paths that get from one point to the other using the shortest distance. This result is a massively important milestone in our journey of inventing mathematics for ourselves. It marks the acquisition of a new superpower: the ability to effectively reason about a space with infinitely many dimensions. Simply write down a functional, and we may be able to find the functions that minimize or maximize it using the methods of this chapter.
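And if you’d like to watch that search happen in slow motion, here’s one last rough numerical sketch (mine, with arbitrary choices of endpoints, number of slots, starting path, and step size): chop a path with pinned endpoints into finitely many slots, and slide downhill on the discretized arclength. The path the computer settles on is, slot by slot, the straight line.

```python
import numpy as np

a, b = 0.0, 1.0
fa, fb = 0.0, 2.0                  # pinned endpoint heights f(a) and f(b)
N = 101
x = np.linspace(a, b, N)
dx = x[1] - x[0]

def arclength(f):
    """Sum of the tiny shortcut distances sqrt(dx^2 + df^2)."""
    return np.sum(np.sqrt(dx**2 + np.diff(f) ** 2))

# Start from a deliberately silly path with the right endpoints.
f = fa + (fb - fa) * x + 0.3 * np.sin(3 * np.pi * x)

# Crude gradient descent on the interior slots (the endpoints stay pinned).
step = 0.02
for _ in range(100_000):
    seg = np.diff(f) / np.sqrt(dx**2 + np.diff(f) ** 2)
    grad = np.zeros_like(f)
    grad[1:-1] = seg[:-1] - seg[1:]        # d(arclength)/d(slot value)
    f -= step * grad

straight = fa + (fb - fa) * x
print(arclength(f), np.sqrt((b - a) ** 2 + (fb - fa) ** 2))  # both about sqrt(5)
print(np.max(np.abs(f - straight)))                          # tiny
```

A hundred and one slots is a far cry from infinitely many, but the moral is the same one the symbols just handed us.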

Our journey has taken us a long way. We’ve climbed from addition and multiplication to infinite-dimensional calculus, and I think it’s time for a break. Let’s summarize what we did in this chapter, and spend the next interlude relaxing. Where should we go? The beach? We could call it something like “Interlude : Building Sandcastles.” Or we could just go back in the book and switch the order of a bunch of sentences to try to confuse the past versions of ourselves who were reading the chapter, and then check to see if we were still confused in the present. If you’re feeling confused after that sentence. . . maybe that’s why. Actually, where do you want to go? I haven’t let you decide where we go for an interlude yet. You don’t have to make up your mind for a page or two. There’s still a Reunion to write. But whatever we do, let’s relax, and most importantly, let’s make sure to stay away from school. We earned it.