Enough Symbol Games! How Can We Picture This - New Is Old - Burn Math Class: And Reinvent Mathematics for Yourself (2016)

Act III

N. New Is Old

N.3. Enough Symbol Games! How Can We Picture This?

N.3.1. Multidimensional Mind Tricks

Despite the inability of the human mind to directly visualize dimensions higher than three, it is not necessary to abandon your visual intuition in moving to higher-dimensional mathematics. Nor is it necessary to develop magical methods of visualizing spaces of four, or ten, or infinitely many dimensions. If that sounds like a contradiction, read it again. Here’s the trick, and it’s something I was never told in multivariable calculus: a surprisingly large number of mathematicians visualize n-dimensional space simply by (drumroll) picturing three-dimensional space!

If that sounds ridiculous... good! But it’s true. This is something we’re not explicitly taught in any course, but it slowly dawns on many students of mathematics as they watch the great mathematicians of the previous generation reasoning out the answer to a question. Many’s the time that I’ve posed a question about higher dimensions to an extremely intelligent mathematician and watched them think for a moment, realize they couldn’t simply reason out the answer mentally, walk to the blackboard, draw a two-dimensional picture of a two- or three-dimensional object, and in doing so, figure out the answer! I’ve seen this occur more times than I can count, with questions ranging from the four-dimensional (in Riemannian geometry and general relativity) to the infinite-dimensional (in functional analysis and quantum mechanics).

Thinking about two or three dimensions may not help us to visualize n dimensions, but it can certainly help us to reason about n dimensions. It would be nice to experience this feeling ourselves. In the next part of this section, we’ll use our visual intuition in three dimensions to derive a fact about n-dimensional calculus. There, we’ll invent the formula

$$dm = \frac{\partial m}{\partial x}\,dx + \frac{\partial m}{\partial y}\,dy \tag{N.18}$$

for a machine m(x, y) of two variables. Having done so, we should immediately be able to see why the more general form of this expression, namely

$$dm = \frac{\partial m}{\partial x_1}\,dx_1 + \frac{\partial m}{\partial x_2}\,dx_2 + \cdots + \frac{\partial m}{\partial x_n}\,dx_n \tag{N.19}$$

is true for a machine $m(x_1, x_2, \ldots, x_n)$ of n variables. As intimidating as these equations look at the moment, we’ll soon see, as we have before, how a simple change of abbreviations can change everything.

N.3.2. The Trick in Action

Figure N.1: Visualizing a machine m that eats two numbers and spits out one number. We can picture the two numbers x and y that the machine eats as coordinates on the “ground.” If we feed the point (x, y) on the ground to the machine m, it will spit out a number m(x, y), which we can visualize as the “height” of the graph floating above the point (x, y). Each point on the ground gets a height, so the graph of such a machine is a two-dimensional surface.

Notice that a “two in, one out” machine m(x, y) can be visualized as a (possibly curvy) surface floating off the ground. See Figure N.1 for a more detailed explanation.

Now we can extend the idea of our infinite magnifying glass to Figure N.1. When we zoom in on the curvy surface, it should look like a flat plane. Let’s imagine picking an arbitrary point on the graph in Figure N.1 and zooming in infinitely far. The result is the tilted (but not curvy) plane pictured in Figure N.2.

We’ve talked about partial derivatives, but it would be nice if we could come up with a single-derivative-like idea: a “total derivative,” if you will. Instead of thinking of x and y as separate numbers, let’s (just for the moment) think of them as the components of a single object: a “vector,” which we’ll write as (x, y). Now we can proceed as we always have in defining a new type of derivative: start with a machine m, which we feed some food. In this case, the food is the vector (x, y). Then the machine spits out a number m(x, y). Then we make a tiny change to the food, by adding a “tiny vector” (dx, dy) to it. This gives us m(x + dx, y + dy). Then, as always, we see what changed before and after, so we look at m(after) − m(before), or equivalently,

$$dm \equiv m(x + dx,\, y + dy) - m(x, y) \tag{N.20}$$

Usually we would now divide by the tiny change in food to get some kind of derivative. However, we’re changing every slot at once, so the food is now an entire vector. To compute a derivative in the normal sense, we’d have to say what we mean by “dividing by a vector.” Let’s not do that right now, and instead just look at the top piece: dm, as defined above.

Figure N.2: We zoomed in infinitely far on a curvy surface to get a flat (but tilted) surface. In this picture, we’re labeling the horizontal coordinates.

The notation dm stands for a tiny change in the “height” of m’s graph when we change both inputs by an infinitely small amount: changing x to x + dx and changing y to y + dy. At this point, it helps to draw a picture of where we zoomed in on our curvy surface. There are a lot of things we could potentially label, so I’ve split them across three pictures. Here’s what we’re labeling in the three pictures.

1. In Figure N.2, I’ve just drawn four different points on the “ground,” namely (x, y), (x + dx, y), (x, y + dy), and (x + dx, y + dy).

2. In Figure N.3, I’ve drawn the names for the output or “height” of the graph at each of the points from Figure N.2. These heights are called m(x, y), m(x + dx, y), m(x, y + dy), and m(x + dx, y + dy).

3. In Figure N.4, I’ve drawn the tiny differences in height $d_x m$ and $d_y m$. Recall that the definition of the former was $d_x m \equiv m(x + dx, y) - m(x, y)$, so if we think of the point (x, y) as the starting point, then $d_x m$ is the tiny amount of height we would gain by walking an infinitely small distance dx in the x direction. Similarly, starting at (x, y), $d_y m$ is the tiny amount of height we would gain by walking an infinitely small distance dy in the y direction.
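To make these two tiny height differences concrete, here is a minimal Python sketch. The machine `m` below is a hypothetical example (it does not come from the text), the helper names `d_x` and `d_y` are made up for illustration, and the step `1e-6` is only a stand-in for an infinitely small dx or dy, so the slopes come out approximately rather than exactly.

```python
def m(x, y):
    # Hypothetical "two in, one out" machine, chosen only for illustration.
    return x**2 * y + 3.0 * y


def d_x(machine, x, y, dx):
    """Height gained by walking a distance dx in the x direction from (x, y)."""
    return machine(x + dx, y) - machine(x, y)


def d_y(machine, x, y, dy):
    """Height gained by walking a distance dy in the y direction from (x, y)."""
    return machine(x, y + dy) - machine(x, y)


x, y = 1.0, 2.0
dx = dy = 1e-6  # small finite steps standing in for infinitely small ones

# Dividing each tiny rise by the tiny step gives the steepness in that direction.
slope_x = d_x(m, x, y, dx) / dx  # close to 2*x*y = 4 for this machine
slope_y = d_y(m, x, y, dy) / dy  # close to x**2 + 3 = 4 for this machine
print(slope_x, slope_y)
```

Both helpers start from the same point (x, y) and change only one slot, which is exactly what the labels in Figure N.4 describe.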

Figure N.3: Same idea as Figure N.2, but now we’re labeling the vertical coordinates.

So far we haven’t really done much except zooming in and naming things, but we’re surprisingly close to deriving the intimidating equations N.18 and N.19! Before we move on, make sure to stare at the three pictures in Figures N.2, N.3, and N.4, and make sure you understand why everything is labeled the way it is.

Now, because pointing is hard in a book, I’ll have to define some terms. I’ll define the “left journey” as follows: imagine starting on the graph above the point (x, y) in Figure N.2, and walking left along the graph, in the x direction, until you get to the point above (x + dx, y). This is the first leg of the left journey. After completing the first leg, your height has increased by an amount $d_x m$ (make sure you see why). Now, for the second leg of the left journey, imagine continuing walking from your current location up to the top, the point above (x + dx, y + dy). On this leg of the journey, you’re only walking in the y direction, and your height would increase by an amount height(finish) − height(start), or

$$\hat{d}_y m \equiv m(x + dx,\, y + dy) - m(x + dx,\, y)$$

I put a hat on the d because we’re already writing $d_y m$ to mean m(x, y + dy) − m(x, y), and the hat just reminds us that these two quantities aren’t the same, or rather, they don’t look the same. (In a moment, we’ll realize that $\hat{d}_y m$ and $d_y m$ actually are identical, but only because we zoomed in infinitely far.) The net effect of the left journey was that you went from a height of m(x, y) to a height of m(x + dx, y + dy), and that total change is what we’ve called dm in equation N.20. So we can write

$$dm = d_x m + \hat{d}_y m \tag{N.21}$$

Figure N.4: Showing what $d_x m$ and $d_y m$ refer to geometrically.

Similarly, although we don’t need to, we could define the “right journey” as the process of walking in the other direction (first in the y direction, then in the x direction), and we’d get

$$dm = d_y m + \hat{d}_x m \tag{N.22}$$

where $\hat{d}_x m \equiv m(x + dx,\, y + dy) - m(x,\, y + dy)$. Either journey leads to the same conclusion, so we only really need one of them.

Okay, now for the fun part. Recall that we defined dm by making tiny changes to both slots at once, because we were wondering if we could come up with a “total derivative.” Notice that equations N.21 and N.22 are almost telling us something about the relationship between the “total differential” dm and the “partial differentials” $d_x m$ and $d_y m$. The trouble is, each equation only contains one of the familiar “un-hatted” partial differentials, and each has one of those pesky hatted differentials, which aren’t the same thing.
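Before zooming in, it may help to see the two journeys with steps that are deliberately not small. The sketch below uses a hypothetical machine `m` (not from the text) and exact fractions, so there is no rounding to worry about. With finite steps of 1/10, both journeys still add up to dm exactly, while the hatted and unhatted differences visibly disagree.

```python
from fractions import Fraction


def m(x, y):
    # Hypothetical machine, chosen only for illustration.
    return x * x * y + 3 * y


x, y = Fraction(1), Fraction(2)
dx, dy = Fraction(1, 10), Fraction(1, 10)  # finite steps, not infinitely small

# Total change: move both slots at once.
dm = m(x + dx, y + dy) - m(x, y)

# Left journey: x direction first, then y direction.
d_x_m = m(x + dx, y) - m(x, y)
d_hat_y_m = m(x + dx, y + dy) - m(x + dx, y)

# Right journey: y direction first, then x direction.
d_y_m = m(x, y + dy) - m(x, y)
d_hat_x_m = m(x + dx, y + dy) - m(x, y + dy)

left_total = d_x_m + d_hat_y_m
right_total = d_y_m + d_hat_x_m
print(left_total == dm, right_total == dm)  # True True

# The hatted and unhatted y differences are not equal at this step size.
hat_gap = d_hat_y_m - d_y_m
print(hat_gap)  # 21/1000
```

The two journeys agree exactly because the middle heights cancel when the legs are added, whatever the step size. The mismatch between hatted and unhatted quantities is the only thing standing in the way.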

Or are they? Since we’ve zoomed in infinitely far, the object we’re looking at is a flat plane, so for the reason depicted in Figure N.5, the quantity $\hat{d}_x m$ should be the same as $d_x m$, and $\hat{d}_y m$ should be the same as $d_y m$. So the “hatted” quantities are the same as the corresponding unhatted ones. Summarizing this:

$$dm = d_x m + d_y m \tag{N.23}$$

Figure N.5: Showing why $\hat{d}_x m$ is the same as $d_x m$ and why $\hat{d}_y m$ is the same as $d_y m$.

This equation is conveying an extremely simple fact: if we take either journey (left or right), then the total height change is the amount from the first leg plus the amount from the second leg. This idea could hardly be simpler. Read it again. Feel how simple it is. Almost pointless to even say. Now for the fun reveal. The simple sentence N.23 is saying the exact same thing as our scary sentence N.18 from earlier.
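We can’t literally zoom in infinitely far on a computer, but we can watch what happens as the steps shrink. The sketch below is only an illustration under stated assumptions: the machine `m` is hypothetical, the helper `relative_gap` is a made-up name, and both steps are set to the same finite size h. It measures how far dm is from the sum of the two un-hatted differences, relative to h.

```python
from fractions import Fraction


def m(x, y):
    # Hypothetical machine, chosen only for illustration.
    return x * x * y + 3 * y


def relative_gap(x, y, h):
    """(dm - (d_x m + d_y m)) / h, using equal finite steps dx = dy = h."""
    dm = m(x + h, y + h) - m(x, y)
    d_x_m = m(x + h, y) - m(x, y)
    d_y_m = m(x, y + h) - m(x, y)
    return (dm - (d_x_m + d_y_m)) / h


x, y = Fraction(1), Fraction(2)
steps = [Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)]
gaps = [relative_gap(x, y, h) for h in steps]

for h, gap in zip(steps, gaps):
    print(h, gap)
```

For this particular machine the ratios come out as 21/100, 201/10000, and 2001/1000000: each time the step shrinks by a factor of ten, the leftover shrinks by about the same factor even after dividing by the step. That is the finite-step shadow of the claim that the surface looks like a flat plane once we zoom in far enough.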

Further, all we have to do to get from equation N.23 to its scarier-looking twin, equation N.18, is to multiply by 1 twice and switch abbreviations. Let’s do that. Starting with equation N.23, we can do this:

$$dm = d_x m + d_y m = (d_x m)\,\frac{dx}{dx} + (d_y m)\,\frac{dy}{dy} = \frac{d_x m}{dx}\,dx + \frac{d_y m}{dy}\,dy$$

Now, if we simply switch to the (often misleading, but admittedly prettier) standard notation for partial derivatives, we obtain

$$dm = \frac{\partial m}{\partial x}\,dx + \frac{\partial m}{\partial y}\,dy$$

which is the scarier-looking equation N.18 we set out to derive originally. Recall that the notation above gives equation N.18 a sort of “one in two and two in one” property: ∂x and dx are the same quantity, despite the two symbols used to represent them, while the two ∂m’s are different quantities, expressed by a single symbol. Notation can be goofy.
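A worked example may make the formula feel less like a symbol game. As a sketch, take the machine $m(x, y) = x^2 y$, which is chosen purely for illustration and does not come from the text. Expanding the definition of dm directly and comparing with the formula:

```latex
\begin{align*}
dm &\equiv m(x + dx,\, y + dy) - m(x, y) \\
   &= (x + dx)^2 (y + dy) - x^2 y \\
   &= 2xy\,dx + x^2\,dy + \bigl(y\,dx^2 + 2x\,dx\,dy + dx^2\,dy\bigr)
\end{align*}
% Every term inside the parentheses contains a product of at least two tiny
% quantities, so it is infinitely smaller than the first two terms.
% What survives matches the formula, because for this machine
% \partial m / \partial x = 2xy and \partial m / \partial y = x^2:
\[
dm = \underbrace{2xy}_{\partial m / \partial x}\,dx
   + \underbrace{x^2}_{\partial m / \partial y}\,dy
\]
```

The first surviving term is the height gained on the x leg, and the second is the height gained on the y leg, just as in the two-leg journey.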

Next, we can make the logical leap discussed earlier, where we try to convince ourselves of something about n dimensions by visualizing three dimensions. The argument we just made involved a machine with two inputs and one output. So let’s now suppose we’ve got a machine with n inputs and one output, and imagine making a tiny change in all the slots at once. As before, we examine the resulting change in what m spits out:

$$dm \equiv m(x_1 + dx_1,\, x_2 + dx_2,\, \ldots,\, x_n + dx_n) - m(x_1, x_2, \ldots, x_n)$$

or if you prefer,

$$dm \equiv m(\mathbf{v} + d\mathbf{v}) - m(\mathbf{v})$$

where $\mathbf{v} = (x_1, x_2, \ldots, x_n)$ and $d\mathbf{v} = (dx_1, dx_2, \ldots, dx_n)$. Even though we can’t picture what we’re about to say, if we zoom in infinitely far on the graph of m at some arbitrary point $(x_1, x_2, \ldots, x_n)$, we should “see” an n-dimensional parallelogrammy-box-type thing, for the same reasons that we saw a two-dimensional parallelogram in the case above. So as before, it should be the case that

$$dm = d_1 m + d_2 m + \cdots + d_n m \tag{N.25}$$

where we’re writing $d_i m$ instead of $d_{x_i} m$ to avoid cluttered notation. All this says is that the total height increase between two points in a space we can’t picture is all the individual height increases added up. There’s a height increase $d_1 m$ from walking a tiny amount $dx_1$ in the $x_1$ direction; there’s a height increase $d_2 m$ from walking a tiny amount $dx_2$ in the $x_2$ direction, yadda yadda yadda. The total is just the sum of the parts. That’s what equation N.25 is saying, and even though we can’t picture what it’s saying, we can be confident that it’s true, because we understood the basic message from inventing equation N.23.
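Even though we can’t draw the n-dimensional picture, we can at least check the bookkeeping numerically. The following is a minimal sketch, not a proof: the four-input machine `m` is hypothetical, the helper names are made up for illustration, and the step size `h` is a small finite stand-in for an infinitely small one, so the two sides agree only up to a much smaller leftover.

```python
def total_change(machine, v, dv):
    """dm: change every slot at once and see how the output changes."""
    moved = [vi + dvi for vi, dvi in zip(v, dv)]
    return machine(moved) - machine(v)


def partial_change(machine, v, dv, i):
    """d_i m: change only slot i, starting from the same point v."""
    moved = list(v)
    moved[i] += dv[i]
    return machine(moved) - machine(v)


def m(v):
    # Hypothetical four-input machine whose slots interact with each other.
    return v[0] * v[1] + v[1] * v[2] ** 2 + v[2] * v[3]


v = [1.0, 2.0, 3.0, 4.0]
h = 1e-5  # small finite step standing in for an infinitely small one
dv = [h, 2 * h, 0.5 * h, 3 * h]

dm = total_change(m, v, dv)
sum_of_parts = sum(partial_change(m, v, dv, i) for i in range(len(v)))
mismatch = abs(dm - sum_of_parts)

print(dm, sum_of_parts, mismatch)
```

The mismatch is tiny compared with dm itself, and it comes from products of two or more small steps, the same kind of leftover that disappears when we zoom in infinitely far.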

Alright! Equation N.25 is the result we wanted to derive, so we’re basically done. However, if we define some new abbreviations, we can derive a formula that’s written in all multivariable calculus books. Following the exact same logic as before, we can multiply each term in the above equation by 1, swap the order of multiplication, and then switch to the standard notation, to get

$$dm = \frac{\partial m}{\partial x_1}\,dx_1 + \frac{\partial m}{\partial x_2}\,dx_2 + \cdots + \frac{\partial m}{\partial x_n}\,dx_n \tag{N.26}$$

Here’s how we could rewrite this if we wanted to pretend to be a textbook. Let’s define the “dot product” of two vectors v and w to be an operation that bangs the two vectors together to give a number, like this:

$$\mathbf{v} \cdot \mathbf{w} \equiv v_1 w_1 + v_2 w_2 + \cdots + v_n w_n$$

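That is, just multiply the vectors slot by slot and add up all the results. Having done this, if we define the abbreviations

$$d\mathbf{x} \equiv (dx_1, dx_2, \ldots, dx_n) \qquad \text{and} \qquad \nabla m \equiv \left(\frac{\partial m}{\partial x_1}, \frac{\partial m}{\partial x_2}, \ldots, \frac{\partial m}{\partial x_n}\right)$$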

then we can rewrite equation N.26 like this:

$$dm = (\nabla m) \cdot d\mathbf{x}$$

So this fancy-looking sentence with a vector full of partial derivatives and an infinitely small vector and a dot product is just telling us the familiar fact from before. The journey’s total change in height is simply: the height change from the first leg, plus the height change from the second, and so on, up to the nth.
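The textbook form can be tried out in a few lines of Python. This is a hedged sketch rather than anything official: the three-input machine `m` is hypothetical, its partial derivatives in `grad_m` were worked out by hand for that specific machine, and the tiny vector `dx` uses small finite numbers in place of infinitely small ones.

```python
def dot(v, w):
    """Multiply the vectors slot by slot and add up all the results."""
    return sum(vi * wi for vi, wi in zip(v, w))


def m(v):
    # Hypothetical three-input machine, chosen only for illustration.
    x, y, z = v
    return x * y + z ** 2


def grad_m(v):
    # The vector of partial derivatives of the machine above, found by hand.
    x, y, z = v
    return [y, x, 2 * z]


point = [1.0, 2.0, 3.0]
dx = [1e-6, 2e-6, -1e-6]  # small finite steps standing in for a tiny vector

# Right-hand side of the formula: (gradient of m) dotted with dx.
linear_estimate = dot(grad_m(point), dx)

# Left-hand side: the actual change in height when every slot moves at once.
moved = [p + d for p, d in zip(point, dx)]
actual_change = m(moved) - m(point)

print(linear_estimate, actual_change)
```

Each term in the dot product is one leg of the journey: a steepness in one direction multiplied by the distance walked in that direction. The two printed numbers differ only by a leftover far smaller than the steps themselves.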