Two-Dimensional Calculus (2011)

Chapter 3. Transformations

20. Change of coordinates

Having discussed pairs of functions first from the point of view of plane mappings, and then as vector fields, we come now to our third and final interpretation, that of coordinate transformations. We begin with a few general remarks on coordinate systems.

We note first that a function is usually defined by assigning a number to each point of some set. Thus, we may assign to each position on a mountain range its altitude above sea level, or to each point on a frying pan its temperature at a given time. Only later, after each point of the set has been assigned coordinates, can the function be expressed in an explicit form, such as $x^2$, or $x + \cos y$. If the function represents the curvature of a railroad track, then it can be represented as a function of one variable, since we can identify each point along the track by a single coordinate, such as the distance along the track from a fixed point. To represent temperature in the atmosphere would require a function of three variables, since each point is identified by three coordinates. The two-dimensionality of the plane, translated into our terms, means that each point can be represented by a pair of coordinates. This may be done in many different ways; in order to study a particular function we may choose coordinates that are most convenient for that purpose. Thus, when we were presented with a quadratic form $ax^2 + 2bxy + cy^2$, we found it useful to make a rotation of coordinates,

$$X = x\cos\alpha + y\sin\alpha, \qquad Y = -x\sin\alpha + y\cos\alpha, \tag{20.1}$$

so that in the new coordinates the same function would be represented as $AX^2 + CY^2$.

If (x, y) are rectangular coordinates, then the new coordinates (X, Y) given by Eqs. (20.1) are rectangular also, meaning that the “coordinate curves,” X = c, Y = d, are straight lines intersecting at right angles. It is often useful to consider more general systems of coordinates. All that is needed is a method of assigning a unique pair of numbers X, Y to each point, as a means of quantitatively distinguishing one point from another. If we had previously chosen some other coordinate system, which assigned a pair of numbers x, y to each point, then X and Y would each be a function of the two variables x, y; that is,

$$X = f(x, y), \qquad Y = g(x, y), \tag{20.2}$$

where f and g are specific functions. One sometimes uses the expression “curvilinear coordinates” for the general case in which the coordinate curves X = c, Y = d are not straight lines. The crucial property is that the curves X = c, Y = d should intersect at a unique point, so that this point may be identified in the new coordinate system by the pair of values (c, d) (Fig. 20.1).

Conversely, given any pair of functions f(x, y), g(x, y) such that Eqs. (20.2) set up a one-to-one correspondence between (x, y) and (X, Y), we may use these equations to define new coordinates.

Image

FIGURE 20.1 Curvilinear coordinates

Example 20.1

Let (x, y) be Cartesian coordinates, and let

$$X = (x^2 + y^2)^{1/2}, \qquad Y = \tan^{-1}(y/x). \tag{20.3}$$

Here the coordinate curves X = c, Y = d, are circles about the origin and rays from the origin, respectively (Fig. 20.2). The coordinates X, Y are called polar coordinates, and are more commonly designated as r, θ.

Image

FIGURE 20.2 Polar coordinates

There are two points that should be made in connection with general coordinate systems. The first is that a particular coordinate system may not be defined in the whole plane, but only in some domain D. For example, the functions f(x, y), g(x, y) used in Eq. (20.2) may be defined only in a part of the plane. Thus, in the case of polar coordinates, the function $\tan^{-1}(y/x)$ is not defined at the origin, and it is not defined in a single-valued way in the rest of the plane. To avoid all difficulties we must restrict polar coordinates to some domain D (such as the first quadrant, or the upper half-plane) in which it is possible to select a single-valued branch of $\tan^{-1}(y/x)$.
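The need for a restricted domain can be seen in a small numerical sketch. The helper names `to_polar_naive` and `from_polar` are hypothetical, and the sketch assumes the principal branch of the arctangent as provided by Python's `math.atan`; it is an illustration only, not part of the text's development.

```python
import math

def to_polar_naive(x, y):
    """Polar coordinates via r = sqrt(x^2 + y^2), theta = arctan(y/x).

    Uses the principal branch of arctan, so it is only a valid
    coordinate system where x > 0 (for instance, the first quadrant).
    """
    return math.hypot(x, y), math.atan(y / x)

def from_polar(r, theta):
    """Recover Cartesian coordinates from (r, theta)."""
    return r * math.cos(theta), r * math.sin(theta)

# Inside the first quadrant the round trip recovers the original point.
print(from_polar(*to_polar_naive(1.0, 2.0)))    # close to (1.0, 2.0)

# The point (-1, -2) receives the same pair (r, theta) as (1, 2), so the
# correspondence is no longer one-to-one outside the chosen domain.
print(from_polar(*to_polar_naive(-1.0, -2.0)))  # also close to (1.0, 2.0)
```

Two distinct points receiving the same pair of numbers is exactly the failure of the "unique point" property required of the coordinate curves.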

Secondly, in order to be able to apply calculus, the relation between coordinate systems should be sufficiently differentiable so that the derivatives of a function with respect to one coordinate system can be computed in terms of its derivatives with respect to the other.

Suppose then that u(x, y) is continuously differentiable, and let new coordinates X, Y be introduced by Eq. (20.2). Then u becomes a function of X, Y, and using the chain rule, we have

$$u_x = u_X f_x + u_Y g_x, \qquad u_y = u_X f_y + u_Y g_y. \tag{20.4}$$

These equations may be solved to express uX, uY in terms of ux, uy providing the determinant is not zero. But the determinant is fxgy − gxfy, the Jacobian of f and g with respect to x and y. Our basic requirements for change of coordinates—that there should be a one-to-one correspondence between the pairs x, y and X, Y, such that each pair is given by continuously differentiable functions of the other—are precisely the conditions stated in Th. 17.2, in the context of transformations, guaranteeing that the Jacobian cannot vanish at any point.
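As a rough numerical illustration of solving the chain-rule equations, the sketch below estimates the partial derivatives by central differences and solves the resulting 2 × 2 system by Cramer's rule, with determinant fxgy − gxfy. The function names, the step size, and the sample choice u = x² + y² with polar coordinates are assumptions made for the example.

```python
import math

def partials(F, x, y, h=1e-6):
    """Central-difference estimates of (F_x, F_y) at the point (x, y)."""
    Fx = (F(x + h, y) - F(x - h, y)) / (2 * h)
    Fy = (F(x, y + h) - F(x, y - h)) / (2 * h)
    return Fx, Fy

def new_coordinate_derivatives(u, f, g, x, y):
    """Solve u_x = u_X f_x + u_Y g_x, u_y = u_X f_y + u_Y g_y for (u_X, u_Y)."""
    ux, uy = partials(u, x, y)
    fx, fy = partials(f, x, y)
    gx, gy = partials(g, x, y)
    jac = fx * gy - gx * fy          # Jacobian of f, g with respect to x, y
    if abs(jac) < 1e-12:
        raise ValueError("Jacobian vanishes; the system cannot be solved")
    uX = (ux * gy - uy * gx) / jac   # Cramer's rule
    uY = (uy * fx - ux * fy) / jac
    return uX, uY, jac

# Polar coordinates in the first quadrant, with u = x^2 + y^2 = X^2.
f = lambda x, y: math.hypot(x, y)
g = lambda x, y: math.atan(y / x)
u = lambda x, y: x**2 + y**2

uX, uY, jac = new_coordinate_derivatives(u, f, g, 3.0, 4.0)
print(uX, uY, jac)   # approximately 10, 0, and 0.2
```

At the point (3, 4) we have X = 5, so the values are consistent with u = X², for which uX = 2X and uY = 0, and with the Jacobian 1/r of the polar change of coordinates.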

Example 20.2

A rotation of coordinates, given by (20.1), is of the form (20.2) with

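$$f(x, y) = x\cos\alpha + y\sin\alpha, \qquad g(x, y) = -x\sin\alpha + y\cos\alpha.$$

Substituting into (20.4), we have

$$u_x = u_X\cos\alpha - u_Y\sin\alpha, \qquad u_y = u_X\sin\alpha + u_Y\cos\alpha. \tag{20.5}$$

Solving for uX, uY, we obtain

$$u_X = u_x\cos\alpha + u_y\sin\alpha, \qquad u_Y = -u_x\sin\alpha + u_y\cos\alpha. \tag{20.6}$$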

Similarly, we can compute derivatives of arbitrary order. For example, applying the chain rule to the function uX, we find

$$(u_X)_x = u_{XX} f_x + u_{XY} g_x,$$

and similarly for uY. Thus, differentiating the first equation of (20.4) with respect to x, we have

$$u_{xx} = u_{XX} f_x^2 + 2u_{XY} f_x g_x + u_{YY} g_x^2 + u_X f_{xx} + u_Y g_{xx}. \tag{20.7}$$
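A numerical spot check may help in keeping track of the terms. The sketch below assumes that the second-derivative formula takes the standard chain-rule form u_xx = u_XX f_x² + 2u_XY f_x g_x + u_YY g_x² + u_X f_xx + u_Y g_xx, and compares it with a direct finite-difference estimate. The choices f = x² − y², g = 2xy (one-to-one in the right half-plane) and u = X²Y are hypothetical examples.

```python
def f(x, y):
    return x**2 - y**2

def g(x, y):
    return 2 * x * y

def h(X, Y):
    """The function u written in the new coordinates X, Y."""
    return X**2 * Y

def u(x, y):
    """The same function written in the original coordinates x, y."""
    return h(f(x, y), g(x, y))

def u_xx_direct(x, y, step=1e-4):
    """Second central difference of u in the x direction."""
    return (u(x + step, y) - 2 * u(x, y) + u(x - step, y)) / step**2

def u_xx_formula(x, y):
    """Chain-rule expression, using hand-computed derivatives of h, f, g."""
    X, Y = f(x, y), g(x, y)
    h_X, h_Y = 2 * X * Y, X**2
    h_XX, h_XY, h_YY = 2 * Y, 2 * X, 0.0
    f_x, g_x = 2 * x, 2 * y
    f_xx, g_xx = 2.0, 0.0
    return (h_XX * f_x**2 + 2 * h_XY * f_x * g_x + h_YY * g_x**2
            + h_X * f_xx + h_Y * g_xx)

print(u_xx_formula(1.0, 0.5), u_xx_direct(1.0, 0.5))  # the two should agree
```

Note that the last two terms, involving the second derivatives of f and g, are present here because the change of coordinates is not linear.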

In a similar fashion we can express all partial derivatives of u with respect to x, y in terms of its derivatives with respect to X, Y and the derivatives of the change-of-coordinate functions f, g. We note here the expression for the Laplacian,

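$$u_{xx} + u_{yy} = u_{XX}(f_x^2 + f_y^2) + 2u_{XY}(f_x g_x + f_y g_y) + u_{YY}(g_x^2 + g_y^2) + u_X(f_{xx} + f_{yy}) + u_Y(g_{xx} + g_{yy}). \tag{20.8}$$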

It is quite clear that if we are too capricious in our choice of new coordinates, then the expressions for most basic quantities become unmanageable. There are, however, many possible choices that greatly simplify these expressions (see, for example, Exs. 20.1, 20.2). If we substitute $f(x, y) = (x^2 + y^2)^{1/2}$, $g(x, y) = \tan^{-1}(y/x)$ into (20.8), then we find the expression for the Laplacian in polar coordinates

$$u_{xx} + u_{yy} = u_{XX} + \frac{1}{X^2}\,u_{YY} + \frac{1}{X}\,u_X,$$

or, in the more usual notation, with X = r, Y = θ,

$$\Delta u = u_{rr} + \frac{1}{r^2}\,u_{\theta\theta} + \frac{1}{r}\,u_r. \tag{20.9}$$

Since $(ru_r)_r = ru_{rr} + u_r = r(u_{rr} + u_r/r)$, we may rewrite Eq. (20.9) in the form

$$\Delta u = \frac{1}{r}\,(ru_r)_r + \frac{1}{r^2}\,u_{\theta\theta}. \tag{20.10}$$
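The polar form of the Laplacian can be compared numerically with the Cartesian one. The sketch below assumes the polar expression u_rr + u_θθ/r² + u_r/r and uses finite differences throughout. The sample function r³ cos θ and the use of `math.atan2` as a single-valued branch of the angle are choices made for this illustration.

```python
import math

def U(r, theta):
    """Sample function given in polar form (it is not harmonic)."""
    return r**3 * math.cos(theta)

def u(x, y):
    """The same function in Cartesian coordinates; atan2 supplies a
    single-valued angle away from the negative x-axis."""
    return U(math.hypot(x, y), math.atan2(y, x))

def laplacian_cartesian(x, y, h=1e-4):
    """Five-point estimate of u_xx + u_yy."""
    return (u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h)
            - 4 * u(x, y)) / h**2

def laplacian_polar(r, theta, h=1e-4):
    """Estimate of U_rr + U_theta_theta / r^2 + U_r / r."""
    U_r = (U(r + h, theta) - U(r - h, theta)) / (2 * h)
    U_rr = (U(r + h, theta) - 2 * U(r, theta) + U(r - h, theta)) / h**2
    U_tt = (U(r, theta + h) - 2 * U(r, theta) + U(r, theta - h)) / h**2
    return U_rr + U_tt / r**2 + U_r / r

x, y = 1.0, 1.0
r, theta = math.hypot(x, y), math.atan2(y, x)
print(laplacian_cartesian(x, y), laplacian_polar(r, theta))  # both near 8
```

For this sample function u = x³ + xy², so that u_xx + u_yy = 8x, which gives the value 8 at the point (1, 1).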

Let us note just one consequence of this formula.

Lemma 20.1 If u(x, y) is a harmonic function that depends only on the distance r to the origin, then

$$u = c\log r + d \tag{20.11}$$

for suitable constants c and d.

PROOF. If we express u in polar coordinates, then uθ = 0, and since u is harmonic, Δu = 0. From Eq. (20.10), we have

$$(ru_r)_r = 0,$$

or

$$ru_r = c,$$

from which Eq. (20.11) follows immediately. ∎
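The two steps of the proof can be observed numerically for a function of the form c log r + d. In the sketch below the constants `C`, `D`, the sample points, and the step sizes are arbitrary assumptions. The check is a finite-difference illustration, not a substitute for the argument above.

```python
import math

C, D = 2.0, -1.0   # hypothetical constants c and d

def u(x, y):
    """Radial function u = c log r + d, with r the distance to the origin."""
    return C * math.log(math.hypot(x, y)) + D

def laplacian(F, x, y, h=1e-4):
    """Five-point estimate of F_xx + F_yy."""
    return (F(x + h, y) + F(x - h, y) + F(x, y + h) + F(x, y - h)
            - 4 * F(x, y)) / h**2

def r_times_u_r(r, h=1e-6):
    """Estimate of r * u_r, taken along the positive x-axis."""
    return r * (u(r + h, 0.0) - u(r - h, 0.0)) / (2 * h)

print(laplacian(u, 0.8, -1.3))                 # near 0: u is harmonic
print(r_times_u_r(0.5), r_times_u_r(3.0))      # both near C
print(laplacian(lambda x, y: x * x + y * y, 0.8, -1.3))  # near 4, not 0
```

The last line shows a radial function, r², that is not of the stated form and is accordingly not harmonic.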

We consider next what happens to the components of a vector under a change of coordinates. We restrict ourselves to the case of a rotation of coordinates, given by Eqs. (20.1). We recall that the components with respect to a given coordinate system were originally defined in terms of a displacement vector as the difference in coordinates between the beginning and the endpoint. Thus, in the original coordinate system the components would be

Image

while the new components would be

Image

(Fig. 20.3). Substituting the coordinates of the beginning and endpoints in Eq. (20.1), and subtracting, we obtain

$$U = u\cos\alpha + \upsilon\sin\alpha, \qquad V = -u\sin\alpha + \upsilon\cos\alpha. \tag{20.12}$$

Using this fact about displacement vectors as a model, we use Eqs. (20.12) to define the new components ⟨U, V⟩ of an arbitrary vector, after the rotation of coordinates (20.1), if the vector had components ⟨u, υ⟩ with respect to the original coordinate system. Thus the relation (20.12) between the old and new components of a vector is precisely the same as that between the old and new coordinates of a point.

Image

FIGURE 20.3 Components of a vector in different coordinate systems

Let us consider, for example, a function f defined in the plane. It has one expression with respect to the coordinates x, y, and another with respect to X, Y. At each point we can consider the vector

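$$\langle f_x, f_y \rangle, \tag{20.13}$$

which is the gradient of f with respect to the coordinates x, y. By Eq. (20.12), the components of this vector with respect to the new coordinates X, Y are given by

$$U = f_x\cos\alpha + f_y\sin\alpha, \qquad V = -f_x\sin\alpha + f_y\cos\alpha. \tag{20.14}$$

But using Eqs. (20.6), we find that Eq. (20.14) reduces to

$$U = f_X, \qquad V = f_Y. \tag{20.15}$$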

This computation shows that the pairs of partial derivatives fx,fy and fX, fY represent the components of the same vector with respect to two different coordinate systems. If we carry out an analogous computation with other pairs, such as fx, −fy and fX, −fY, or fy, fx and fY, fX, we find that these do not represent the same vector. Of course, this is not fortuitous, but is a consequence of the fact that the gradient of a function may be described in an intrinsic fashion in terms of the direction and magnitude of the maximal directional derivative. Since this characterization is independent of coordinates, it follows that whatever expression we obtain for the gradient vector in one (Cartesian) coordinate system must be valid in any other.
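The statement about the gradient can be tested on a sample function. The sketch below assumes that the rotation has the form X = x cos α + y sin α, Y = −x sin α + y cos α, and that vector components transform by the same rule. The angle `ALPHA`, the sample function, and all helper names are hypothetical.

```python
import math

ALPHA = 0.7  # hypothetical rotation angle, chosen only for illustration

def to_new(x, y, a=ALPHA):
    """Assumed form of the rotation: (x, y) -> (X, Y)."""
    return x * math.cos(a) + y * math.sin(a), -x * math.sin(a) + y * math.cos(a)

def to_old(X, Y, a=ALPHA):
    """Inverse rotation: (X, Y) -> (x, y)."""
    return X * math.cos(a) - Y * math.sin(a), X * math.sin(a) + Y * math.cos(a)

def f_old(x, y):
    """Sample function written in the coordinates x, y."""
    return x**2 * y + math.sin(y)

def f_new(X, Y):
    """The same point function written in the coordinates X, Y."""
    return f_old(*to_old(X, Y))

def grad(F, p, q, h=1e-6):
    """Central-difference estimate of the two first partial derivatives."""
    return ((F(p + h, q) - F(p - h, q)) / (2 * h),
            (F(p, q + h) - F(p, q - h)) / (2 * h))

def rotate_components(u, v, a=ALPHA):
    """Transform vector components by the same rule as point coordinates."""
    return u * math.cos(a) + v * math.sin(a), -u * math.sin(a) + v * math.cos(a)

x, y = 1.2, -0.5
fx, fy = grad(f_old, x, y)
fX, fY = grad(f_new, *to_new(x, y))
print(rotate_components(fx, fy), (fX, fY))   # the two pairs should agree
print(rotate_components(fy, fx), (fY, fX))   # the swapped pairs should not
```

The second comparison corresponds to the pair fy, fx mentioned above, which does not define the same vector in the two coordinate systems.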

We examine next the case of an arbitrary differentiable vector field in some domain D. Again, we assume that we have two coordinate systems x, y and X, Y related by (20.1). Let the components of this vector field be ⟨u, υ⟩ and ⟨U, V⟩ with respect to the two systems, respectively. Each of the four quantities u, υ, U, V may be considered as functions of either x, y or X, Y. Differentiation of the first equation in (20.12) yields

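$$U_X = u_X\cos\alpha + \upsilon_X\sin\alpha.$$

Inserting the first of Eqs. (20.6) (and the analogous equation for the function υ), we obtain

$$U_X = u_x\cos^2\alpha + (u_y + \upsilon_x)\sin\alpha\cos\alpha + \upsilon_y\sin^2\alpha. \tag{20.16}$$

In a similar way, we may express the other partial derivatives UY, VX, and VY in terms of ux, uy, υx, υy. We note the result for VY:

$$V_Y = u_x\sin^2\alpha - (u_y + \upsilon_x)\sin\alpha\cos\alpha + \upsilon_y\cos^2\alpha. \tag{20.17}$$

Adding (20.16) and (20.17), we find

$$U_X + V_Y = u_x + \upsilon_y. \tag{20.18}$$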

Equation (20.18) states that the divergence of a vector field assigns to each point a number in an “invariant” manner; that is, independent of the choice of (Cartesian) coordinates. As in the case of the gradient, it is the particular combination of derivatives that has this property, and had we considered other expressions, such as uxυy, or uy + υx, the analog of (20.18) would not hold. Unlike the gradient, the divergence has not been described up to now in any intrinsic fashion. However, its invariance under coordinate change, proved above by a purely formal computation, leads us to suspect that there should be a coordinate-free description, from which Eq. (20.18) would follow immediately. We shall, in fact, be able to give just such a description of the divergence of a vector field in Sect. 27, but only after we have developed the basic properties of integration in several variables.
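The invariance of the divergence, and the failure of invariance for a combination such as uy + υx, can be observed on a sample vector field. As before, the sketch assumes the rotation X = x cos α + y sin α, Y = −x sin α + y cos α and the same rule for components. The field, the angle, and the helper names are hypothetical.

```python
import math

ALPHA = 0.7  # hypothetical rotation angle
COS, SIN = math.cos(ALPHA), math.sin(ALPHA)

def to_new(x, y):
    return x * COS + y * SIN, -x * SIN + y * COS

def to_old(X, Y):
    return X * COS - Y * SIN, X * SIN + Y * COS

# Sample vector field <u, v> in the original coordinates.
def u(x, y):
    return x**2 * y

def v(x, y):
    return x + y**3

# Components <U, V> of the same field, as functions of X, Y.
def U(X, Y):
    x, y = to_old(X, Y)
    return u(x, y) * COS + v(x, y) * SIN

def V(X, Y):
    x, y = to_old(X, Y)
    return -u(x, y) * SIN + v(x, y) * COS

def d1(F, p, q, h=1e-6):
    """Derivative of F with respect to its first argument."""
    return (F(p + h, q) - F(p - h, q)) / (2 * h)

def d2(F, p, q, h=1e-6):
    """Derivative of F with respect to its second argument."""
    return (F(p, q + h) - F(p, q - h)) / (2 * h)

x, y = 1.0, 2.0
X, Y = to_new(x, y)
div_old = d1(u, x, y) + d2(v, x, y)
div_new = d1(U, X, Y) + d2(V, X, Y)
mixed_old = d2(u, x, y) + d1(v, x, y)
mixed_new = d2(U, X, Y) + d1(V, X, Y)
print(div_old, div_new)      # these should agree
print(mixed_old, mixed_new)  # these should differ
```

For this field ux + υy = 2xy + 3y², which equals 16 at the point (1, 2), in both coordinate systems.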

Exercises

20.1 For each of the following change-of-coordinate functions X = f(x, y), Y = g(x, y), use Eq. (20.8) to express uxx + uyy in terms of the new coordinates X, Y.

a. Image

b. Image

c. Image

20.2 Show that if f(x, y), g(x, y) define a conformal mapping and if we introduce new coordinates X = f(x, y), Y = g(x, y), then

Image

so that a function u is harmonic with respect to x, y if and only if it is harmonic with respect to X, Y.

20.3 Show that if the coordinates (X, Y) are obtained from (x, y) by a rotation of axes, then uxx + uyy = uXX + uYY.

20.4 Carry out the derivation of Eq. (20.9) from Eq. (20.8).

20.5 Show that for any integer n the function u(x, y) that takes the form $r^n \cos n\theta$ in polar coordinates is a harmonic function.

20.6 a. Write down the form that Eqs. (20.4) take when X and Y are polar coordinates r, θ, and compute image

b. Solve the equations in part a for ur, uθ in terms of ux, uy.

c. Let u(x, y) and υ(x, y) satisfy the Cauchy-Riemann equations. Find corresponding equations, which relate the partial derivatives of u and υ with respect to r and θ.

d. Show that the functions u(x, y), υ(x, y) which are given in polar coordinates by $r^n \cos n\theta$, $r^n \sin n\theta$, satisfy the Cauchy-Riemann equations.

e. Let n be a positive integer. Study the transformation F: u(x, y), υ(x, y) which takes the form $R = r^n$, $\varphi = n\theta$, if polar coordinates (r, θ) and (R, φ) are introduced in the x, y plane and u, υ plane, respectively. Describe this transformation geometrically, and show that it is a diffeomorphism of the angular sector 0 < θ < π/n onto the upper half-plane. Using part d, show that it is a conformal mapping.

20.7 Let u(x, y) ∈ $\mathcal{C}^1$ in the upper half-plane and $xu_y \equiv yu_x$. Show that there exists a function g(t) of one variable such that $u(x, y) = g(x^2 + y^2)$. (Hint: see Exs. 7.26 and 7.27, and use Ex. 20.6a.)

20.8 Find the function u(x, y) defined for all (x, y) ≠ (0, 0), which satisfies the equation

Image

with u(x, 0) = 0 for all x ≠ 0. (Hint: see Ex. 20.7.)

20.9 Find the most general form of a harmonic function in the upper half-plane that is constant along each ray through the origin.

20.10 Show that if u(x, y) ∈ $\mathcal{C}^2$ in the whole plane, and $u_{xy} \equiv 0$, then there exist functions G(t), H(t) of one variable such that u(x, y) = G(x) + H(y). (Hint: use Ex. 7.26 to show that ux = g(x), uy = h(y). Set

Image

and use the Corollary to Th. 7.2.)

20.11 The equation

Image

where c is a nonzero constant, is called the one-dimensional wave equation.

a. Show that under the change of variables X = x − ct, Y = x + ct, we have

Image

b. Show that if u(x, t) ∈ $\mathcal{C}^2$ in the whole plane and u(x, t) satisfies the wave equation, then u(x, t) is of the form

Image

(Hint: use part a and Ex. 20.10.)

20.12 Derive the expressions for uxy and uyy analogous to Eq. (20.7) for uxx.

20.13 a. Show that under the change of coordinates $x = e^X$, $y = e^Y$, we have

Image

b. Use part a to write down some solutions of the equation

Image

and then verify directly that this equation is satisfied.

20.14 a. Let a, b, c be arbitrary constants. Show that under a rotation of coordinates defined by Eqs. (20.1), we have

Image

and find the explicit expressions for A, B, C in terms of a, b, c and α. (Hint: use the expressions in Ex. 20.12.)

b. Show that as a special case of part a,

Image

(In particular, the expression uxx − uyy is not invariant under a rotation of coordinates, as is uxx + uyy by Ex. 20.3.)

c. Let $q(x, y) = ax^2 + 2bxy + cy^2$. Show that under the rotation of coordinates defined by (20.1),

Image

where A, B, C are given by the same expressions as in part a.

d. Let $\lambda_1$, $\lambda_2$ be the maximum and minimum of the quadratic form q(x, y) of part c subject to the condition $x^2 + y^2 = 1$. Show that under a suitable rotation of coordinates,

Image

*20.15 Find the most general function u(x, y) that satisfies the equation 3uxx + 10uxy + 3uyy = 0, throughout the plane. (Hint: use Exs. 20.14d and 20.11b.)

20.16 a. Show that under the change of coordinates X = kx, Y = ly, we have

Image

thus, choosing k = imageand l = image, both coefficients can be reduced to ± 1.

b. Show that under a suitable linear change of coordinates in each case, every sum of second-order derivatives with constant coefficients can be reduced to one of the following three forms:

Image

(Hint: use the composition of two linear transformations, first applying Ex. 20.14d, and then Ex. 20.16a.)

c. Find the most general function u(x, y) that satisfies $u_{xx} \equiv 0$ throughout the plane. Deduce that all solutions of the equation auxx + 2buxy + cuyy = 0 in the whole plane can be written down explicitly in case $ac - b^2 < 0$ or $ac - b^2 = 0$, whereas for $ac - b^2 > 0$ they can be expressed in terms of solutions to Laplace’s equation Δu = 0.

*20.17 a. Let X = f(x, y), Y = g(x, y) define a change of coordinates. Let (x0, y0) be a point such that either

(1) ux(x0, y0) = 0 and uy(x0, y0) = 0, or

(2) the second derivatives of f and g are all zero.

Show that at the point (x0, y0), the expressions derived in Ex. 20.12, together with Eq. (20.7), can be summarized in a single matrix equation:

Image

(Note: matrix multiplication is discussed in the Remark following Ex. 15.21.)

b. Show that under the hypotheses of part a,

Image

c. Show that under the hypotheses of part a,

Image

where

Image

show that this equation contains as a special case the answer to Ex. 20.14a.

d. Let the functions f(x, y), g(x, y) of part a define a diffeomorphism F. For any numbers r, s, let

Image

Show that under the hypotheses of part a,

Image

where the derivatives on the left are evaluated at the point (x0, y0) and those on the right are evaluated at the corresponding point (X0, Y0) = (f(x0, y0), g(x0, y0)).

e. Show that under the hypotheses of part a, the quadratic form for the second directional derivatives image has the same nature whether computed with respect to x, y or X, Y; in other words,

Image

and

Image

are simultaneously positive definite or negative definite, or positive semidefinite, etc.

f. Let F be the diffeomorphism X = f(x, y), Y = g(x, y); given a surface z = h(X, Y), consider the corresponding surface $z = \tilde{h}(x, y)$, where $\tilde{h} = h \circ F$ (see Ex. 17.4). Give a geometric interpretation of part e in terms of these two surfaces. (Note that if F is linear, then hypothesis 2 of part a holds at every point, and therefore so do all subsequent parts of this exercise, for an arbitrary function u(x, y). On the other hand, for an arbitrary diffeomorphism F, these results are valid at any point where ∇u = 0.)

20.18 Verify Eq. (20.17) and derive similar expressions for UY and VX.

20.19 Using Eqs. (20.16) and (20.17) and the answer to Ex. 20.18, transform each of the following expressions into expressions involving ux, uy, υx, υy.

a. UX − VY

b. UX + VY

c. UY − VX

20.20 Show that if ⟨u, υ⟩ is a harmonic vector field, then the corresponding vector field ⟨U, V⟩ obtained by rotating coordinates is also harmonic.

*20.21 a. Show that Eqs. (20.16) and (20.17) and the answer to Ex. 20.18 can be written in the form of a single matrix equation.

Image

b. Show that $U_X V_Y - U_Y V_X = u_x \upsilon_y - u_y \upsilon_x$.

Remark. It may have become apparent in the course of these exercises that the standard notation that we have been using may lead to some real difficulties. These become even more serious later on, when dealing with functions of more than two variables. To cite one example, if w is given as a function of the three variables x, y, z, and if z is in turn a function of x and y, then what is meant by ∂w/∂x? It can have two interpretations, depending on whether we hold y and z fixed and let x vary, or we substitute z as a function of x and y and consider w just as a function of x and y. In the latter case, for fixed y, as x varies z also varies. Thus the notation ∂w/∂x should not be used in that situation, but be replaced by a more precise notation. Also in the present section, any possible confusion concerning the notation $u_x$ and $u_X$ could be avoided by writing u = h(X, Y) and $u = \tilde{h}(x, y)$, where $\tilde{h}(x, y) = h(f(x, y), g(x, y))$. Then $h_X$, $h_Y$ should be related to $\tilde{h}_x$, $\tilde{h}_y$.

It may be appropriate in this connection to touch on a basic problem in functional notation. Consider, for example, the question “what happens to the function f(x, y) under a rotation of coordinates?” This question, taken at face value, is meaningless. The function f(x, y) assigns to every pair of numbers a third number. If $f(x, y) = x^2 + y$, then $f(\pi, 1) = \pi^2 + 1$, $f(X, Y) = X^2 + Y$, $f(r, \theta) = r^2 + \theta$. What is implied in the above question is that we consider the pair of numbers x, y to define a point p in the plane; we use the function of two variables f(x, y) to define a function, say g, which assigns to each point p of the plane a number g(p) by the rule g(p) = f(x, y) if (x, y) are the coordinates of the point p in the given coordinate system. Then if the point p is represented by (X, Y) in terms of a new coordinate system, we may define a new function of two variables, say $\tilde{f}(X, Y)$, by setting $\tilde{f}(X, Y) = g(p)$. The question, then, is how this new function $\tilde{f}(X, Y)$ is related to the original function f(x, y).

The problem, in brief, is whether in speaking of a function f, we mean a point function (such as g(p) in the above discussion), which assigns a number to each point, and which takes on different forms in different coordinate systems, or a function of two variables, which assigns a fixed number to each pair of numbers according to a given rule, and which has nothing to do with the choice of coordinates. The fact is that both meanings are in common use, and in any given case the intended meaning must be deduced from the context. The important thing is to understand precisely what is meant in a given situation, and then the notation should not pose any problems. In some cases a more cumbersome notation is desirable in order to eliminate any possible confusion, but in others it may defeat one of the basic purposes of mathematical symbolism, which is to compress a large amount of information into compact form. It is well to remember that mathematical notation can be an invaluable aid to thought, but it cannot be a substitute for thought.
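The distinction between the two meanings can be made concrete in a few lines of code, where a function of two variables is literally a rule acting on a pair of numbers. The sketch below uses f(x, y) = x² + y from the discussion above. The quarter-turn rotation, the assumed form X = x cos α + y sin α, Y = −x sin α + y cos α, and the name `f_tilde` are choices made for this illustration.

```python
import math

ALPHA = math.pi / 2   # hypothetical rotation angle (a quarter turn)

def new_coords(x, y, a=ALPHA):
    """Coordinates (X, Y) of the point whose old coordinates are (x, y)."""
    return x * math.cos(a) + y * math.sin(a), -x * math.sin(a) + y * math.cos(a)

def old_coords(X, Y, a=ALPHA):
    """Coordinates (x, y) of the point whose new coordinates are (X, Y)."""
    return X * math.cos(a) - Y * math.sin(a), X * math.sin(a) + Y * math.cos(a)

def f(x, y):
    """A function of two variables: a fixed rule applied to a pair of numbers."""
    return x**2 + y

def f_tilde(X, Y):
    """The point function g(p), expressed in the new coordinates."""
    return f(*old_coords(X, Y))

x, y = 1.0, 3.0              # old coordinates of a point p
X, Y = new_coords(x, y)      # new coordinates of the same point p

print(f(x, y))         # value g(p) assigned to the point p
print(f_tilde(X, Y))   # the same value, computed from the new coordinates
print(f(X, Y))         # the rule f applied to the new pair: a different number
```

Here `f_tilde(X, Y)` reproduces the value at the point p, while `f(X, Y)` merely applies the same rule to a different pair of numbers.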

1 See, for example, Section 12.4 of [36]. A proof which is purely two-dimensional is given in Ex. 26.29 of this book. Other proofs may be found in Chapter III, Section 3 of [11], and in Section 8.4 of [19].

2 The term “conformal mapping” is sometimes used to include more general mappings whose Jacobian is different from zero and which preserve angles, even if the mappings are not one-to-one in the whole domain. An example is the mapping $u = x^2 - y^2$, $\upsilon = 2xy$, where the domain D is the whole plane minus the origin. Note that Th. 18.2 is valid for such mappings.

3 See also Fig. 16.5 and the related discussion of the behavior of this transformation near the point (1, 1).