What Is Mathematics? An Elementary Approach to Ideas and Methods, 2nd Edition (1996)

CHAPTER II. THE NUMBER SYSTEM OF MATHEMATICS

§5. COMPLEX NUMBERS

1. The Origin of Complex Numbers

For many reasons the concept of number has had to be extended even beyond the real number continuum by the introduction of the so-called complex numbers. One must realize that in the historical and psychological development of mathematics, all these extensions and new inventions were by no means the products of some one individual’s efforts. They appear rather as the outcome of a gradual and hesitant evolution for which no single person can receive major credit. It was the need for more freedom in formal calculations that brought about the use of negative and rational numbers. Only at the end of the middle ages did mathematicians begin to lose their feeling of uneasiness in using these concepts, which did not appear to have the same intuitive and concrete character as do the natural numbers. It was not until the middle of the nineteenth century that mathematicians fully realized that the essential logical and philosophical basis for operating in an extended number domain is formalistic; that extensions have to be created by definitions which, as such, are free, but which are useless if not made in such a way that the prevailing rules and properties of the original domain are preserved in the larger domain. That these extensions may sometimes be linked with “real” objects and in this way provide tools for new applications is of the highest importance, but this can provide only a motivation and not a logical proof of the validity of the extension.

The process which first requires the use of complex numbers is that of solving quadratic equations. We recall the concept of the linear equation, ax = b, where the unknown quantity x is to be determined. The solution is simply x = b/a, and the requirement that every linear equation with integral coefficients a ≠ 0 and b shall have a solution necessitated the introduction of the rational numbers. Equations such as

(1)  $x^2 = 2$,

which has no solution x in the field of rational numbers, led us to construct the wider field of real numbers in which a solution does exist. But even the field of real numbers is not wide enough to provide a complete theory of quadratic equations. A simple equation like

(2)  $x^2 = -1$

has no real solution, since the square of any real number is never negative.

We must either be content with the statement that this simple equation is not solvable, or follow the familiar path of extending our concept of number by introducing numbers that will make the equation solvable. This is exactly what is done when we introduce the new symbol i by defining $i^2 = -1$. Of course this object i, the “imaginary unit,” has nothing to do with the concept of a number as a means of counting. It is purely a symbol, subject to the fundamental rule $i^2 = -1$, and its value will depend entirely on whether by this introduction a really useful and workable extension of the number system can be effected.

Since we wish to add and multiply with the symbol i as with an ordinary real number, we should be able to form symbols like 2i, 3i, –i, 2 + 5i, or more generally, a + bi, where a and b are any two real numbers. If these symbols are to obey the familiar commutative, associative, and distributive laws of addition and multiplication, then, for example,

(2 + 3i) + (1 + 4i) = (2 + 1) + (3 + 4)i = 3 + 7i,

$(2 + 3i)(1 + 4i) = 2 + 8i + 3i + 12i^2 = (2 - 12) + (8 + 3)i = -10 + 11i$.

Guided by these considerations we begin our systematic exposition by making the following definition: A symbol of the form a + bi, where a and b are any two real numbers, shall be called a complex number with real part a and imaginary part b. The operations of addition and multiplication shall be performed with these symbols just as though i were an ordinary real number, except that $i^2$ shall always be replaced by $-1$. More precisely, we define addition and multiplication of complex numbers by the rules

(3)  $(a + bi) + (c + di) = (a + c) + (b + d)i$,

   $(a + bi)(c + di) = (ac - bd) + (ad + bc)i$.

In particular, we have

(4)  $(a + bi)(a - bi) = a^2 - abi + abi - b^2 i^2 = a^2 + b^2$.
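The rules (3) and (4) can be checked mechanically. The following Python sketch models a complex number a + bi as the pair (a, b) of real numbers; the function names `add`, `mul`, and `conj` are our own labels for this illustration and do not come from the text.

```python
# A complex number a + bi is modelled as the pair (a, b) of real numbers.

def add(z, w):
    """First rule of (3): add real parts and imaginary parts separately."""
    (a, b), (c, d) = z, w
    return (a + c, b + d)

def mul(z, w):
    """Second rule of (3): expand the product and replace i*i by -1."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def conj(z):
    """The number a - bi that is multiplied with a + bi in (4)."""
    a, b = z
    return (a, -b)

z, w = (2, 3), (1, 4)
print(add(z, w))        # (3, 7), i.e. 3 + 7i
print(mul(z, w))        # (-10, 11), i.e. -10 + 11i
print(mul(z, conj(z)))  # (13, 0), i.e. 2*2 + 3*3
```

The first two results agree with the numerical example worked out before the definition, and the third is an instance of (4).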

On the basis of these definitions it is easily verified that the commutative, associative, and distributive laws hold for complex numbers. Moreover, not only addition and multiplication, but also subtraction and division of two complex numbers lead again to numbers of the form a + bi, so that the complex numbers form a field (see p. 56):

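(5)  $(a + bi) - (c + di) = (a - c) + (b - d)i$,

   $\dfrac{a + bi}{c + di} = \dfrac{(a + bi)(c - di)}{(c + di)(c - di)} = \dfrac{ac + bd}{c^2 + d^2} + \dfrac{bc - ad}{c^2 + d^2}\,i$.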

(The second equation is meaningless when $c + di = 0 + 0i$, for then $c^2 + d^2 = 0$. So again we must exclude division by zero, i.e. by 0 + 0i.) For example,

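$(2 + 3i) - (1 + 4i) = 1 - i$,

$\dfrac{2 + 3i}{1 + 4i} = \dfrac{(2 + 3i)(1 - 4i)}{(1 + 4i)(1 - 4i)} = \dfrac{2 - 8i + 3i + 12}{1 + 16} = \dfrac{14}{17} - \dfrac{5}{17}\,i$.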
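The division rule in (5) lends itself to an exact check. The sketch below assumes integer real and imaginary parts so that Python's `Fraction` type can keep the result exact; the helper names `div` and `mul` are hypothetical and serve only this illustration.

```python
from fractions import Fraction

def div(z, w):
    """Quotient (a + bi)/(c + di): multiply numerator and denominator by c - di."""
    (a, b), (c, d) = z, w
    n = c * c + d * d                  # c^2 + d^2, zero only when w = 0 + 0i
    if n == 0:
        raise ZeroDivisionError("division by 0 + 0i is excluded")
    return (Fraction(a * c + b * d, n), Fraction(b * c - a * d, n))

def mul(z, w):
    """Product of two complex numbers given as pairs."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

q = div((2, 3), (1, 4))
print(q)                # (Fraction(14, 17), Fraction(-5, 17))
print(mul(q, (1, 4)))   # (Fraction(2, 1), Fraction(3, 1))
```

Multiplying the quotient by the divisor returns 2 + 3i, which is what division should mean in a field.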

The field of complex numbers includes the field of real numbers as a subfield, for the complex number a + 0i is regarded as the same as the real number a. On the other hand, a complex number of the form 0 + bi = bi is called a pure imaginary number.

Exercises: 1) Express image in the form a + bi.

2) Express

image

in the form a + bi.

3) Express in the form a + bi:

image

4) Calculate image. (Hint: Write image = x + yi, square, and equate real and imaginary parts.)

By the introduction of the symbol i we have extended the field of real numbers to a field of symbols a + bi in which the special quadratic equation

$x^2 = -1$

has the two solutions $x = i$ and $x = -i$. For by definition, $i \cdot i = (-i)(-i) = i^2 = -1$. In reality we have gained much more: we can easily verify that now every quadratic equation, which we may write in the form

(6)  $ax^2 + bx + c = 0$,

has a solution. For from (6) we have

(7)  $x = \dfrac{-b \pm \sqrt{b^2 - 4ac}}{2a}$.

Now if $b^2 - 4ac \ge 0$, then $\sqrt{b^2 - 4ac}$ is an ordinary real number, and the solutions (7) are real, while if $b^2 - 4ac < 0$, then $4ac - b^2 > 0$ and $\sqrt{b^2 - 4ac} = \sqrt{-(4ac - b^2)} = \sqrt{4ac - b^2} \cdot i$, so that the solutions (7) are complex numbers. For example, the solutions of the equation

$x^2 - 5x + 6 = 0$

are $x = (5 \pm \sqrt{25 - 24})/2 = (5 \pm 1)/2$, that is, $x = 3$ and $x = 2$, while the solutions of the equation

$x^2 - 2x + 2 = 0$

are $x = (2 \pm \sqrt{4 - 8})/2 = (2 \pm 2i)/2$, that is, $x = 1 + i$ and $x = 1 - i$.
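Formula (7) can be tried out directly. The sketch below assumes real coefficients with a ≠ 0 and relies on `cmath.sqrt` from Python's standard library, which returns a pure imaginary value when the quantity under the root is negative; the function name `solve_quadratic` is our own.

```python
import cmath

def solve_quadratic(a, b, c):
    """Both solutions of a*x^2 + b*x + c = 0 by formula (7); a must not be 0."""
    root = cmath.sqrt(b * b - 4 * a * c)   # real, or pure imaginary if b^2 - 4ac < 0
    return ((-b + root) / (2 * a), (-b - root) / (2 * a))

# x^2 - 5x + 6 = 0: two real solutions, 3 and 2 (returned as complex values).
print(solve_quadratic(1, -5, 6))

# x^2 - 2x + 2 = 0: the complex solutions 1 + i and 1 - i.
print(solve_quadratic(1, -2, 2))
```

Python writes the imaginary unit as `j`, so 1 + i appears as `(1+1j)`.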

2. The Geometrical Interpretation of Complex Numbers

As early as the sixteenth century mathematicians were compelled to introduce expressions for square roots of negative numbers in order to solve all quadratic and cubic equations. But they were at a loss to explain the exact meaning of these expressions, which they regarded with superstitious awe. The name “imaginary” is a reminder of the fact that these expressions were considered to be somehow fictitious and unreal. Finally, early in the nineteenth century, when the importance of these numbers in many branches of mathematics had become manifest, a simple geometric interpretation of the operations with complex numbers was provided which set to rest the lingering doubts about their validity. Of course, such an interpretation is unnecessary from the modern point of view in which the justification of formal calculations with complex numbers is given directly on the basis of the formal definitions of addition and multiplication. But the geometric interpretation, given at about the same time by Wessel (1745-1818), Argand (1768-1822) and Gauss, made these operations seem more natural from an intuitive standpoint, and has ever since been of the utmost importance in applications of complex numbers in mathematics and the physical sciences.

This geometrical interpretation consists simply in representing the complex number z = x + yi by the point in the plane with rectangular coördinates x, y. Thus the real part of z is its x-coördinate, and the imaginary part is its y-coördinate. A correspondence is thereby established between the complex numbers and the points in a “number plane,” just as a correspondence was established in §2 between the real numbers and the points on a line, the number axis. The points on the x-axis of the number plane correspond to the real numbers z = x + 0i, while the points on the y-axis correspond to the pure imaginary numbers z = 0 + yi.

If

z = x + yi

is any complex number, we call the complex number

$\bar{z} = x - yi$

the conjugate of z. The point $\bar{z}$ is represented in the number plane by the reflection of the point z in the x-axis as in a mirror. If we denote the distance of the point z from the origin by ρ, then by the Pythagorean theorem

$\rho^2 = x^2 + y^2 = (x + yi)(x - yi) = z \cdot \bar{z}$.

Fig. 22. Geometrical representation of complex numbers. The point z has the rectangular coördinates x, y.

The real number $\rho = \sqrt{x^2 + y^2}$ is called the modulus of z, and written

$\rho = |z|$.

If z lies on the real axis, its modulus is its ordinary absolute value. The complex numbers with modulus 1 lie on the “unit circle” with center at the origin and radius 1.

If | z | = 0 then z = 0. This follows from the definition of | z | as the distance of z from the origin. Moreover the modulus of the product of two complex numbers is equal to the product of their moduli:

$|z_1 \cdot z_2| = |z_1| \cdot |z_2|$.

This will follow from a more general theorem to be proved on page 95.
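Before the general proof, the statements about conjugate and modulus can be tested on particular numbers. The sketch below uses Python's built-in `complex` type, whose `conjugate()` method and `abs()` function correspond to the conjugate and the modulus; the sample values 3 + 4i and 1 − 2i are arbitrary choices of ours.

```python
import math

z1 = complex(3, 4)     # 3 + 4i
z2 = complex(1, -2)    # 1 - 2i

# z times its conjugate equals x^2 + y^2, a real number.
print(z1 * z1.conjugate())     # (25+0j)
print(abs(z1))                 # 5.0

# The modulus of a product agrees with the product of the moduli, up to rounding.
print(math.isclose(abs(z1 * z2), abs(z1) * abs(z2)))   # True
```

A numerical agreement of this kind is of course no proof; it only illustrates what the theorem asserts.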

Exercises: 1. Prove this theorem directly from the definition of multiplication of two complex numbers, $z_1 = x_1 + y_1 i$ and $z_2 = x_2 + y_2 i$.

2. From the fact that the product of two real numbers is 0 only if one of the factors is 0, prove the corresponding theorem for complex numbers. (Hint: Use the two theorems just stated.)

From the definition of addition of two complex numbers, $z_1 = x_1 + y_1 i$ and $z_2 = x_2 + y_2 i$, we have

$z_1 + z_2 = (x_1 + x_2) + (y_1 + y_2)i$.

Hence the point $z_1 + z_2$ is represented in the number plane by the fourth vertex of a parallelogram, three of whose vertices are the points O, $z_1$, $z_2$. This simple geometrical construction for the sum of two complex numbers is of great importance in many applications. From it we can deduce the important consequence that the modulus of the sum of two complex numbers does not exceed the sum of the moduli (compare p. 58):

$|z_1 + z_2| \le |z_1| + |z_2|$.

This follows from the fact that the length of any side of a triangle cannot exceed the sum of the lengths of the other two sides.

Fig. 23. Parallelogram law of addition of complex numbers.

Exercise: When does the equality $|z_1 + z_2| = |z_1| + |z_2|$ hold?

The angle between the positive direction of the x-axis and the line Oz is called the angle of z, and is denoted by φ (Fig. 22). The modulus of $\bar{z}$ is the same as the modulus of z,

$|\bar{z}| = |z|$,

but the angle of $\bar{z}$, which we denote by $\bar{\phi}$, is the negative of the angle of z,

$\bar{\phi} = -\phi$.

Of course, the angle of z is not uniquely determined, since any integral multiple of 360° can be added to or subtracted from an angle without affecting the position of its terminal side. Thus

$\phi,\ \phi + 360^\circ,\ \phi + 720^\circ,\ \phi + 1080^\circ, \cdots,$

$\phi - 360^\circ,\ \phi - 720^\circ,\ \phi - 1080^\circ, \cdots$

all represent graphically the same angle. By means of the modulus ρ and the angle φ, the complex number z can be written in the form

(8)  $z = x + yi = \rho(\cos\phi + i\sin\phi)$;

for, by the definition of sine and cosine (see p. 277),

$x = \rho\cos\phi, \quad y = \rho\sin\phi$.

E.g. for z = i, ρ = 1, φ = 90°, so that $i = 1(\cos 90^\circ + i\sin 90^\circ)$;

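for $z = 1 + i$, $\rho = \sqrt{2}$, φ = 45°, so that $1 + i = \sqrt{2}(\cos 45^\circ + i\sin 45^\circ)$;

for $z = 1 - i$, $\rho = \sqrt{2}$, φ = –45°, so that $1 - i = \sqrt{2}[\cos(-45^\circ) + i\sin(-45^\circ)]$;

for $z = -1 + \sqrt{3}\,i$, ρ = 2, φ = 120°, so that $-1 + \sqrt{3}\,i = 2(\cos 120^\circ + i\sin 120^\circ)$.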

The reader should confirm these statements by substituting the values of the trigonometrical functions.
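Such a confirmation can also be carried out numerically. The sketch below converts between the form x + yi and the trigonometrical form (8), using `cmath.polar` from Python's standard library. That function works in radians and returns an angle between −180° and 180°, so the helpers convert to and from degrees; the names `to_polar_degrees` and `from_polar_degrees` are our own.

```python
import cmath
import math

def to_polar_degrees(z):
    """Return (rho, phi) for z, with the angle phi measured in degrees."""
    rho, phi = cmath.polar(z)
    return rho, math.degrees(phi)

def from_polar_degrees(rho, phi_deg):
    """Rebuild z = rho(cos phi + i sin phi) from modulus and angle in degrees."""
    phi = math.radians(phi_deg)
    return complex(rho * math.cos(phi), rho * math.sin(phi))

rho, phi = to_polar_degrees(1j)
print(f"{rho:.3f} {phi:.3f}")        # 1.000 90.000

# Adding 360 degrees to the angle gives the same point, up to rounding.
z = from_polar_degrees(2, 45)
w = from_polar_degrees(2, 45 + 360)
print(cmath.isclose(z, w))           # True
```

The second check illustrates that the angle of z is determined only up to integral multiples of 360°.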

The trigonometrical representation (8) is of great value when two complex numbers are to be multiplied. If

  $z = \rho(\cos\phi + i\sin\phi)$,

and $z' = \rho'(\cos\phi' + i\sin\phi')$,

then $zz' = \rho\rho'\{(\cos\phi\cos\phi' - \sin\phi\sin\phi') + i(\cos\phi\sin\phi' + \sin\phi\cos\phi')\}$.

Now, by the fundamental addition theorems for the sine and cosine,

$\cos\phi\cos\phi' - \sin\phi\sin\phi' = \cos(\phi + \phi')$,

$\cos\phi\sin\phi' + \sin\phi\cos\phi' = \sin(\phi + \phi')$.

Hence

(9)  $zz' = \rho\rho'\{\cos(\phi + \phi') + i\sin(\phi + \phi')\}$.

This is the trigonometrical form of the complex number with modulus ρρ′ and angle φ + φ′. In other words, to multiply two complex numbers, we multiply their moduli and add their angles (Fig. 24). Thus we see that multiplication of complex numbers has something to do with rotation. To be more precise, let us call the directed line segment pointing from the origin to the point z the vector z; then ρ = | z | will be its length. Let z′ be a number on the unit circle, so that ρ′ = 1; then multiplying z by z′ simply rotates the vector z through the angle φ′. If ρ′ ≠ 1, the length of the vector has to be multiplied by ρ′ after the rotation. The reader may illustrate these facts by multiplying various numbers by $z_1 = i$ (rotating by 90°); $z_2 = -i$ (rotating by 90° in the opposite sense); $z_3 = 1 + i$; and $z_4 = 1 - i$.

Fig. 24. Multiplication of two complex numbers; the angles are added and the moduli multiplied.
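The rule "multiply the moduli, add the angles" can be observed on the four multipliers suggested above. In the following sketch the starting number z = √3 + i (modulus 2, angle 30°) is an arbitrary choice of ours, and `modulus_and_angle` is a hypothetical helper built on `cmath.polar`.

```python
import cmath
import math

def modulus_and_angle(z):
    """Modulus of z and its angle in degrees."""
    rho, phi = cmath.polar(z)
    return rho, math.degrees(phi)

z = complex(math.sqrt(3), 1)         # modulus 2, angle 30 degrees
for name, w in [("i", 1j), ("-i", -1j), ("1 + i", 1 + 1j), ("1 - i", 1 - 1j)]:
    rho, phi = modulus_and_angle(z * w)
    print(f"z * ({name}): modulus {rho:.4f}, angle {phi:.1f} degrees")
```

Multiplication by i and by −i leaves the modulus 2 unchanged and turns the angle to 120° and −60° respectively. Multiplication by 1 + i and by 1 − i stretches the modulus by √2 and turns the angle by 45° and −45°, to 75° and −15°.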

Formula (9) has a particularly important consequence when z = z′, for then we have

$z^2 = \rho^2(\cos 2\phi + i\sin 2\phi)$.

Multiplying this result again by z we obtain

$z^3 = \rho^3(\cos 3\phi + i\sin 3\phi)$,

and continuing indefinitely in this way,

(10)  $z^n = \rho^n(\cos n\phi + i\sin n\phi)$ for any integer n.

In particular, if z is a point on the unit circle, with ρ = 1, we obtain the formula discovered by the English mathematician A. De Moivre (1667-1754):

(11)  $(\cos\phi + i\sin\phi)^n = \cos n\phi + i\sin n\phi$.
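De Moivre's formula is easily tested numerically. The sketch below evaluates both sides of (11) for one angle and several exponents; the angle 25° and the exponents are arbitrary choices of ours, and the comparison allows for floating-point rounding.

```python
import cmath
import math

def de_moivre_sides(phi_deg, n):
    """Return the left and right sides of (11) for an angle in degrees and an integer n."""
    phi = math.radians(phi_deg)
    left = complex(math.cos(phi), math.sin(phi)) ** n
    right = complex(math.cos(n * phi), math.sin(n * phi))
    return left, right

for n in (2, 3, 7, -4):
    left, right = de_moivre_sides(25, n)
    print(n, cmath.isclose(left, right, rel_tol=1e-12))   # True for each n
```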

This formula is one of the most remarkable and useful relations in elementary mathematics. An example will illustrate this. We may apply the formula for n = 3 and expand the left hand side according to the binomial formula,

$(u + v)^3 = u^3 + 3u^2 v + 3uv^2 + v^3$,

obtaining the relation

$\cos 3\phi + i\sin 3\phi = \cos^3\phi - 3\cos\phi\sin^2\phi + i(3\cos^2\phi\sin\phi - \sin^3\phi)$.

A single equation such as this between two complex numbers amounts to a pair of equations between real numbers. For when two complex numbers are equal, both real and imaginary parts must be equal. Hence we may write

$\cos 3\phi = \cos^3\phi - 3\cos\phi\sin^2\phi, \quad \sin 3\phi = 3\cos^2\phi\sin\phi - \sin^3\phi$.

Using the relation

$\cos^2\phi + \sin^2\phi = 1$,

we have finally

$\cos 3\phi = \cos^3\phi - 3\cos\phi(1 - \cos^2\phi) = 4\cos^3\phi - 3\cos\phi$,

$\sin 3\phi = -4\sin^3\phi + 3\sin\phi$.

Similar formulas, expressing sin nφ and cos nφ in terms of powers of sin φ and cos φ respectively, can easily be obtained for any value of n.
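The same procedure, binomial expansion followed by separation of real and imaginary parts, can be written once for a general positive integer n. The sketch below is one possible arrangement; the function name `multiple_angle` is hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from math import comb, cos, sin, isclose, radians

def multiple_angle(n, phi):
    """cos(n*phi) and sin(n*phi) from the binomial expansion of (cos phi + i sin phi)^n."""
    c, s = cos(phi), sin(phi)
    real = imag = 0.0
    for k in range(n + 1):
        term = comb(n, k) * c ** (n - k) * s ** k   # this term carries the factor i^k
        # The powers of i cycle through 1, i, -1, -i.
        if k % 4 == 0:
            real += term
        elif k % 4 == 1:
            imag += term
        elif k % 4 == 2:
            real -= term
        else:
            imag -= term
    return real, imag

phi = radians(20)
c3, s3 = multiple_angle(3, phi)
print(isclose(c3, cos(3 * phi)), isclose(s3, sin(3 * phi)))   # True True
print(isclose(c3, 4 * cos(phi) ** 3 - 3 * cos(phi)))          # True
```

For n = 3 the four terms of the loop are exactly the four terms of the expansion written out above.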

Exercises: 1) Find the corresponding formulas for sin 4φ and cos 4φ.

2) Prove that for a point, $z = \cos\phi + i\sin\phi$, on the unit circle, $1/z = \cos\phi - i\sin\phi$.

3) Prove without calculation that $(a + bi)/(a - bi)$ always has the absolute value 1.

4) If $z_1$ and $z_2$ are two complex numbers prove that the angle of $z_1 - z_2$ is equal to the angle between the real axis and the vector pointing from $z_2$ to $z_1$.

5) Interpret the angle of the complex number $(z_1 - z_2)/(z_1 - z_3)$ in the triangle formed by the points $z_1$, $z_2$, and $z_3$.

6) Prove that the quotient of two complex numbers with the same angle is real.

7) Prove that if for four complex numbers $z_1, z_2, z_3, z_4$ the angles of $\dfrac{z_3 - z_1}{z_3 - z_2}$ and $\dfrac{z_4 - z_1}{z_4 - z_2}$ are the same, then the four numbers lie on a circle or on a straight line, and conversely.

8) Prove that four points $z_1, z_2, z_3, z_4$ lie on a circle or on a straight line if and only if

$\dfrac{z_3 - z_1}{z_3 - z_2} \Big/ \dfrac{z_4 - z_1}{z_4 - z_2}$

is real.

3. De Moivre’s Formula and the Roots of Unity

By an nth root of a number a we mean a number b such that $b^n = a$. In particular, the number 1 has two square roots, 1 and –1, since $1^2 = (-1)^2 = 1$. The number 1 has only one real cube root, 1, while it has four fourth roots: the real numbers 1 and –1, and the imaginary numbers i and –i. These facts suggest that there may be two more cube roots of 1 in the complex domain, making a total of three in all. That this is the case may be shown at once from De Moivre’s formula.

Fig. 25. The twelve twelfth roots of 1.

We shall see that in the field of complex numbers there are exactly n different nth roots of 1. They are represented by the vertices of the regular n-sided polygon inscribed in the unit circle and having the point z = 1 as one of its vertices. This is almost immediately clear from Figure 25 (drawn for the case n = 12). The first vertex of the polygon is 1. The next is

(12)  $\alpha = \cos\dfrac{360^\circ}{n} + i\sin\dfrac{360^\circ}{n}$,

since its angle must be the nth part of the total angle 360°. The next vertex is $\alpha \cdot \alpha = \alpha^2$, since we obtain it by rotating the vector α through the angle $360^\circ/n$. The next vertex is $\alpha^3$, etc., and finally, after n steps, we are back at the vertex 1, i.e., we have

$\alpha^n = 1$,

which also follows from formula (11), since

$\left[\cos\dfrac{360^\circ}{n} + i\sin\dfrac{360^\circ}{n}\right]^n = \cos 360^\circ + i\sin 360^\circ = 1 + 0i$.

It follows that $\alpha^1 = \alpha$ is a root of the equation $x^n = 1$. The same is true for the next vertex $\alpha^2 = \cos\dfrac{720^\circ}{n} + i\sin\dfrac{720^\circ}{n}$. We can see this by writing

$(\alpha^2)^n = \alpha^{2n} = (\alpha^n)^2 = (1)^2 = 1$,

or from De Moivre’s formula:

$(\alpha^2)^n = \cos\left(n \cdot \dfrac{720^\circ}{n}\right) + i\sin\left(n \cdot \dfrac{720^\circ}{n}\right) = \cos 720^\circ + i\sin 720^\circ = 1 + 0i = 1$.

In the same way we see that all the n numbers

$1, \alpha, \alpha^2, \alpha^3, \cdots, \alpha^{n-1}$

are nth roots of 1. To go farther in the sequence of exponents or to use negative exponents would yield no new roots. For $\alpha^{-1} = 1/\alpha = \alpha^n/\alpha = \alpha^{n-1}$ and $\alpha^n = 1$, $\alpha^{n+1} = (\alpha)^n \alpha = 1 \cdot \alpha = \alpha$, etc., so that the previous values would simply be repeated. It is left as an exercise to show that there are no other nth roots.
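The construction of the n roots as successive powers of α can be imitated numerically. In the sketch below the angle 360°/n is passed to `cmath.rect` in radians, as 2π/n; the function name `roots_of_unity` is our own, and the comparisons allow for floating-point rounding.

```python
import cmath
import math

def roots_of_unity(n):
    """The n numbers 1, alpha, alpha^2, ..., alpha^(n-1), with alpha of modulus 1 and angle 360/n degrees."""
    alpha = cmath.rect(1.0, 2 * math.pi / n)
    return [alpha ** k for k in range(n)]

roots = roots_of_unity(12)
print(all(cmath.isclose(r ** 12, 1) for r in roots))   # True
print(cmath.isclose(roots[6], -1))                     # True
```

The second line reflects the fact that for even n the point −1 is one of the vertices of the polygon.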

If n is even, then one of the vertices of the n-sided polygon will lie at the point –1, in accordance with the algebraic fact that in this case –1 is an nth root of 1.

The equation satisfied by the nth roots of 1

(13)  $x^n - 1 = 0$

is of the nth degree, but it can easily be reduced to an equation of the (n – 1)st degree. We use the algebraic formula

(14)  $x^n - 1 = (x - 1)(x^{n-1} + x^{n-2} + x^{n-3} + \cdots + 1)$.

Since the product of two numbers is 0 if and only if at least one of the two numbers is 0, the left hand side of (14) vanishes only if one of the two factors on the right hand side is zero, i.e. only if either x = 1, or the equation

(15)  $x^{n-1} + x^{n-2} + x^{n-3} + \cdots + x + 1 = 0$

is satisfied. This, then, is the equation which must be satisfied by the roots $\alpha, \alpha^2, \cdots, \alpha^{n-1}$; it is called the cyclotomic (circle-dividing) equation. For example, the complex cube roots of 1,

$\alpha = \tfrac{1}{2}(-1 + i\sqrt{3}), \quad \alpha^2 = \tfrac{1}{2}(-1 - i\sqrt{3})$,

are the roots of the equation

$x^2 + x + 1 = 0$,

as the reader will readily see by direct substitution. Likewise the fifth roots of 1, other than 1 itself, satisfy the equation

(16)  $x^4 + x^3 + x^2 + x + 1 = 0$.

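To construct a regular pentagon, we have to solve this fourth degree equation. By a simple algebraic device it can be reduced to a quadratic equation in the quantity w = x + 1/x. We divide (16) by $x^2$ and rearrange the terms:

$x^2 + \dfrac{1}{x^2} + x + \dfrac{1}{x} + 1 = 0$,

or, since $(x + 1/x)^2 = x^2 + 1/x^2 + 2$, we obtain the equation

$w^2 + w - 1 = 0$.

By formula (7) of Article 1 this equation has the roots

$w_1 = \dfrac{-1 + \sqrt{5}}{2}, \quad w_2 = \dfrac{-1 - \sqrt{5}}{2}$.

Hence the complex fifth roots of 1 are the roots of the two quadratic equations

$x + \dfrac{1}{x} = w_1$, or $x^2 - \tfrac{1}{2}(\sqrt{5} - 1)x + 1 = 0$,

and

$x + \dfrac{1}{x} = w_2$, or $x^2 + \tfrac{1}{2}(\sqrt{5} + 1)x + 1 = 0$,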

which the reader may solve by the formula already used.
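The reduction can be followed through numerically. The sketch below takes the two roots w = (−1 ± √5)/2 of w² + w − 1 = 0, solves x² − wx + 1 = 0 (which is x + 1/x = w cleared of fractions) for each of them, and tests the four resulting numbers; the function name `fifth_roots_via_w` is hypothetical.

```python
import cmath
import math

def fifth_roots_via_w():
    """The four complex fifth roots of 1, obtained from w^2 + w - 1 = 0 and x^2 - w*x + 1 = 0."""
    roots = []
    for w in ((-1 + math.sqrt(5)) / 2, (-1 - math.sqrt(5)) / 2):
        disc = cmath.sqrt(w * w - 4)      # w*w - 4 is negative, so disc is pure imaginary
        roots += [(w + disc) / 2, (w - disc) / 2]
    return roots

for x in fifth_roots_via_w():
    # Each x is a fifth root of 1 and satisfies equation (16), up to rounding.
    print(cmath.isclose(x ** 5, 1), abs(x ** 4 + x ** 3 + x ** 2 + x + 1) < 1e-12)
```

Only square roots of real numbers enter the computation, which is the algebraic reason why the regular pentagon can be constructed.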

Exercise: 1) Find the 6th roots of 1. 2) Find $(1 + i)^{11}$.

3) Find all the different values of image.

4) Calculate image.

*4. The Fundamental Theorem of Algebra

Not only is every equation of the form $ax^2 + bx + c = 0$ or of the form $x^n - 1 = 0$ solvable in the field of complex numbers, but far more is true: Every algebraic equation of any degree n with real or complex coefficients,

(17)  $f(x) = x^n + a_{n-1}x^{n-1} + a_{n-2}x^{n-2} + \cdots + a_1 x + a_0 = 0$,

has solutions in the field of complex numbers. For equations of the 3rd and 4th degrees this was established in the sixteenth century by Tartaglia, Cardan, and others, who solved such equations by formulas essentially similar to that for the quadratic equation, although much more complicated. For almost two hundred years the general equations of 5th and higher degree were intensively studied, but all efforts to solve them by similar methods failed. It was a great achievement when the young Gauss in his doctoral thesis (1799) succeeded in giving the first complete proof that solutions exist, although the question of generalizing the classical formulas, which express the solutions of equations of degree less than 5 in terms of the rational operations plus root extraction, remained unanswered at the time. (See p. 118.)

Gauss’s theorem states that for any algebraic equation of the form (17), where n is a positive integer and the a’s are any real or even complex numbers, there exists at least one complex number α = c + di such that

f(α) = 0.

The number α is called a root of the equation (17). A proof of this theorem will be given on page 269. Assuming its truth for the moment, we can prove what is known as the fundamental theorem of algebra (it should more fittingly be called the fundamental theorem of the complex number system): Every polynomial of degree n,

(18)  $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$,

can be factored into the product of exactly n factors,

(19)  $f(x) = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n)$,

where $\alpha_1, \alpha_2, \alpha_3, \cdots, \alpha_n$ are complex numbers, the roots of the equation f(x) = 0. As an example illustrating this theorem, the polynomial

$f(x) = x^4 - 1$

may be factored into the form

$f(x) = (x - 1)(x - i)(x + i)(x + 1)$.

That the α’s are roots of the equation f(x) = 0 is evident from the factorization (19), since for $x = \alpha_r$, one factor of f(x), and hence f(x) itself, is equal to zero.

In some cases the factors $(x - \alpha_1), (x - \alpha_2), \cdots$ of a polynomial f(x) of degree n will not all be distinct, as in the example

$f(x) = x^2 - 2x + 1 = (x - 1)(x - 1)$,

which has but one root, x = 1, “counted twice” or “of multiplicity 2.” In any case, a polynomial of degree n can have no more than n distinct factors $(x - \alpha)$ and the corresponding equation no more than n roots.
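Both examples can be verified by multiplying the factors out again. The sketch below stores a polynomial as a list of coefficients from the highest power down, a convention of ours, and multiplies by one factor (x − r) at a time; the function name `expand` is hypothetical.

```python
def expand(roots):
    """Coefficients, highest power first, of (x - r1)(x - r2)...(x - rn)."""
    coeffs = [1]
    for r in roots:
        # Multiply the current polynomial by (x - r).
        shifted = coeffs + [0]                       # the polynomial times x
        scaled = [0] + [-r * c for c in coeffs]      # the polynomial times -r
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    return coeffs

# (x - 1)(x - i)(x + i)(x + 1): coefficients equal to 1, 0, 0, 0, -1, i.e. x^4 - 1.
print(expand([1, 1j, -1j, -1]))

# (x - 1)(x - 1): the double root x = 1.
print(expand([1, 1]))             # [1, -2, 1], i.e. x^2 - 2x + 1
```

In the first case Python prints the coefficients as complex values with zero imaginary part, since the roots i and −i enter the computation.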

To prove the factorization theorem we again make use of the algebraic identity

(20)  $x^k - \alpha^k = (x - \alpha)(x^{k-1} + \alpha x^{k-2} + \alpha^2 x^{k-3} + \cdots + \alpha^{k-2}x + \alpha^{k-1})$,

which for α = 1 is merely the formula for the geometrical series. Since we are assuming the truth of Gauss’s theorem, we may suppose that $\alpha = \alpha_1$ is a root of equation (17), so that

$f(\alpha_1) = \alpha_1^n + a_{n-1}\alpha_1^{n-1} + a_{n-2}\alpha_1^{n-2} + \cdots + a_1\alpha_1 + a_0 = 0$.

Subtracting this from f(x) and rearranging the terms, we obtain the identity

(21)  $f(x) = f(x) - f(\alpha_1) = (x^n - \alpha_1^n) + a_{n-1}(x^{n-1} - \alpha_1^{n-1}) + \cdots + a_1(x - \alpha_1)$.

Now, because of (20), we may factor out $(x - \alpha_1)$ from every term of (21), so that the degree of the other factor of each term is reduced by 1. Hence, on rearranging the terms again, we find that

$f(x) = (x - \alpha_1)g(x)$,

where g(x) is a polynomial of degree n – 1:

$g(x) = x^{n-1} + b_{n-2}x^{n-2} + \cdots + b_1 x + b_0$.

(For our purposes it is quite unnecessary to calculate the coefficients $b_k$.) Now we may apply the same procedure to g(x). By Gauss’s theorem there exists a root $\alpha_2$ of the equation g(x) = 0, so that

$g(x) = (x - \alpha_2)h(x)$,

where h(x) is a polynomial of degree n – 2. Proceeding a total of (n – 1) times in the same way (of course, this phrase is merely a substitute for an argument by mathematical induction) we finally obtain the complete factorization

(22)  $f(x) = (x - \alpha_1)(x - \alpha_2)(x - \alpha_3) \cdots (x - \alpha_n)$.

From (22) it follows not only that the complex numbers $\alpha_1, \alpha_2, \cdots, \alpha_n$ are roots of the equation (17), but also that they are the only roots. For if y were a root of equation (17), then by (22)

$f(y) = (y - \alpha_1)(y - \alpha_2) \cdots (y - \alpha_n) = 0$.

We have seen on page 94 that a product of complex numbers is equal to 0 if and only if one of the factors is equal to 0. Hence one of the factors $(y - \alpha_r)$ must be 0, and y must be equal to $\alpha_r$, as was to be shown.
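Although the proof does not need the coefficients of g(x), the passage from f(x) to g(x) can be carried out explicitly once a root is known. The sketch below does not apply identity (20) term by term; it divides by (x − α) with the standard synthetic-division (Horner) scheme, which yields the same quotient g(x) together with the remainder f(α). The name `deflate` and the convention of listing coefficients from the highest power down are assumptions of this illustration, and the roots are supplied by hand rather than found by the program.

```python
def deflate(coeffs, alpha):
    """Divide f(x) by (x - alpha); coeffs lists f's coefficients, highest power first.

    Returns (g, remainder). The remainder equals f(alpha), so it is 0
    exactly when alpha is a root of f.
    """
    values = []
    acc = 0
    for c in coeffs:
        acc = acc * alpha + c      # Horner's scheme
        values.append(acc)
    return values[:-1], values[-1]

f = [1, 0, 0, 0, -1]               # x^4 - 1
for root in (1, 1j, -1j, -1):
    f, remainder = deflate(f, root)
    print(f, remainder)            # quotient coefficients and a zero remainder at each step
```

After four steps the quotient is the constant 1, in agreement with the complete factorization of x⁴ − 1 into four linear factors.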