RULER AND COMPASS - A Book of Abstract Algebra

A Book of Abstract Algebra, Second Edition (1982)

Chapter 30. RULER AND COMPASS

The ancient Greek geometers considered the circle and straight line to be the most basic of all geometric figures, other figures being merely variants and combinations of these basic ones. To understand this view we must remember that construction played a very important role in Greek geometry: when a figure was defined, a method was also given for constructing it. Certainly the circle and the straight line are the easiest figures to construct, for they require only the most rudimentary of all geometric instruments: the ruler and the compass. Furthermore, the ruler, in this case, is a simple, unmarked straightedge.

Rudimentary as these instruments may be, they can be used to carry out a surprising variety of geometric constructions. Lines can be divided into any number of equal segments, and any angle can be bisected. From any polygon it is possible to construct a square having the same area, or twice or three times the area. With amazing ingenuity, Greek geometers devised ways to cleverly use the ruler and compass, unaided by any other instrument, to perform all kinds of intricate and beautiful constructions. They were so successful that it was hard to believe they were unable to perform three little tasks which, at first sight, appear to be very simple: doubling the cube, trisecting any angle, and squaring the circle. The first task demands that a cube be constructed having twice the volume of a given cube. The second asks that any angle be divided into three equal parts. The third requires the construction of a square whose area is equal to that of a given circle. Remember, only a ruler and compass are to be used!

Mathematicians, in Greek antiquity and throughout the Renaissance, devoted a great deal of attention to these problems, and came up with many brilliant ideas. But they never found ways of performing the above three constructions. This is not surprising, for these constructions are impossible! Of course, the Greeks had no way of knowing that fact, for the mathematical machinery needed to prove that these constructions are impossible—in fact, the very notion that one could prove a construction to be impossible—was still two millennia away.

The final resolution of these problems, by proving that the required constructions are impossible, came from a most unlikely source: it was a by-product of the arcane study of field extensions, in the upper reaches of modern algebra.

To understand how all this works, we will see how the process of ruler-and-compass constructions can be placed in the framework of field theory. Clearly, we will be making use of analytic geometry.

If $\mathcal{A}$ is any set of points in the plane, consider operations of the following two kinds:

1. Ruler operation: Through any two points in $\mathcal{A}$, draw a straight line.

2. Compass operation: Given three points A, B, and C in $\mathcal{A}$, draw a circle with center C and radius equal in length to the segment AB.

The points of intersection of any two of these figures (line-line, line-circle, or circle-circle) are said to be constructible in one step from $\mathcal{A}$. A point P is called constructible from $\mathcal{A}$ if there are points $P_1, P_2, \ldots, P_n = P$ such that $P_1$ is constructible in one step from $\mathcal{A}$, $P_2$ is constructible in one step from $\mathcal{A} \cup \{P_1\}$, and so on, so that $P_i$ is constructible in one step from $\mathcal{A} \cup \{P_1, \ldots, P_{i-1}\}$.

As a simple example, let us see that the midpoint of a line segment AB is constructible from the two points A and B in the above sense. Given A and B, first draw the line AB. Then draw the circle with center A and radius $\overline{AB}$ and the circle with center B and radius $\overline{AB}$; let C and D be the points of intersection of these circles. C and D are constructible in one step from {A, B}. Finally, draw the line through C and D; the intersection of this line with AB is the required midpoint. It is constructible from {A, B}.

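[Figure: the circles centered at A and B with radius $\overline{AB}$ meet at C and D; the line CD crosses AB at its midpoint.]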

As this example shows, the notion of constructible points is the correct formalization of the intuitive idea of ruler-and-compass constructions.
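The midpoint construction can be traced numerically. The following is only a sketch in floating-point arithmetic, not part of the text's formal development; the helper names `circle_circle`, `line_line`, and `midpoint_by_construction` are hypothetical, and the code assumes the two circles really do meet in two points and that the two lines are not parallel.

```python
import math

def circle_circle(c1, r1, c2, r2):
    """Two intersection points of circles (assumes they meet in two points)."""
    (x1, y1), (x2, y2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)          # distance between the centers
    a = (r1**2 - r2**2 + d**2) / (2 * d)      # distance from c1 to the common chord
    h = math.sqrt(r1**2 - a**2)               # half the length of the common chord
    mx, my = x1 + a * (x2 - x1) / d, y1 + a * (y2 - y1) / d
    ux, uy = (y2 - y1) / d, -(x2 - x1) / d    # unit vector along the chord
    return (mx + h * ux, my + h * uy), (mx - h * ux, my - h * uy)

def line_line(p, q, r, s):
    """Intersection of line pq with line rs (assumes they are not parallel)."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p, q, r, s
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    px = ((x1*y2 - y1*x2) * (x3 - x4) - (x1 - x2) * (x3*y4 - y3*x4)) / den
    py = ((x1*y2 - y1*x2) * (y3 - y4) - (y1 - y2) * (x3*y4 - y3*x4)) / den
    return px, py

def midpoint_by_construction(A, B):
    """Follow the text: two compass operations, then one ruler operation."""
    r = math.dist(A, B)                       # radius equal to the segment AB
    C, D = circle_circle(A, r, B, r)          # constructible in one step from {A, B}
    return line_line(A, B, C, D)              # line CD meets line AB at the midpoint

print(midpoint_by_construction((0.0, 0.0), (1.0, 0.0)))  # approximately (0.5, 0.0)
```

The intermediate points C and D have an irrational coordinate here (their y coordinates are $\pm\sqrt{3}/2$), even though the final midpoint is rational; this is the kind of behavior the field-theoretic bookkeeping in the rest of the chapter keeps track of.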

We call a point in the plane constructible if it is constructible from $\mathbb{Q} \times \mathbb{Q}$, that is, from the set of all points in the plane with rational coordinates.

How does field theory fit into this scheme? Obviously by associating with every point its coordinates. More exactly, with every constructible point P we associate a certain field extension of $\mathbb{Q}$, obtained as follows:

Suppose P has coordinates (a, b) and is constructed from $\mathbb{Q} \times \mathbb{Q}$ in one step. We associate with P the field $\mathbb{Q}(a, b)$, obtained by adjoining to $\mathbb{Q}$ the coordinates of P. More generally, suppose P is constructible from $\mathbb{Q} \times \mathbb{Q}$ in n steps: there are then n points $P_1, P_2, \ldots, P_n = P$ such that each $P_i$ is constructible in one step from $(\mathbb{Q} \times \mathbb{Q}) \cup \{P_1, \ldots, P_{i-1}\}$. Let the coordinates of $P_1, \ldots, P_n$ be $(a_1, b_1), \ldots, (a_n, b_n)$, respectively. With the points $P_1, \ldots, P_n$ we associate fields $K_1, \ldots, K_n$ where $K_1 = \mathbb{Q}(a_1, b_1)$, and for each i > 1,

$$K_i = K_{i-1}(a_i, b_i)$$

Thus, $K_1 = \mathbb{Q}(a_1, b_1)$, $K_2 = K_1(a_2, b_2)$, and so on: beginning with $\mathbb{Q}$, we adjoin first the coordinates of $P_1$, then the coordinates of $P_2$, and so on successively, yielding the sequence of extensions

$$\mathbb{Q} \subseteq K_1 \subseteq K_2 \subseteq \cdots \subseteq K_n = K$$

We call K the field extension associated with the point P.

Everything we will have to say in the sequel follows easily from the next lemma.

Lemma If $K_1, \ldots, K_n$ are as defined previously, then $[K_i : K_{i-1}] = 1$, 2, or 4.

PROOF: Remember that $K_{i-1}$ already contains the coordinates of $P_1, \ldots, P_{i-1}$, and $K_i$ is obtained by adjoining to $K_{i-1}$ the coordinates $x_i, y_i$ of $P_i$. But $P_i$ is constructible in one step from $(\mathbb{Q} \times \mathbb{Q}) \cup \{P_1, \ldots, P_{i-1}\}$, so we must consider three cases, corresponding to the three kinds of intersection which may produce $P_i$, namely: line intersects line, line intersects circle, and circle intersects circle.

Line intersects line: Suppose one line passes through the points $(a_1, a_2)$ and $(b_1, b_2)$, and the other line passes through $(c_1, c_2)$ and $(d_1, d_2)$. We may write equations for these lines in terms of the constants $a_1, a_2, b_1, b_2, c_1, c_2$, and $d_1, d_2$ (all of which are in $K_{i-1}$), and then solve these equations simultaneously to give the coordinates x, y of the point of intersection. Clearly, these values of x and y are expressed in terms of $a_1, a_2, b_1, b_2, c_1, c_2, d_1, d_2$ using only field operations, hence are still in $K_{i-1}$. Thus, $K_i = K_{i-1}$.
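The line-line case can be illustrated with exact rational arithmetic. This is a minimal sketch, taking the base field to be $\mathbb{Q}$ (modeled by Python's `Fraction`); the function names `line_through` and `intersect` are hypothetical, and the lines are assumed not to be parallel. Since only addition, subtraction, multiplication, and division occur, the result never leaves the field.

```python
from fractions import Fraction

def line_through(p, q):
    """Coefficients (A, B, C) of the line A*x + B*y = C through p and q."""
    (x1, y1), (x2, y2) = p, q
    A, B = y2 - y1, x1 - x2
    return A, B, A * x1 + B * y1

def intersect(l1, l2):
    """Solve the 2x2 linear system by Cramer's rule (assumes non-parallel lines)."""
    (A1, B1, C1), (A2, B2, C2) = l1, l2
    det = A1 * B2 - A2 * B1
    return (C1 * B2 - C2 * B1) / det, (A1 * C2 - A2 * C1) / det

F = Fraction
first = line_through((F(0), F(0)), (F(3), F(1)))    # the line x - 3y = 0
second = line_through((F(0), F(1)), (F(1), F(0)))   # the line x + y = 1
print(intersect(first, second))                     # both coordinates are Fractions
```

For these two lines the intersection is (3/4, 1/4), a point whose coordinates are again rational, so no extension of the field is needed.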

Line intersects circle: Consider the line AB and the circle with center C and radius equal to the distance $\overline{DE}$. Let A, B, C have coordinates $(a_1, a_2)$, $(b_1, b_2)$, and $(c_1, c_2)$, respectively. By hypothesis, $K_{i-1}$ contains the numbers $a_1, a_2, b_1, b_2, c_1, c_2$, as well as $k^2$ = the square of the distance $\overline{DE}$. (To understand the last assertion, remember that $K_{i-1}$ contains the coordinates of D and E; see the figure and use the Pythagorean theorem.)

[Figure: the line AB meets the circle with center C and radius $\overline{DE}$ at the points S and T.]

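Now, the line AB has equation (assuming $a_1 \neq b_1$; for a vertical line, exchange the roles of x and y)

$$\frac{y - b_2}{x - b_1} = \frac{b_2 - a_2}{b_1 - a_1} \tag{1}$$

and the circle has equation

$$(x - c_1)^2 + (y - c_2)^2 = k^2 \tag{2}$$

Solving for y in (1) and substituting into (2) gives

$$(x - c_1)^2 + \left( \frac{b_2 - a_2}{b_1 - a_1}(x - b_1) + b_2 - c_2 \right)^2 = k^2$$

This is a quadratic equation in x, and its roots are the x coordinates of S and T. Thus, the x coordinates of both points of intersection are roots of a quadratic polynomial with coefficients in $K_{i-1}$. The same is true of the y coordinates. Thus, if $K_i = K_{i-1}(x_i, y_i)$ where $(x_i, y_i)$ is one of the points of intersection, then

$$[K_{i-1}(x_i, y_i) : K_{i-1}] = [K_{i-1}(x_i, y_i) : K_{i-1}(x_i)]\,[K_{i-1}(x_i) : K_{i-1}] = 2 \times 2 = 4$$

{This assumes that $x_i, y_i \notin K_{i-1}$. If either $x_i$ or $y_i$, or both, are already in $K_{i-1}$, then $[K_{i-1}(x_i, y_i) : K_{i-1}] = 1$ or 2.}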

Circle intersects circle: Suppose the two circles have equations

$$x^2 + y^2 + ax + by + c = 0 \tag{3}$$

and

$$x^2 + y^2 + dx + ey + f = 0 \tag{4}$$

Then both points of intersection satisfy

$$(a - d)x + (b - e)y + (c - f) = 0 \tag{5}$$

obtained simply by subtracting (4) from (3). Thus, x and y may be found by solving (4) and (5) simultaneously, which is exactly the preceding case. ■
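The reduction used in the last two cases of the proof can be sketched in exact arithmetic. The code below is illustrative only: the names `radical_line` and `quadratic_in_x` are hypothetical, the base field is taken to be $\mathbb{Q}$ (via `Fraction`), and the line obtained by subtraction is assumed not to be vertical. It subtracts one circle equation from the other to get a line, then substitutes the line into a circle to get a quadratic in x whose coefficients stay in the base field.

```python
from fractions import Fraction

def radical_line(c1, c2):
    """Subtract circle (d, e, f) from circle (a, b, c).

    A triple (a, b, c) stands for x^2 + y^2 + a*x + b*y + c = 0.
    The result (p, q, r) stands for the line p*x + q*y + r = 0.
    """
    (a, b, c), (d, e, f) = c1, c2
    return a - d, b - e, c - f

def quadratic_in_x(line, circle):
    """Coefficients (A, B, C) of A*x^2 + B*x + C = 0 for x on line and circle.

    Assumes q != 0 in the line p*x + q*y + r = 0, so that y = m*x + k.
    """
    p, q, r = line
    d, e, f = circle
    m, k = -p / q, -r / q
    return 1 + m * m, 2 * m * k + d + e * m, k * k + e * k + f

F = Fraction
unit = (F(0), F(0), F(-1))       # x^2 + y^2 = 1
other = (F(-2), F(-2), F(0))     # (x - 1)^2 + (y - 1)^2 = 2
line = radical_line(unit, other)
A, B, C = quadratic_in_x(line, unit)
print(line, (A, B, C), B * B - 4 * A * C)
```

In this example the subtraction gives the line 2x + 2y − 1 = 0 and the quadratic $2x^2 - x - \tfrac{3}{4} = 0$, whose discriminant is 7. Since 7 is not the square of a rational number, the x coordinates of the intersection points lie in $\mathbb{Q}(\sqrt{7})$, an extension of degree 2.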

We are now in a position to prove the main result of this chapter:

Theorem 1: Basic theorem on constructible points If the point with coordinates (a, b) is constructible, then the degree of $\mathbb{Q}(a)$ over $\mathbb{Q}$ is a power of 2, and likewise for the degree of $\mathbb{Q}(b)$ over $\mathbb{Q}$.

PROOF: Let P be a constructible point; by definition, there are points $P_1, \ldots, P_n$ with coordinates $(a_1, b_1), \ldots, (a_n, b_n)$ such that each $P_i$ is constructible in one step from $(\mathbb{Q} \times \mathbb{Q}) \cup \{P_1, \ldots, P_{i-1}\}$, and $P_n = P$. Let the fields associated with $P_1, \ldots, P_n$ be $K_1, \ldots, K_n$. Then

$$[K_n : \mathbb{Q}] = [K_n : K_{n-1}]\,[K_{n-1} : K_{n-2}] \cdots [K_1 : \mathbb{Q}]$$

and by the preceding lemma this is a power of 2, say $2^m$. But a lies in $K_n$, so

$$[K_n : \mathbb{Q}] = [K_n : \mathbb{Q}(a)]\,[\mathbb{Q}(a) : \mathbb{Q}]$$

hence $[\mathbb{Q}(a) : \mathbb{Q}]$ is a factor of $2^m$, hence also a power of 2. The same argument applies to b. ■

We will now use this theorem to prove that ruler-and-compass constructions cannot possibly exist for the three classical problems described in the opening to this chapter.

Theorem 2 “Doubling the cube” is impossible by ruler and compass.

[Figure: a cube with one edge on the unit interval of the x axis.]

PROOF: Let us place the cube on a coordinate system so that one edge of the cube coincides with the unit interval on the x axis. That is, its endpoints are (0, 0) and (1, 0). If we were able to double the cube by ruler and compass, we could construct a point (c, 0) such that $c^3 = 2$. However, by Theorem 1, $[\mathbb{Q}(c) : \mathbb{Q}]$ would have to be a power of 2, whereas in fact it is 3, because c is a root of $x^3 - 2$, which is irreducible over $\mathbb{Q}$ by Eisenstein’s criterion with the prime 2. This contradiction proves that it is impossible to double the cube using only a ruler and compass. ■
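One way to see that $x^3 - 2$ has degree-3 behavior over $\mathbb{Q}$ is the rational root test: a cubic with no rational root cannot factor over $\mathbb{Q}$, since any factorization would include a linear factor. The sketch below is an illustration rather than the text's own argument; the function names `divisors` and `rational_roots` are hypothetical, and the polynomial is assumed to have integer coefficients with nonzero leading and constant terms.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """Rational roots of an integer polynomial, coefficients from highest degree down.

    Any rational root p/q in lowest terms has p dividing the constant term
    and q dividing the leading coefficient, so only finitely many candidates exist.
    """
    candidates = {Fraction(s * p, q)
                  for p in divisors(coeffs[-1])
                  for q in divisors(coeffs[0])
                  for s in (1, -1)}

    def value(x):
        acc = Fraction(0)
        for c in coeffs:          # Horner's rule
            acc = acc * x + c
        return acc

    return sorted(x for x in candidates if value(x) == 0)

print(rational_roots([1, 0, 0, -2]))   # x^3 - 2: the candidates are +-1 and +-2
print(rational_roots([2, -3, 1]))      # 2x^2 - 3x + 1 = (2x - 1)(x - 1), for contrast
```

The first call finds no rational roots, which is consistent with the degree of $\mathbb{Q}(c)$ over $\mathbb{Q}$ being 3.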

Theorem 3 “Trisecting the angle” by ruler and compass is impossible. That is, there exist angles which cannot be trisected using a ruler and compass.

PROOF: We will show specifically that an angle of 60° cannot be trisected. If we could trisect an angle of 60°, we would be able to construct a point (c, 0) (see figure) where c = cos 20°; hence certainly we could construct (b, 0) where b = 2 cos 20°.

[Figure: an angle of 20° at the origin and the point (c, 0) on the x axis, where c = cos 20°.]

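But from elementary trigonometry

$$\cos 3\theta = 4\cos^3 \theta - 3\cos \theta$$

hence, taking θ = 20°,

$$\cos 60^\circ = 4\cos^3 20^\circ - 3\cos 20^\circ$$

Since cos 60° = 1/2, multiplying by 2 gives $1 = 8c^3 - 6c = b^3 - 3b$. Thus, b = 2 cos 20° satisfies $b^3 - 3b - 1 = 0$. The polynomial

$$p(x) = x^3 - 3x - 1$$

is irreducible over $\mathbb{Q}$ because $p(x + 1) = x^3 + 3x^2 - 3$ is irreducible by Eisenstein’s criterion (with the prime 3). It follows that $\mathbb{Q}(b)$ has degree 3 over $\mathbb{Q}$, contradicting the requirement (in Theorem 1) that this degree has to be a power of 2. ■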
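The substitution x → x + 1 and the Eisenstein check in this proof can be verified mechanically. The following is a sketch under stated assumptions: polynomials are lists of integer coefficients from the highest degree down, and the names `shift` and `eisenstein` are hypothetical helpers, not part of the text.

```python
import math
from math import comb

def shift(coeffs, h):
    """Coefficients of p(x + h), given integer coefficients of p (highest degree first)."""
    n = len(coeffs) - 1
    out = [0] * (n + 1)
    for i, c in enumerate(coeffs):
        k = n - i                          # this term is c * x^k
        for j in range(k + 1):             # (x + h)^k = sum of C(k, j) h^(k-j) x^j
            out[n - j] += c * comb(k, j) * h ** (k - j)
    return out

def eisenstein(coeffs, p):
    """True if Eisenstein's criterion applies at the prime p."""
    lead, *middle, const = coeffs
    return (lead % p != 0
            and all(c % p == 0 for c in middle)
            and const % p == 0
            and const % (p * p) != 0)

p = [1, 0, -3, -1]                         # p(x) = x^3 - 3x - 1
shifted = shift(p, 1)
print(shifted, eisenstein(shifted, 3))     # the criterion is tested on p(x + 1)

b = 2 * math.cos(math.radians(20))
print(b**3 - 3*b - 1)                      # numerically close to 0
```

The criterion does not apply to p(x) itself (its constant term −1 is not divisible by any prime), which is why the proof passes to p(x + 1) first; irreducibility is unaffected by such a shift.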

Theorem 4 “Squaring the circle” by ruler and compass is impossible.

PROOF. If we were able to square the circle by ruler and compass, it would be possible to construct the point $(\sqrt{\pi}, 0)$; hence by Theorem 1, $[\mathbb{Q}(\sqrt{\pi}) : \mathbb{Q}]$ would be a power of 2. But it is well known that π is transcendental over $\mathbb{Q}$. By Theorem 3 of Chapter 29, the square of an algebraic element is algebraic; hence $\sqrt{\pi}$ is transcendental. It follows that $\mathbb{Q}(\sqrt{\pi})$ is not even a finite extension of $\mathbb{Q}$, much less an extension of some degree $2^m$ as required. ■

EXERCISES

† A. Constructible Numbers

If O and I are any two points in the plane, consider a coordinate system such that the interval OI coincides with the unit interval on the x axis.

[Figure: coordinate axes with O at the origin and I at the point (1, 0).]

Let $\mathbb{A}$ be the set of real numbers such that $a \in \mathbb{A}$ iff the point (a, 0) is constructible from {O, I}. Prove the following:

1 If $a, b \in \mathbb{A}$, then $a + b \in \mathbb{A}$ and $-a \in \mathbb{A}$.

2 If $a, b \in \mathbb{A}$, then $ab \in \mathbb{A}$. (HINT: Use similar triangles. See the accompanying figure.)

[Figure: similar triangles, for parts 2 and 3.]

3 If $a \in \mathbb{A}$ and $a \neq 0$, then $1/a \in \mathbb{A}$. (Use the same figure as in part 2.)

4 If a > 0 and $a \in \mathbb{A}$, then $\sqrt{a} \in \mathbb{A}$. (HINT: In the accompanying figure, AB is the diameter of a circle. Use an elementary property of chords of a circle to show that the marked length equals $\sqrt{a}$.)

[Figure: a circle with diameter AB, for part 4.]

It follows from parts 1 to 4 that $\mathbb{A}$ is a field, closed with respect to taking square roots of positive numbers. $\mathbb{A}$ is called the field of constructible numbers.

5 $\mathbb{Q} \subseteq \mathbb{A}$.

6 If a is a real root of any quadratic polynomial with coefficients in $\mathbb{A}$, then $a \in \mathbb{A}$. (HINT: Complete the square and use part 4.)

† B. Constructible Points and Constructible Numbers

Prove each of the following:

1 Let $\mathcal{A}$ be any set of points in the plane; (a, b) is constructible from $\mathcal{A}$ iff (a, 0) and (0, b) are constructible from $\mathcal{A}$.

2 If a point P is constructible from {O, I} [that is, from (0, 0) and (1, 0)], then P is constructible from $\mathbb{Q} \times \mathbb{Q}$.

# 3 Every point in $\mathbb{Q} \times \mathbb{Q}$ is constructible from {O, I}. (Use Exercise A5 and the definition of $\mathbb{A}$.)

4 If a point P is constructible from $\mathbb{Q} \times \mathbb{Q}$, it is constructible from {O, I}.

By combining parts 2 and 4, we get the following important fact: Any point P is constructible from $\mathbb{Q} \times \mathbb{Q}$ iff P is constructible from {O, I}. Thus, we may define a point to be constructible iff it is constructible from {O, I}.

5 A point P is constructible iff both its coordinates are constructible numbers.

† C. Constructible Angles

An angle α is called constructible iff there exist constructible points A, B, and C such that ∠ABC = α. Prove the following:

1 The angle α is constructible iff sin α and cos α are constructible numbers.

2 $\cos \alpha \in \mathbb{A}$ iff $\sin \alpha \in \mathbb{A}$.

3 If $\cos \alpha, \cos \beta \in \mathbb{A}$, then $\cos(\alpha + \beta), \cos(\alpha - \beta) \in \mathbb{A}$.

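4 $\cos(2\alpha) \in \mathbb{A}$ iff $\cos \alpha \in \mathbb{A}$.

5 If α and β are constructible angles, so are $\alpha + \beta$, $\alpha - \beta$, $\frac{1}{2}\alpha$, and $n\alpha$ for any positive integer n.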

# 6 The following angles are constructible: image.

7 The following angles are not constructible: 20°, 40°, 140°. (HINT: Use the proof of Theorem 3.)

D. Constructible Polygons

A polygon is called constructible iff its vertices are constructible points. Prove the following:

# 1 The regular n-gon is constructible iff the angle 2π/n is constructible.

2 The regular hexagon is constructible.

3 The regular polygon of nine sides is not constructible.

† E. A Constructible Polygon

We will show that 2π/5 is a constructible angle, and it will follow that the regular pentagon is constructible.

1 If r = cos k + i sin k is a complex number, prove that 1/r = cos k − i sin k. Conclude that r + 1/r = 2 cos k.

By de Moivre’s theorem,

$$\omega = \cos\frac{2\pi}{5} + i \sin\frac{2\pi}{5}$$

is a complex fifth root of unity. Since

$$x^5 - 1 = (x - 1)(x^4 + x^3 + x^2 + x + 1)$$

ω is a root of $p(x) = x^4 + x^3 + x^2 + x + 1$.

2 Prove that $\omega^2 + \omega + 1 + \omega^{-1} + \omega^{-2} = 0$.

3 Prove that

$$4\cos^2\frac{2\pi}{5} + 2\cos\frac{2\pi}{5} - 1 = 0$$

(HINT: Use parts 1 and 2.) Conclude that cos (2π/5) is a root of the quadratic $4x^2 + 2x - 1$.

4 Use part 3 and A6 to prove that cos (2π/5) is a constructible number.

5 Prove that 2π/5 is a constructible angle.

6 Prove that the regular pentagon is constructible.
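A floating-point check of parts 2 and 3 of this exercise can guard against sign slips in the quadratic. This is only a numerical sanity check, not a proof, and the variable names are arbitrary. It evaluates both $4x^2 + 2x - 1$ and the variant with the opposite sign on the linear term at x = cos (2π/5); only the former vanishes.

```python
import cmath
import math

w = cmath.exp(2j * math.pi / 5)            # omega = cos(2pi/5) + i sin(2pi/5)
c = math.cos(2 * math.pi / 5)

relation = w**2 + w + 1 + w**-1 + w**-2    # part 2: should be (numerically) zero
plus = 4 * c**2 + 2 * c - 1                # 4x^2 + 2x - 1 at x = cos(2pi/5)
minus = 4 * c**2 - 2 * c - 1               # 4x^2 - 2x - 1 at the same point
closed_form = (math.sqrt(5) - 1) / 4       # positive root of 4x^2 + 2x - 1

print(abs(relation), plus, minus, c - closed_form)
```

The closed form $(\sqrt{5} - 1)/4$ uses only rational numbers and one square root, which is what part 4 needs in order to apply Exercise A6.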

† F. A Nonconstructible Polygon

By de Moivre’s theorem,

$$\omega = \cos\frac{2\pi}{7} + i \sin\frac{2\pi}{7}$$

is a complex seventh root of unity. Since

$$x^7 - 1 = (x - 1)(x^6 + x^5 + x^4 + x^3 + x^2 + x + 1)$$

ω is a root of $x^6 + x^5 + x^4 + x^3 + x^2 + x + 1$.

1 Prove that $\omega^3 + \omega^2 + \omega + 1 + \omega^{-1} + \omega^{-2} + \omega^{-3} = 0$.

2 Prove that

$$8\cos^3\frac{2\pi}{7} + 4\cos^2\frac{2\pi}{7} - 4\cos\frac{2\pi}{7} - 1 = 0$$

(Use part 1 and Exercise E1.) Conclude that cos (2π/7) is a root of $8x^3 + 4x^2 - 4x - 1$.

3 Prove that $8x^3 + 4x^2 - 4x - 1$ has no rational roots. Conclude that it is irreducible over $\mathbb{Q}$.

4 Conclude from part 3 that cos (2π/7) is not a constructible number.

5 Prove that 2π/7 is not a constructible angle.

6 Prove that the regular polygon of seven sides is not constructible.
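The claims in parts 1 to 3 of this exercise can be sanity-checked numerically before attempting the proofs. The sketch below is not a proof; the names `q`, `candidates`, and `rational_roots` are arbitrary. It evaluates the relation among the powers of ω and the cubic at cos (2π/7) in floating point, then tests in exact arithmetic the only possible rational roots of $8x^3 + 4x^2 - 4x - 1$, namely ±1, ±1/2, ±1/4, ±1/8.

```python
import cmath
import math
from fractions import Fraction

w = cmath.exp(2j * math.pi / 7)                 # omega = cos(2pi/7) + i sin(2pi/7)
c = math.cos(2 * math.pi / 7)

relation = sum(w**k for k in range(-3, 4))      # part 1: should be (numerically) zero
cubic_at_c = 8 * c**3 + 4 * c**2 - 4 * c - 1    # part 2: should be (numerically) zero

def q(x):
    """The cubic from part 2, evaluated exactly when x is a Fraction."""
    return 8 * x**3 + 4 * x**2 - 4 * x - 1

# Numerator divides the constant term 1; denominator divides the leading coefficient 8.
candidates = [Fraction(s, d) for d in (1, 2, 4, 8) for s in (1, -1)]
rational_roots = [x for x in candidates if q(x) == 0]

print(abs(relation), cubic_at_c, rational_roots)
```

An empty list of rational roots is what part 3 asks for: a cubic over $\mathbb{Q}$ with no rational root has no linear factor and is therefore irreducible.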

G. Further Properties of Constructible Numbers and Figures

Prove each of the following:

1 If the number a is a root of an irreducible polynomial $p(x) \in \mathbb{Q}[x]$ whose degree is not a power of 2, then a is not a constructible number.

# 2 Any constructible number can be obtained from rational numbers by repeated addition, subtraction, multiplication, division, and taking square roots of positive numbers.

3 $\mathbb{A}$ is the smallest field extension of $\mathbb{Q}$ closed with respect to square roots of positive numbers (that is, any field extension of $\mathbb{Q}$ closed with respect to square roots contains $\mathbb{A}$). (Use part 2 and Exercise A.)

4 All the roots of the polynomial $x^4 - 3x^2 + 1$ are constructible numbers.

A line is called constructible if it passes through two constructible points. A circle is called constructible if its center and radius are constructible.

5 The line ax + by + c = 0 is constructible if $a, b, c \in \mathbb{A}$.

6 The circle $x^2 + y^2 + ax + by + c = 0$ is constructible if $a, b, c \in \mathbb{A}$.