Basic Math and Pre-Algebra

PART 4. The State of the World

 

CHAPTER 20. Measures of Center and Spread

 

When researchers have a question, they perform many experiments or pose the question to many people, and then they record all the results or responses. All that data can provide a great deal of information, but there may be so much of it that it’s difficult to make sense of. It can turn into just a flood of numbers.

To make sense of information, you often need ways to summarize it and describe the patterns in it. Statistics are numbers that summarize collections of data or information. They help you to draw conclusions about the data.

In this chapter, you’ll look at the statistics that give you the most important information. You’ll find the measures of center, the numbers that tell you about the average or typical value. You’ll also find the numbers that divide the data into groups, allowing you to compare different results. For the big picture, you’ll talk about how spread out the data are, using a few simple numbers (and one not-so-simple one).

 

The Centers

 

One of the ways you can represent a data set is by identifying a central tendency. A central tendency is the center value or a typical value of a particular data set. Identifying this value gives you a sense of the result you might get if you did the experiment again or asked the question again. Knowing this measure of center begins to give you a sense of the typical result, what you can expect. That expectation is approximate, of course. Every time you do the experiment or ask the question you will probably get a different value, but the more you do that, the better your sense of what to expect will become.

There are three different measures of center in common use. They’re called the mean, the median, and the mode. The mean requires the most calculation, but unless you have an extremely large set of numbers, the calculation isn’t difficult. The median only asks you to put numbers in order and the mode only requires you to count.

 

Mean

The mean is the number most people think of when you say “average.” It finds the center by balancing out the highs and lows of the numbers. The mean is found by adding all the data items and dividing by the number of items.

 

DEFINITION

The mean of a set of data is a measure of center found by adding all the data values and dividing by the number of values.

 

If you take three tests and earn grades of 84, 91, and 77, you find your average grade, or mean grade, by first adding up your three test scores: 84 + 91 + 77 = 252. Then you divide that total by 3 because it represents three tests: 252 ÷ 3 = 84. Your mean score is 84. The mean will not always be the same as one of the values, but it will always fall somewhere between the lowest and highest values. Here, your high score of 91 is 7 points above 84 and your low score of 77 is 7 points below it, so they balance each other out, and the average ends up being the same as your middle score of 84.

If one or more of the values you’re averaging are much higher or much lower than most of the group, the mean will be pulled toward that extreme value. Suppose you accidentally copied that grade of 77 as just a 7. That would make that score significantly lower than your other grades. When you found the average, you’d get (84 + 91 + 7) ÷ 3 = 182 ÷ 3 ≈ 60.67. That gigantic drop in your average is caused by that extremely low value.
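The arithmetic in the two grade examples can be checked with a few lines of Python. This is only a minimal sketch: the function name `mean` and the variable names are illustrative choices, not something the chapter defines.

```python
def mean(values):
    """Add all the values, then divide by how many values there are."""
    return sum(values) / len(values)

grades = [84, 91, 77]
miscopied = [84, 91, 7]   # the 77 copied down as a 7 by mistake

correct_mean = mean(grades)                   # 252 / 3
miscopied_mean = round(mean(miscopied), 2)    # 182 / 3, rounded to two places

print(correct_mean)
print(miscopied_mean)
```

Changing a single grade moves the mean by more than 23 points, which is the pull toward an extreme value that the paragraph describes.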

Here’s an example with a larger data set and larger numbers. Calculate carefully to find the mean amount of land in parks and wildlife areas in 2002 in the states shown below.

 

Land in Rural Parks and Wildlife Areas 2002 (1,000 acres)

Michigan     1,436
Wisconsin    1,000
Minnesota    2,959
Ohio           372
Indiana        264
Illinois       432
Iowa           327
Missouri       649

 

First add the values given for each of the eight states. The total should be 7,439. Then divide by 8, because you added up the acreage for eight states. 7,439 ÷ 8 = 929.875. Keep in mind that the numbers you averaged are in thousands of acres, so the mean you calculated is not actually 929.875 acres, but 929.875 thousand acres or 929,875 acres.
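The same calculation, including the change of units from thousands of acres to acres, might look like this in Python. The dictionary name `acreage_thousands` and the other variable names are assumptions made for illustration.

```python
# Acreage in thousands of acres, copied from the table above.
acreage_thousands = {
    "Michigan": 1436,
    "Wisconsin": 1000,
    "Minnesota": 2959,
    "Ohio": 372,
    "Indiana": 264,
    "Illinois": 432,
    "Iowa": 327,
    "Missouri": 649,
}

total = sum(acreage_thousands.values())            # add the eight values
mean_thousands = total / len(acreage_thousands)    # divide by the count, 8
mean_acres = mean_thousands * 1000                 # thousands of acres -> acres

print(total, mean_thousands, mean_acres)
```

Keeping the units in the variable names is one simple way to avoid reporting 929.875 acres when the figure is really in thousands.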

 

CHECK POINT

Find the mean of each set of data.

1. A = {2, 2, 2, 3, 3, 4, 4, 4, 4}

2. B = {34, 54, 78, 92, 101}

3. C = {3, 4, 5, 4, 7, 8, 9, 2, 10, 1}

4. D = {32, 34, 36, 38}

5. E = {2, 2, 3, 4, 5}

 

Median

The median is the middle value when a set of data has been ordered from smallest to largest or largest to smallest. You’ve probably heard the word median used in other situations. In geometry, a median of a triangle connects a vertex to the midpoint of the opposite side. The strip of grass in the middle of a highway that divides the lanes moving in one direction from the lanes in the other direction is also called the median. Medians are always about the middle.

 

DEFINITION

The median of a set of data is the middle value when the data are ordered from smallest to largest. If the data set contains an even number of values, the median is the average of the two middle values.

 

To find a median, you just need to put the data in order and find the value that falls exactly in the middle of the list. If there is an even number of data points, and two numbers seem to be in the middle, the mean of those two numbers is the median. If the data set is very large, sorting and counting to find the middle can be tedious, and that’s a good time to use a computer to help with the task. But for smaller data sets, finding the median is fairly simple.

Suppose a class of 10 students earned the grades below on an exam.

78, 59, 92, 82, 74, 97, 63, 75, 66, 88

To find the median grade, put the grades in order. Usually, people sort from low to high, but high to low will work, too.

59, 63, 66, 74, 75, 78, 82, 88, 92, 97

There are 10 grades, so the median will be the average of the fifth and sixth grades. Counting in from the low end, the fifth grade is 75 and the sixth is 78. The median is (75 + 78) ÷ 2 = 153 ÷ 2 = 76.5. The median grade is 76.5.
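The sort-then-find-the-middle procedure can be sketched in Python as follows. The function name `median` is an illustrative choice, and the code assumes the list is not empty.

```python
def median(values):
    """Return the middle value of the sorted data.

    With an even number of values, return the mean of the two middle values.
    """
    ordered = sorted(values)          # low to high
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                    # odd count: a single middle value
        return ordered[middle]
    # even count: average the two values on either side of the middle
    return (ordered[middle - 1] + ordered[middle]) / 2

grades = [78, 59, 92, 82, 74, 97, 63, 75, 66, 88]
median_grade = median(grades)         # averages the fifth and sixth grades

print(sorted(grades))
print(median_grade)
```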

Earlier you found the mean number of acres of land in parks and wildlife areas for eight states in 2002. To find the median of the same data, first put the list in order by acreage.

 

Land in Rural Parks and Wildlife Areas 2002 (1,000 acres)

Minnesota    2,959
Michigan     1,436
Wisconsin    1,000
Missouri       649
Illinois       432
Ohio           372
Iowa           327
Indiana        264

 

The median will be the average of the fourth and fifth values: (649 + 432) ÷ 2 = 1,081 ÷ 2 = 540.5 thousand acres, or 540,500 acres. Remember that the mean was 929,875 acres. The mean is larger than the median because it’s pulled toward the large acreage for Minnesota. The median doesn’t get pulled in the same way, which is why statisticians say that the median is resistant to extreme values.
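One way to see the difference between the two measures is to compute both with and without the largest value. The sketch below uses Python’s standard `statistics` module. Leaving out Minnesota is a hypothetical what-if added here for illustration, not a step from the chapter.

```python
from statistics import mean, median

# Thousands of acres for the eight states in the chapter's table.
acreage = [1436, 1000, 2959, 372, 264, 432, 327, 649]

# Hypothetical what-if (not in the chapter): leave out the largest value.
without_largest = sorted(acreage)[:-1]

mean_all, median_all = mean(acreage), median(acreage)
mean_rest, median_rest = mean(without_largest), median(without_largest)

# How far each measure moves when the extreme value is removed.
mean_shift = mean_all - mean_rest
median_shift = median_all - median_rest

print(mean_all, median_all)
print(mean_rest, median_rest)
print(mean_shift, median_shift)
```

Under this what-if, the mean moves by well over twice as much as the median, which is consistent with calling the median the more resistant of the two.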

 

CHECK POINT

Find the median of each set of data.

6. A = {2, 2, 2, 3, 3, 4, 4, 4, 4}

7. B = {34, 54, 78, 92, 101}

8. C = {3, 4, 5, 4, 7, 8, 9, 2, 10, 1}

9. D = {32, 34, 36, 38}

10. E = {2, 2, 3, 4, 5}

 

Mode

Most people recognize the word mode, but not always from math. You’ve probably heard the phrase “a la mode” used in describing a dessert. It really doesn’t mean “with ice cream.” It actually means “according to the fashion.” It was fashionable to add ice cream to a dessert, and doing so became quite common.

 

DEFINITION

The mode of a set of data is the value that occurs most frequently.

 

In statistics, the mode is the most fashionable value, the one that occurs most often. When you look at a set of data and see a value that is repeated frequently, that value may be the mode of the data set. If more than one value repeats, the one that repeats most often is the mode.

In the large data set below, you’ll find several values that occur twice: 9.8, 10.8, 13.3, 14.7, and 18.6. There is one value, 17.7, that occurs three times. The mode is the most common value, the one that occurs most frequently, so in this data set, the mode is 17.7.

 

6.9   8.7   9.7   10.7  11.5  13.3  14    14.7  17.3  18.6
7.8   8.9   9.8   10.8  12.1  13.4  14.1  14.9  17.6  18.6
8.1   9     9.8   10.8  12.4  13.6  14.4  15.4  17.7  19
8.2   9.4   10.3  10.9  13    13.7  14.6  15.9  17.7  19.9
8.5   9.6   10.6  11.4  13.3  13.8  14.7  16.8  17.7  22.7
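Counting repeats by eye in a set of 50 values is error-prone, so a tally is a reasonable job for a computer. The sketch below uses `collections.Counter` from Python’s standard library. The variable names are illustrative.

```python
from collections import Counter

# The 50 values from the data set above.
data = [
    6.9, 8.7, 9.7, 10.7, 11.5, 13.3, 14, 14.7, 17.3, 18.6,
    7.8, 8.9, 9.8, 10.8, 12.1, 13.4, 14.1, 14.9, 17.6, 18.6,
    8.1, 9, 9.8, 10.8, 12.4, 13.6, 14.4, 15.4, 17.7, 19,
    8.2, 9.4, 10.3, 10.9, 13, 13.7, 14.6, 15.9, 17.7, 19.9,
    8.5, 9.6, 10.6, 11.4, 13.3, 13.8, 14.7, 16.8, 17.7, 22.7,
]

counts = Counter(data)                      # value -> number of occurrences

# Every value that appears more than once, with its count.
repeated = {value: n for value, n in sorted(counts.items()) if n > 1}

# The single most frequent value and how many times it occurs.
mode_value, mode_count = counts.most_common(1)[0]

print(repeated)
print(mode_value, mode_count)
```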

 

Not every data set will have a mode. Many are made up of values that are all different. And some will have more than one value that repeats the same number of times, so you can say they have more than one mode. Sometimes that’s interesting information, but often it just means that there isn’t a really common value.
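The three situations described here (no mode, one mode, more than one mode) can be handled by a single small function. This is a sketch under stated assumptions: the function name `find_modes` is hypothetical, and the three short lists are made-up illustrations, not data from the chapter.

```python
from collections import Counter

def find_modes(values):
    """Return a sorted list of modes, or an empty list if no value repeats."""
    counts = Counter(values)
    if not counts:
        return []                      # no data, so no mode
    highest = max(counts.values())
    if highest == 1:
        return []                      # every value is different: no mode
    return sorted(v for v, n in counts.items() if n == highest)

# Made-up lists, for illustration only.
no_mode = find_modes([5, 8, 11, 20])        # all values different
one_mode = find_modes([1, 3, 3, 6])         # 3 repeats most often
two_modes = find_modes([1, 1, 4, 6, 6])     # 1 and 6 tie

print(no_mode, one_mode, two_modes)
```

Returning a list in every case keeps the result easy to interpret: its length tells you whether the data set has zero, one, or several modes.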

The data set below shows the scores (out of 18) earned by 40 subjects in an experiment.

 

Test Scores

14  14  8   9   13  11  13  9
10  9   11  15  13  12  13  9
16  12  15  14  8   12  17  15
15  16  11  16  14  16  13  12
17  7   13  8   16  14  14  10

 

To find the mode, it helps to sort the data, so that duplicate values are together and easier to count. Here’s the same data sorted from low to high.

 

Test Scores

7   9   11  12  13  14  15  16
8   9   11  12  13  14  15  16
8   9   11  13  13  14  15  16
8   10  12  13  14  14  16  17
9   10  12  13  14  15  16  17

 

With the data sorted, you can see that there are many repeated values, but you want to find the most common value. It turns out that this data set is bimodal. That means it has two modes: 13 and 14. Each of those scores occurs six times. The fact that the two modes sit side by side is a strong indicator that these are typical values.
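For a bimodal data set like this one, Python’s standard library offers `statistics.multimode` (available in Python 3.8 and later), which returns every value tied for most frequent. The sketch below applies it to the 40 test scores. The variable names are illustrative.

```python
from statistics import multimode

# The 40 test scores, in the order first listed.
scores = [
    14, 14, 8, 9, 13, 11, 13, 9,
    10, 9, 11, 15, 13, 12, 13, 9,
    16, 12, 15, 14, 8, 12, 17, 15,
    15, 16, 11, 16, 14, 16, 13, 12,
    17, 7, 13, 8, 16, 14, 14, 10,
]

# multimode lists modes in the order they first appear, so sort them.
modes = sorted(multimode(scores))

# How many times each mode occurs.
mode_counts = {m: scores.count(m) for m in modes}

print(modes)
print(mode_counts)
```

Note that no sorting of the data is needed before calling `multimode`. Sorting helps a person count by eye, but the function tallies the values directly.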

 

CHECK POINT

Find the mode of each set of data.

11. A = {2, 2, 2, 3, 3, 4, 4, 4, 4}

12. B = {34, 54, 78, 92, 101}

13. C = {3, 4, 5, 4, 7, 8, 9, 2, 10, 1}

14. D = {32, 34, 36, 38}

15. E = {2, 2, 3, 4, 5}