Basic Math and Pre-Algebra

PART 4. The State of the World

 

CHAPTER 20. Measures of Center and Spread

 

The Spread

It’s good to know where the center of your data is. That tells you the average value. But only knowing the center of your data is like only knowing the center of a circle. You know where it is, but you don’t really know what it looks like, because you don’t know how big it is. You can’t draw the circle until you know the center and the radius, and you don’t have a good picture of your data until you know the center and the spread.

Measures of spread tell you whether all the numbers are clumped up close to the average or whether they’re spread all over the place. If, over the course of the semester, you earned test scores of 69, 70, 73, 74, and 74, you’d have a mean score of 72 and a median score of 73. If you earned test scores of 43, 61, 73, 85 and 98, you’d also have a mean score of 72 and a median of 73, but the two sets of scores give very different pictures of how your semester went. Knowing how spread out the numbers are can also be important information.
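To see for yourself that the two sets of scores share the same center, here is a minimal Python sketch that uses the standard library's statistics module. The variable names are illustrative and are not part of the original example.

```python
from statistics import mean, median

# The two sets of test scores from the paragraph above.
clumped_scores = [69, 70, 73, 74, 74]
spread_scores = [43, 61, 73, 85, 98]

# Both sets report the same mean and the same median,
# even though the scores themselves look very different.
for label, scores in [("clumped", clumped_scores), ("spread out", spread_scores)]:
    print(label, "mean:", mean(scores), "median:", median(scores))
```

Both lines of output show a mean of 72 and a median of 73, which is exactly why a measure of center alone cannot tell the two semesters apart.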

Range

The simplest measure of the spread is called the range. It’s just the difference between the highest value and the lowest value. Those test scores of 69, 70, 73, 74, and 74 have a range of 74 - 69 = 5, but the scores of 43, 61, 73, 85, and 98 have a range of 98 - 43 = 55. The much larger range for the second set tells you that the numbers varied a great deal. The smaller range says that the numbers clumped up fairly close to the mean or median.
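The range calculation is short enough to sketch in a few lines of Python. The function name value_range is an illustrative choice, not a standard one.

```python
def value_range(values):
    """Return the range: the highest value minus the lowest value."""
    return max(values) - min(values)

print(value_range([69, 70, 73, 74, 74]))  # 5
print(value_range([43, 61, 73, 85, 98]))  # 55
```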

Interquartile Range

The interquartile range, or IQR, is similar to the range, as you can tell from its name. The other part of the name, interquartile, means between the quartiles. To find the range, you subtract the minimum value from the maximum value. To find the interquartile range, you subtract the first quartile from the third quartile (Q3 - Q1).

Look back at the data about number of books read that you used to find quartiles.

Number of books read last year: 16, 23, 13, 24, 25, 16, 17, 28, 19, 14, 12, 22, 13, 24, 15, 26, 27, 18, 29

This data set has a minimum of 12 and a maximum of 29, for a range of 17. You found Q1 = 15, median = 19, and Q3 = 25. The interquartile range is Q3 - Q1 = 25 - 15 = 10.

The reason you sometimes want to use the interquartile range instead of the range is that the very high and very low values in your data often straggle far away from the other data. That exaggerates the range. The IQR cuts off those straggly parts but still gives you a sense of the spread.
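The following Python sketch computes the quartiles and the IQR for the books data. It assumes the median-of-halves method, leaving the overall median out of both halves when the count is odd, which reproduces Q1 = 15 and Q3 = 25; other quartile conventions can give slightly different answers. The value 60 added at the end is a hypothetical straggler, not part of the original data, included only to show how it affects the range and the IQR.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method.

    Assumption: with an odd count, the overall median is left out of
    both halves. The list needs at least two values.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]                  # values below the median
    upper = ordered[len(ordered) - half:]   # values above the median
    return median(lower), median(ordered), median(upper)

books = [16, 23, 13, 24, 25, 16, 17, 28, 19, 14,
         12, 22, 13, 24, 15, 26, 27, 18, 29]

q1, med, q3 = quartiles(books)
print("range:", max(books) - min(books))  # 17
print("IQR:", q3 - q1)                    # 10

# Hypothetical straggler: one reader who finished 60 books.
with_straggler = books + [60]
s_q1, s_med, s_q3 = quartiles(with_straggler)
print("range:", max(with_straggler) - min(with_straggler))  # 48
print("IQR:", s_q3 - s_q1)                                  # 10.0
```

With the one extra value, the range jumps from 17 to 48 while the IQR stays at 10.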

Standard Deviation

The third commonly used measure of spread is more complicated to find. It’s called the standard deviation, and like the range and the IQR, the bigger it is, the more spread out the data are. The standard deviation tells you how much the numbers in the data set typically vary from the mean. For this reason, it is often paired up with the mean; for example, you might hear that a data set has a mean of 42 with a standard deviation of 3. A low standard deviation tells you that the numbers in the data set are close to the mean, while a high standard deviation indicates that the numbers in a data set are far from the mean.

Understanding the standard deviation is harder than understanding the range, but here’s a way to think about what the mean and standard deviation tell you.

The mean tells you how to locate the center of the data set.

The IQR tells you where the middle 50 percent of the data is.

The range tells you where 100 percent of the data is.

The standard deviation breaks the range up into sections, letting you gauge how far from the mean another value is.

The standard deviation is like a ruler, measuring how far from the center a value falls.

So what is the standard deviation? The deviation part refers to how far from the mean each number in the data set is. That’s the simple piece. The standard part refers to the more complicated work that’s done to avoid or eliminate things that could confuse the information.

Calculators and computers often do the work of finding a standard deviation, especially for large sets of numbers, but let’s go through the steps once, just so you know what’s going on. Let’s use the test scores of 43, 61, 73, 85, and 98 from the earlier example about spread.

The first step in finding a standard deviation is to find the mean. You know from the earlier example that the mean of this data is 72. The next step is to subtract that mean from each number.

Number    Number - Mean
43        43 - 72 = -29
61        61 - 72 = -11
73        73 - 72 = 1
85        85 - 72 = 13
98        98 - 72 = 26

The basic idea is to average these deviations, but if you add them up right now, the negative numbers and the positive numbers will cancel each other out. You could take the absolute value of each deviation, add the absolute values, and divide by the number of them you have. That would give you the mean absolute deviation. But the standard deviation calculation squares the deviations first. Like the absolute value, that makes everything positive.
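A short Python sketch can confirm both points: the raw deviations cancel out, and the absolute values give a mean absolute deviation. The variable names are illustrative.

```python
scores = [43, 61, 73, 85, 98]
mean_score = sum(scores) / len(scores)  # 72.0

# Deviation of each score from the mean.
deviations = [x - mean_score for x in scores]
print(deviations)       # [-29.0, -11.0, 1.0, 13.0, 26.0]

# Positive and negative deviations cancel when added as they are.
print(sum(deviations))  # 0.0

# Mean absolute deviation: average the sizes of the deviations instead.
mean_abs_dev = sum(abs(d) for d in deviations) / len(deviations)
print(mean_abs_dev)     # 16.0
```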

Number    Number - Mean      Squared
43        43 - 72 = -29      (-29)² = 841
61        61 - 72 = -11      (-11)² = 121
73        73 - 72 = 1        1² = 1
85        85 - 72 = 13       13² = 169
98        98 - 72 = 26       26² = 676

Now you add up all those squares and get 1,808. Next, divide. If your data set comes from the whole population of interest, divide by the number of values. If you’re working with just a sample from a larger group, divide by one less than that. In this data set you have 5 values, so if this set is everyone’s test, divide by 5. If it’s only a sample of the test takers, divide by 4. 1,808 ÷ 5 = 361.6 or 1,808 ÷ 4 = 452. The last step is to undo the squaring by taking the square root of 361.6 or 452. √361.6 ≈ 19.02 or √452 ≈ 21.26. This set of test scores has a mean of 72 and a standard deviation of either 19.02 or 21.26. That’s a pretty big standard deviation, either way, telling you those test scores are very spread out.
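The whole calculation can be sketched in Python. The first part follows the steps by hand; the last line checks the results against pstdev (population) and stdev (sample) from the standard library's statistics module. The variable names are illustrative.

```python
from math import sqrt
from statistics import pstdev, stdev

scores = [43, 61, 73, 85, 98]
mean_score = sum(scores) / len(scores)  # 72.0

# Square each deviation from the mean, then add the squares.
squared = [(x - mean_score) ** 2 for x in scores]
total = sum(squared)  # 1808.0

# Divide by n for a whole population, or by n - 1 for a sample,
# then take the square root to undo the squaring.
population_sd = sqrt(total / len(scores))
sample_sd = sqrt(total / (len(scores) - 1))
print(round(population_sd, 2), round(sample_sd, 2))  # 19.02 21.26

# The library functions should agree with the hand calculation.
print(round(pstdev(scores), 2), round(stdev(scores), 2))
```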

CHECK POINT

21. Find the range of {34, 54, 78, 92, 101}

22. Find the range of {3, 4, 5, 4, 7, 8, 9, 2, 10, 1}

23. Find the interquartile range of {3, 4, 5, 4, 7, 8, 9, 2, 10, 17, 32, 34, 36, 38}

24. Find the IQR of {2, 2, 2, 3, 3, 4, 4, 4, 4, 34, 54, 78, 92, 101}

25. Find the standard deviation of the test scores {69, 70, 73, 74, 74}

The Least You Need to Know

• Measures of center and spread help you make sense of large amounts of data.

• The mean is the average of a set of numbers, the median is the middle number in a set of numbers, and the mode is the most common number in a set of numbers.

• Numbers like quartiles and percentiles let you place one number within the group.

• Range, IQR, and standard deviation tell you how widely spread the information is.