Distributions - Data-Based and Statistical Reasoning - MCAT Physics and Math Review

## Chapter 12: Data-Based and Statistical Reasoning

### 12.2 Distributions

Often a single statistic for a data set is insufficient for a detailed or relevant analysis. In this case, it is useful to look at the overall shape of the distribution as well as specifics about how that shape impacts our interpretation of the data. The shape of a distribution will impact all of the measures of central tendency that we have already discussed, as well as some measures of distribution, which we will examine later.

NORMAL DISTRIBUTIONS

In statistics, we most often work with a normal distribution, shown in Figure 12.1. Even when we know that our data are not quite normally distributed, we can use special techniques so that the data approximate a normal distribution. This is very important because the normal distribution has been “solved”: any normal distribution can be transformed to a standard distribution with a mean of zero and a standard deviation of one by subtracting the mean from each value and dividing by the standard deviation, and the resulting curve can then be used to get information about probabilities or percentages of populations. The normal distribution is also the basis for the bell curve seen in many scenarios, including exam scores on the MCAT.
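To make the transformation to the standard distribution concrete, the following is a minimal sketch using Python's standard-library `statistics.NormalDist`. The exam mean of 100 and standard deviation of 15 are hypothetical values chosen only for illustration, and `z_score` is a helper name introduced here.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Convert a raw value to standard units: z = (x - mean) / sd."""
    return (x - mean) / sd

# Hypothetical exam (assumed values): normally distributed scores,
# mean 100 and standard deviation 15.
mean, sd = 100.0, 15.0
raw_score = 122.5

z = z_score(raw_score, mean, sd)  # 1.5 standard deviations above the mean

# Standard distribution: mean 0, standard deviation 1.
standard = NormalDist(mu=0.0, sigma=1.0)
fraction_below = standard.cdf(z)  # share of the population scoring below raw_score

# The untransformed distribution gives the same answer, which is why
# one standard curve is enough for every normal distribution.
direct = NormalDist(mu=mean, sigma=sd).cdf(raw_score)

print(f"z = {z:.2f}, fraction below = {fraction_below:.4f}")
```

Under these assumed values, a score 1.5 standard deviations above the mean sits above roughly 93% of the population.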

Figure 12.1. The Normal Distribution. The mean, median, and mode are at the center of the distribution. Approximately 68% of the distribution is within one standard deviation of the mean, 95% within two, and 99.7% within three.
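The percentages in the caption are rounded values. As a quick check, the sketch below computes the fraction of a standard distribution lying within one, two, and three standard deviations of the mean. `fraction_within` is a helper name introduced here for illustration.

```python
from statistics import NormalDist

standard = NormalDist()  # defaults to mean 0, standard deviation 1

def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard.cdf(k) - standard.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k):.1%}")

# Prints:
# within 1 SD: 68.3%
# within 2 SD: 95.4%
# within 3 SD: 99.7%
```

Because standardizing does not change these fractions, the same percentages hold for any normal distribution, whatever its mean and standard deviation.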

KEY CONCEPT

The normal distribution and its counterpart, the standard distribution, are the basis of most statistical testing on the MCAT. In the normal distribution, all of the measures of central tendency are the same.

SKEWED DISTRIBUTIONS

Distributions are not always symmetrical. A skewed distribution is one that contains a tail on one side or the other of the data set. On the MCAT, skewed distributions are most often tested by simply identifying their type. This is often an area of confusion for students because the visual shift in the data appears opposite the direction of the skew: the bulk of the data sits on the side away from the tail. A negatively skewed distribution has a tail on the left (or negative) side, whereas a positively skewed distribution has a tail on the right (or positive) side. Because the mean is more susceptible to outliers than the median, the mean of a negatively skewed distribution will be lower than the median, while the mean of a positively skewed distribution will be higher than the median. These distributions, and their measures of central tendency, are shown in Figure 12.2.

Figure 12.2. Skewed Distributions. (a) Negatively skewed distribution, with mean lower than median; (b) Positively skewed distribution, with mean higher than median.
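The relationship between the tail and the mean can be seen with a small numerical example. The data sets below are hypothetical, and `skew_direction` is a helper introduced here; comparing the mean with the median is a rule of thumb for illustration, not a formal test of skewness.

```python
from statistics import mean, median

# Hypothetical data: most values cluster low, and one large value forms a right tail.
positively_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20]
# Mirror image of the same data: values cluster high, with a tail on the left.
negatively_skewed = [21 - x for x in positively_skewed]

def skew_direction(data):
    """Label skew by comparing the mean with the median (rule of thumb only)."""
    m, med = mean(data), median(data)
    if m > med:
        return "positive (tail on the right)"
    if m < med:
        return "negative (tail on the left)"
    return "no skew detected by this rule"

for name, data in (("positively_skewed", positively_skewed),
                   ("negatively_skewed", negatively_skewed)):
    print(name, "mean:", mean(data), "median:", median(data),
          "->", skew_direction(data))
```

In the first data set the single value of 20 pulls the mean (4.7) above the median (3), even though most of the data sit on the left. In the mirrored set the mean (16.3) falls below the median (18).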

KEY CONCEPT

The direction of skew in a sample is determined by its tail, not the bulk of the distribution.

BIMODAL DISTRIBUTIONS

Some distributions have two or more peaks. A distribution containing two peaks with a valley in between is called bimodal, as shown in Figure 12.3. It is important to note that a bimodal distribution, strictly speaking, might have only one mode if one peak is slightly higher than the other. The presence of two peaks of different sizes does not disqualify a distribution from being considered bimodal. If there is sufficient separation of the two peaks, or a sufficiently small amount of data within the valley region, bimodal distributions can often be analyzed as two separate distributions. On the other hand, bimodal distributions do not have to be analyzed as two separate distributions; the same measures of central tendency and measures of distribution can be applied to the data set as a whole.

Figure 12.3. Bimodal Distribution
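Both approaches to a bimodal data set can be sketched in a few lines. The sample and the cut point `valley = 20` are hypothetical; in practice the split would be chosen by inspecting the distribution, and it only makes sense when the valley contains little or no data.

```python
from statistics import mean, median, multimode, stdev

# Hypothetical bimodal sample: one cluster near 10, another near 30.
data = [8, 9, 10, 10, 10, 11, 12, 28, 29, 30, 30, 31, 32]

# Option 1: treat the data as a single distribution.
overall_mean = mean(data)
overall_median = median(data)
overall_sd = stdev(data)
modes = multimode(data)  # only one strict mode, because the lower peak is taller

# Option 2: split at the valley and analyze each peak separately.
valley = 20  # assumed cut point in the empty region between the peaks
lower = [x for x in data if x < valley]
upper = [x for x in data if x >= valley]

print("overall:", round(overall_mean, 2), overall_median, round(overall_sd, 2), modes)
print("lower peak:", mean(lower), round(stdev(lower), 2))
print("upper peak:", mean(upper), round(stdev(upper), 2))
```

In this sample the overall mean (about 19.2) lands in the valley, where no data points lie, while the two separate means (10 and 30) each describe their own peak. The overall statistics remain valid; they simply answer a different question.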

MCAT Concept Check 12.2:

Before you move on, assess your understanding of the material with these questions.

1.    How do the mean, median, and mode compare for a right-skewed distribution?

2.    Can data that do not follow a normal distribution be analyzed with measures of central tendency and measures of distribution? Why or why not?

3.    How do bimodal distributions differ from normal and skewed distributions?
