MCAT Physics and Math Review

Chapter 12: Data-Based and Statistical Reasoning

12.5 Statistical Testing

Hypothesis testing and confidence intervals allow us to draw conclusions about populations based on our sample data. Both are interpreted in the context of probabilities, and what we deem to be an acceptable risk of error.


Hypothesis testing begins with an idea about what may be different between two populations. We have a null hypothesis, which is always a hypothesis of equivalence. In other words, the null hypothesis says that two populations are equal, or that a single population can be described by a parameter equal to a given value. The alternative hypothesis may be nondirectional (that the populations are not equal) or directional (for example, that the mean of population A is greater than the mean of population B).
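In symbols, the hypotheses described above can be written as follows. The notation is an assumption made for illustration and does not come from the text: μ_A and μ_B stand for the two population means, and μ_0 stands for the given value of a single population's parameter.

```latex
% Nondirectional alternative: the populations are not equal
H_0:\ \mu_A = \mu_B \qquad H_a:\ \mu_A \neq \mu_B

% Directional alternative: the mean of A is greater than the mean of B
H_0:\ \mu_A = \mu_B \qquad H_a:\ \mu_A > \mu_B

% Single population described by a parameter equal to a given value
H_0:\ \mu = \mu_0
```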

The most common hypothesis tests are z- or t-tests, which rely on the standard normal distribution or the closely related t-distribution. From the data collected, a test statistic is calculated and compared to a table to determine the probability of obtaining a statistic at least that extreme by random chance alone (under the assumption that our null hypothesis is true). This is our p-value. We then compare our p-value to a significance level (α); 0.05 is commonly used. For a directional test, if the p-value is greater than α, then we fail to reject the null hypothesis, which means that there is not a statistically significant difference between the two populations. If the p-value is less than α, then we reject the null hypothesis and state that there is a statistically significant difference between the two groups. If the alternative hypothesis is not directional, we compare the one-tailed p-value to α/2 instead, because the rejection region is split between the two tails of the distribution. Again, when the null hypothesis is rejected, we state that our results are statistically significant.
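The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: it assumes a one-sample z-test with a known population standard deviation, and the function names and sample numbers are hypothetical. For the nondirectional case it doubles the one-tail probability and compares the result with α, which is equivalent to comparing the one-tail probability with α/2.

```python
from statistics import NormalDist


def z_test_p_value(sample_mean, null_mean, sigma, n, alternative="two-sided"):
    """Return (z, p) for a one-sample z-test with known population SD.

    alternative: "greater", "less" (directional) or "two-sided" (nondirectional).
    """
    # Test statistic: distance of the sample mean from the null value,
    # measured in standard errors.
    standard_error = sigma / n ** 0.5
    z = (sample_mean - null_mean) / standard_error

    std_normal = NormalDist()  # standard normal: mean 0, SD 1
    if alternative == "greater":
        p = 1.0 - std_normal.cdf(z)
    elif alternative == "less":
        p = std_normal.cdf(z)
    elif alternative == "two-sided":
        # Both tails count as "at least this extreme".
        p = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p


def decide(p, alpha=0.05):
    """Reject H0 only when the p-value is below the significance level."""
    return "reject H0" if p < alpha else "fail to reject H0"


# Hypothetical data: sample mean 51.8, null mean 50, sigma 10, n = 100.
z, p_directional = z_test_p_value(51.8, 50, 10, 100, alternative="greater")
_, p_nondirectional = z_test_p_value(51.8, 50, 10, 100, alternative="two-sided")
print(round(z, 2), round(p_directional, 4), decide(p_directional))
print(round(p_nondirectional, 4), decide(p_nondirectional))
```

With these made-up numbers the directional test rejects the null hypothesis while the nondirectional test does not, which shows why the direction of the alternative hypothesis has to be fixed before the data are examined.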

The value of α is the level of risk that we are willing to accept for incorrectly rejecting the null hypothesis. This mistake is called a type I error. In other words, a type I error occurs when we report a difference between two populations when one does not actually exist, and α is the probability of making that error. A type II error occurs when we incorrectly fail to reject the null hypothesis. In other words, a type II error occurs when we report no difference between two populations when one actually exists. The probability of a type II error is sometimes symbolized by β. The probability of correctly rejecting a false null hypothesis (reporting a difference between two populations when one actually exists) is referred to as power, and is equal to 1 − β. Finally, the probability of correctly failing to reject a true null hypothesis (reporting no difference between two populations when one does not exist) is referred to as confidence, and is equal to 1 − α. These conditions are summarized in Table 12.1.


                                Truth About the Population
Conclusion Based on Sample      H0 true (no difference)      Ha true (difference exists)
Reject H0                       Type I error (α)             Power (1 − β)
Fail to reject H0               Confidence (1 − α)           Type II error (β)

Table 12.1. Results of Hypothesis Testing
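The four probabilities in Table 12.1 can be computed directly for a simple case. The sketch below is illustrative only and rests on assumptions not found in the text: a directional (upper-tail) one-sample z-test, a known population standard deviation, and a hypothetical true difference between the actual mean and the null value. The function and parameter names are invented for this example.

```python
from statistics import NormalDist


def error_rates(true_shift, sigma, n, alpha=0.05):
    """Return alpha, beta, power, and confidence for an upper-tail z-test.

    true_shift: assumed real difference (true mean minus null mean), > 0.
    sigma:      known population standard deviation.
    n:          sample size.
    """
    std_normal = NormalDist()
    # We reject H0 whenever the test statistic exceeds this critical value.
    z_critical = std_normal.inv_cdf(1.0 - alpha)
    # When Ha is true, the test statistic is centered here instead of at 0.
    shift_in_standard_errors = true_shift / (sigma / n ** 0.5)
    # Type II error: the statistic falls below the critical value anyway.
    beta = std_normal.cdf(z_critical - shift_in_standard_errors)
    return {
        "alpha": alpha,             # P(reject H0 | H0 true)
        "confidence": 1.0 - alpha,  # P(fail to reject H0 | H0 true)
        "beta": beta,               # P(fail to reject H0 | Ha true)
        "power": 1.0 - beta,        # P(reject H0 | Ha true)
    }


# Hypothetical scenario: true difference 2, sigma 10, two sample sizes.
for n in (100, 400):
    rates = error_rates(true_shift=2, sigma=10, n=n)
    print(n, {name: round(value, 3) for name, value in rates.items()})
```

In this sketch a larger sample leaves α unchanged, because α is chosen in advance, but it lowers β and therefore raises power.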



Confidence intervals are essentially the reverse of hypothesis testing. With a confidence interval, we determine a range of values from the sample mean and standard deviation. Rather than finding a p-value, we begin with a desired confidence level (95% is standard) and use a table to find its corresponding z- or t-score. When we multiply the z- or t-score by the standard deviation of the sample mean (the standard error, which equals the sample standard deviation divided by the square root of the sample size), and then add this number to and subtract it from the mean, we create a range of values. For example, consider a population for which we wish to know the mean age. We draw a sample from that population and find that the mean of the sample is 30, with a standard error of 3. If we wish to have 95% confidence, the corresponding z-score (which would be provided on Test Day) is 1.96. Thus, the range is 30 − (3)(1.96) to 30 + (3)(1.96) = 24.12 to 35.88. We can then report that we are 95% confident that the true mean age of the population from which this sample is drawn is between 24.12 and 35.88.
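The arithmetic in the age example can be checked with a short script. This is only a sketch: the function names are hypothetical, `spread` stands for whatever value the passage multiplies by the z-score (3 in the example), and the z-score lookup assumes a normal distribution and a two-sided interval.

```python
from statistics import NormalDist


def z_for_confidence(level):
    """z-score that leaves (1 - level) split evenly between the two tails."""
    return NormalDist().inv_cdf(0.5 + level / 2.0)


def confidence_interval(center, spread, z_score=1.96):
    """Return (low, high) = center -/+ z_score * spread."""
    margin = z_score * spread
    return center - margin, center + margin


# Values from the example: mean 30, spread 3, z = 1.96 for 95% confidence.
low, high = confidence_interval(30, 3)
print(round(low, 2), round(high, 2))       # 24.12 35.88
print(round(z_for_confidence(0.95), 2))    # 1.96
```

A higher confidence level gives a larger z-score and therefore a wider interval for the same mean and spread.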

MCAT Concept Check 12.5:

Before you move on, assess your understanding of the material with these questions.

1. How do hypothesis tests and confidence intervals differ?

   · Hypothesis tests:

   · Confidence intervals:

2. If the p-value is greater than α in a given statistical test, what is the outcome of the test?

3. How is the p-value calculated during a hypothesis test?

4. True or False: Power is the probability of correctly rejecting the null hypothesis.