Confidence Intervals and Introduction to Inference - Review the Knowledge You Need to Score High

5 Steps to a 5 AP Statistics 2017 (2016)

STEP 4

Review the Knowledge You Need to Score High

CHAPTER 11 Confidence Intervals and Introduction to Inference

IN THIS CHAPTER

Summary: In this chapter we begin our formal study of inference by introducing the t distribution (as an adjustment to z) and talk about estimating a population parameter. We will learn about confidence intervals, a way of identifying a range of values that we think might contain our parameter of interest. We will develop intervals to estimate a single population mean, a single population proportion, the difference between two population means, and the difference between two population proportions. We will also be introduced to the logic behind significance testing as well as to the types of errors you have to worry about when testing. There’s a lot in this chapter, and you need to internalize most of it.

Key Ideas

Estimation

Confidence Intervals

t Procedures

Choosing a Sample Size for a Confidence Interval

P -Value

Statistical Significance

Hypothesis-Testing Procedure

Errors in Hypothesis Testing

The Power of a Test

Estimation and Confidence Intervals

As we proceed in statistics, our interest turns to estimating unknown population values. We have previously described a statistic as a value that describes a sample and a parameter as a value that describes a population. Now we want to use a statistic as an estimate of a parameter. We know that if we draw multiple samples and compute some statistic of interest, say x̄, we will likely get a different value each time, even though the samples are all drawn from a population with a single mean, μ. What we now do is develop a process by which we use our estimate to generate a range of likely population values for the parameter. The statistic itself is called a point estimate, and the range of likely population values from which we might have obtained our estimate is called a confidence interval.

example: We do a survey of a sample of students from a school, and find that 42% of the sample plan to vote for Normajean for student body treasurer. That is, p̂ = 0.42. Based on this, we generate an interval of likely values (the confidence interval) for the proportion of all students at the school who will vote for Normajean and find that between 38% and 46% of the students are likely to vote for Normajean. The interval (0.38, 0.46) is a confidence interval for the proportion of all students at this school who will vote for Normajean.

Note that saying a confidence interval is likely to contain the true population value is not to say that it necessarily does. It may or may not—we will see ways to quantify just how “confident” we are in our interval.

In this chapter, we will construct confidence intervals for a single mean, the difference between two means, a single proportion, and the difference between two proportions. Our ability to construct confidence intervals depends on our understanding of the sampling distributions for each of the parameters. In Chapter 10 , we discussed the concept of sampling distribution for sample means and sample proportions. Similar arguments exist for the sampling distributions of the difference between two means or the difference between two proportions.

t Procedures

When we discussed the sampling distribution of x̄ in Chapter 10, we knew the population mean and standard deviation. In this chapter, we are estimating the population mean from a sample. If we knew the population standard deviation, we would probably also know the population mean and would not need to estimate μ. Because we do not know the population standard deviation σ, we use the sample standard deviation s as an estimate of σ.

When we estimate a standard deviation of a sampling distribution from data, we call the estimator the standard error (some texts define the standard error as the standard deviation of the sampling distribution rather than an estimate of the standard deviation based on our data).

In this case, then,

$s_{\bar{x}} = \dfrac{s}{\sqrt{n}}$

is used to estimate

$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$

We will use the term standard error from now on as we study inference whenever we are referring to the standard deviation of the sampling distribution.

When the population is approximately normal, the sampling distribution of x̄ is approximately normal with mean μ and standard deviation σ/√n. The sample mean is, on average, equal to the population mean, so using the sample mean to estimate the population mean is, on average, correct. But since we don’t know σ, we also have to use the sample standard deviation to estimate the population standard deviation. And that is not, on average, correct: s tends to underestimate σ. To compensate for this, we need to adjust z, and the adjusted value is called t. The t statistic is given by

$t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$

This statistic approximately follows a t distribution if the following are true.

The population from which the sample was drawn is approximately normal, or the sample is large enough (rule of thumb: n ≥ 30).
The sample is an SRS from the population.

There is a different t distribution for each n. The distribution is determined by the number of degrees of freedom, df = n – 1. We will use the symbol t(k) to identify the t distribution with k degrees of freedom. The t distribution is symmetric and mound-shaped, but it has heavier tails than a normal distribution.

As n increases, the necessary adjustment gets smaller, so the t distribution gets closer to the normal distribution. In fact, if you look at the bottom row of Table B, you’ll see that the values of t for infinitely many degrees of freedom are the same as those for the normal distribution. We can see this in the following graphic:

The table used for t values is set up differently than the table for z. In Table A in the Appendices, the marginal entries are z-scores, and the table entries are the corresponding areas under the normal curve to the left of z. In the t table, Table B, the left-hand column is degrees of freedom, the top margin gives upper tail probabilities, and the table entries are the corresponding critical values of t required to achieve the probability. In this book, we will use t* (or z*) to indicate critical values.

example: For 12 df, and an upper tail probability of 0.05, we see that the critical value of t is 1.782 (t * = 1.782). For an upper tail probability of 0.02, the corresponding critical value is 2.303 (t * = 2.303).

example: For 1000 df, the critical value of t for an upper tail probability of 0.025 is 1.962 (t * = 1.962). This is very close to the critical z -value for an upper tail probability of 0.025, which is 1.96 (z * = 1.96).

Calculator Tip: The TI-83 and early versions of the TI-84 calculator have no function for invT analogous to invNorm. The operating system of the TI-84 can be updated, but if you have a TI-83 you are pretty well restricted to using Table B to find a value of t*. However, if you have a newer OS for a TI-84, there is an invT function in the DISTR menu that makes it quite easy to find t*. It works similarly to invNorm, only you need to indicate the number of degrees of freedom. The syntax is invT(area to the left of t*, df). In the first example above, then, t* = invT(0.95, 12) = 1.782 and t* = invT(0.98, 12) = 2.303. In the second example, t* = invT(0.975, 1000) = 1.962.
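For readers without a calculator, the same critical values can be approximated in software. The following is only a sketch, using Python's standard library: the names t_pdf, t_cdf, and inv_t are hypothetical helpers written for this illustration (they are not calculator or library functions), and the t critical value is found by numerically integrating the t density and then bisecting, which is one of several possible approaches.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """Area to the left of x: Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def inv_t(area_left, df):
    """Rough analogue of invT; assumes 0.5 < area_left < 1."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < area_left:   # widen the bracket until it holds t*
        hi *= 2
    for _ in range(60):                # bisection on the bracket
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < area_left:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Analogue of invNorm: area to the left of z*.
z_star = NormalDist().inv_cdf(0.975)

print(round(inv_t(0.95, 12), 3))     # 1.782
print(round(inv_t(0.98, 12), 3))     # 2.303
print(round(inv_t(0.975, 1000), 3))  # 1.962
print(round(z_star, 3))              # 1.96
```

The printed values agree with the Table B entries and invT results quoted in the examples above.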

General Form of a Confidence Interval

A confidence interval is composed of two parts: a point estimate of a population value and a margin of error. We specify a level of confidence to communicate how certain we are that the interval contains the true population parameter.

A level C confidence interval has the following form: (estimate) ± (margin of error). In turn, the margin of error for a confidence interval is composed of two parts: the critical value of z or t (which depends on the confidence level C ) and the standard error. Hence, all confidence intervals take the form:

(estimate) ± (margin of error) = (estimate) ± (critical value)(standard error).

A t confidence interval for μ would take the form:

$\bar{x} \pm t^* \dfrac{s}{\sqrt{n}}$

t * is dependent on C , the confidence level; s is the sample standard deviation; and n is the sample size.

The confidence level is often expressed as a percent: a 95% confidence interval means that C = 0.95, or a 99% confidence interval means that C = 0.99. Although any value of C can be used as a confidence level, typical levels are 0.90, 0.95, and 0.99.

Important: When we say that “We are 95% confident that the true population value lies in an interval,” we mean that the process used to generate the interval will capture the true population value 95% of the time. We are not making any probability statement about the interval. Our “confidence” is in the process that generated the interval. We do not know whether the interval we have constructed contains the true population value or not: it either does or it doesn’t. All we know for sure is that, on average, 95% of the intervals so constructed will contain the true value.

Exam Tip: For the exam, be very, very clear on the discussion above. Many students seem to think that we can attach a probability to our interpretation of a confidence interval. We cannot because probability refers only to repeatable future random events. The interval has already been created, so talking about a probability doesn’t make sense.

example: Floyd told Betty that the probability was 0.95 that the 95% confidence interval he had constructed contained the mean of the population. Betty corrected him by saying that his interval either does contain the value (P = 1) or it doesn’t (P = 0). This interval could be one of the 95 out of every 100 on average that does contain the population mean, or it might be one of the 5 out of every 100 that does not. Remember that probability values apply to the expected relative frequency of future events, not events that have already occurred.
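The idea that our confidence is in the process, not in any single interval, can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is taken to be normal with a known σ (so a z interval can be used with only the standard library), and the function name coverage and all of its default values are hypothetical choices made for illustration.

```python
import random
from statistics import NormalDist

def coverage(mu=50.0, sigma=8.0, n=25, level=0.95, reps=2000, seed=1):
    """Fraction of simulated level-C z intervals that capture mu."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    margin = z_star * sigma / n ** 0.5      # same margin for every sample
    hits = 0
    for _ in range(reps):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - margin <= mu <= xbar + margin:
            hits += 1                       # this interval captured mu
    return hits / reps

rate = coverage()
print(f"{rate:.3f} of the simulated intervals captured mu")
```

Each individual interval either contains μ or it does not; it is the long-run fraction of intervals that should land close to 0.95.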

example: Find the critical value of t required to construct a 99% confidence interval for a population mean based on a sample size of 15.

solution: To use the t distribution table (Table B in the Appendixes), we need to know the upper-tail probability. Because C = 0.99, and confidence intervals are two sided, the upper-tail probability is

$\dfrac{1 - 0.99}{2} = 0.005$

Looking in the row for df = 15 – 1 = 14, and the column for an upper-tail probability of 0.005 (99% confidence), we find t* = 2.977. Note that the table is set up so that if you look at the bottom of the table and find 99%, you are in the correct column.

Using the newer version of the TI-84, the solution is given by invT(0.995,14) = 2.977.

example: Find the critical value of z required to construct a 95% confidence interval for a population proportion.

solution: We are reading from Table A, the table of Standard Normal Probabilities. Remember that table entries are areas to the left of a given z-score. With C = 0.95, we want (1 – 0.95)/2 = 0.025 in each tail, or 0.975 to the left of z*. Finding 0.975 in the table, we have z* = 1.96. On the TI-83/84, the solution is given by invNorm(0.975) = 1.960.

Confidence Intervals for Means and Proportions

In the previous section we discussed the concept of a confidence interval. In this section, we get more specific by actually constructing confidence intervals for each of the parameters under consideration. The chart below lists each parameter for which we will construct confidence intervals, the conditions under which we are justified in constructing the interval, and the formula for actually constructing the interval.

Special note concerning the degrees of freedom for the sampling distribution of the difference of two means : In most situations, a conservative , and usually acceptable, approach for determining the required number of degrees of freedom is to let the number of degrees of freedom equal n ₁ – 1 or n ₂ – 1, whichever is smaller. We will abbreviate this as df = min{n ₁ – 1, n ₂ – 1}. This is “conservative” in the sense that it will give a smaller number of degrees of freedom than other methods, which translates to a larger margin of error. It will be easiest to use a calculator to do the interval, and report the degrees of freedom given by the calculator.

When the confidence interval is constructed by calculator or computer (that’s the “computed by software” notation in the chart), the degrees of freedom will be computed using the following expression:

$df = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2}$

You don’t need to, and probably don’t want to, do this computation by hand, but you could! Note that this technique usually results in a noninteger number of degrees of freedom.

In practice, since most people will be constructing a two-sample confidence interval using a calculator, the second method above (referred to as the “computed by software” method in the chart) is acceptable. Just be sure to report the degrees of freedom as given by the calculator so that the method you say you are using matches your computation.
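To see how the two approaches to degrees of freedom compare, here is a short sketch. The function name welch_df is a hypothetical label for the software formula (commonly known as the Welch-Satterthwaite approximation), and the summary statistics below are made up purely for illustration; they do not come from any example in this chapter.

```python
def welch_df(s1, n1, s2, n2):
    """Degrees of freedom as typically computed by software for a
    two-sample t procedure (no assumption of equal variances)."""
    v1 = s1 ** 2 / n1
    v2 = s2 ** 2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Hypothetical summary statistics, chosen only for illustration.
s1, n1 = 4.0, 20
s2, n2 = 6.0, 18

df_software = welch_df(s1, n1, s2, n2)
df_conservative = min(n1 - 1, n2 - 1)

print(f"software df     = {df_software:.2f}")   # usually not an integer
print(f"conservative df = {df_conservative}")
```

The software value always falls between the conservative value min{n₁ – 1, n₂ – 1} and n₁ + n₂ – 2, which is why the conservative method gives a slightly wider interval.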

example: An airline is interested in determining the average number of unoccupied seats for all of its flights. It selects an SRS of 81 flights and determines that the average number of unoccupied seats for the sample is 12.5 seats with a sample standard deviation of 3.9 seats. Construct a 95% confidence interval for the true mean number of unoccupied seats for all flights.

solution: The problem states that the sample is an SRS. The large sample size justifies the construction of a one-sample confidence interval for the population mean. For a 95% confidence interval with df = 81 – 1 = 80, we have, from Table B, t* = 1.990. We have $12.5 \pm 1.990\left(\dfrac{3.9}{\sqrt{81}}\right) = 12.5 \pm 0.86 = (11.64, 13.36)$.

Note: If the problem had stated that n = 80 instead of 81, we would have had df = 80 – 1 = 79. There is no entry in Table B for 79 degrees of freedom. In this case we would have had to round down and use df = 60, resulting in t* = 2.000 and an interval of $12.5 \pm 2.000\left(\dfrac{3.9}{\sqrt{80}}\right) = (11.63, 13.37)$. The difference isn’t large, but the interval is slightly wider. (For the record, we note that the value of t* for df = 79 is given by the TI-84 as invT(0.975,79) = 1.99045.)

You can use the STAT TESTS TInterval function on the TI-83/84 calculator to find a confidence interval for a population mean (a confidence interval for a population mean is often called a “one-sample” t interval). You are required to identify your procedure, either by name or by formula, as well as report the calculator answer. And don’t forget to show that you have checked the conditions needed to construct the interval.

example: Interpret the confidence interval from the previous example in the context of the problem.

solution: We are 95% confident that the mean number of unoccupied seats for all the airline’s flights is between 11.6 and 13.4 seats.
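The airline computation can be checked with a few lines of code. This is a minimal sketch: t_interval is a hypothetical helper name, and the critical value t* = 1.990 is simply typed in from Table B rather than computed.

```python
import math

def t_interval(xbar, s, n, t_star):
    """One-sample t interval: estimate ± (critical value)(standard error)."""
    standard_error = s / math.sqrt(n)
    margin = t_star * standard_error
    return xbar - margin, xbar + margin

# Airline example: n = 81 flights, mean 12.5 seats, s = 3.9, t* from Table B.
low, high = t_interval(xbar=12.5, s=3.9, n=81, t_star=1.990)
print(f"({low:.2f}, {high:.2f})")   # (11.64, 13.36)
```

Rounded to one decimal place, this is the interval from 11.6 to 13.4 seats used in the interpretation above.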

For large sample confidence intervals utilizing z-procedures, it is probably worth memorizing the critical values of z for the most common C levels of 0.90, 0.95, and 0.99. They are:

C = 0.90: z* = 1.645
C = 0.95: z* = 1.96
C = 0.99: z* = 2.576

example: Brittany thinks she has a bad penny because, after 150 flips, she counted 88 heads. Find a 99% confidence interval for the true proportion of heads for all possible tosses of this coin. Do you think the coin is bad?

solution: First we need to check to see if using a z interval is justified.

The tosses of the coin can be treated as a random sample of coin tosses. With p̂ = 88/150 ≈ 0.587, both np̂ = 88 and n(1 – p̂) = 62 are greater than or equal to 10 (or 5). And we have true independence because we are not sampling without replacement from a finite population. We can therefore construct a 99% z interval for the population proportion:

$0.5867 \pm 2.576\sqrt{\dfrac{(0.5867)(0.4133)}{150}} = 0.5867 \pm 0.1036 = (0.483, 0.690)$

We are 99% confident that the true proportion of heads for this coin is between 0.483 and 0.690. If the coin were fair, we would expect, on average, 50% heads. Since 0.50 is in the interval, it is a plausible population value for this coin. We do not have convincing evidence that Brittany’s coin is bad.
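As a check on the coin example, the interval can be computed directly. This sketch uses only the standard library; one_prop_z_interval is a hypothetical helper name, and because it uses an unrounded z*, the last digit may differ slightly from a hand computation with table values.

```python
import math
from statistics import NormalDist

def one_prop_z_interval(successes, n, level):
    """One-proportion z interval at the given confidence level."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Brittany's coin: 88 heads in 150 flips, 99% confidence.
low, high = one_prop_z_interval(88, 150, 0.99)
print(f"({low:.3f}, {high:.3f})")            # (0.483, 0.690)
print("0.50 is plausible:", low <= 0.50 <= high)
```

Because 0.50 falls inside the interval, the code reaches the same conclusion as the text: there is no convincing evidence that the coin is unfair.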

Generally, you should use t procedures for one- or two-sample problems (those that involve means) unless you are given the population standard deviation(s), and you should use z-procedures for one- or two-proportion problems.

Calculator Tip: The STAT TESTS menu on your TI-83/84 contains all of the confidence intervals you will encounter in this course: ZInterval (rarely used unless you know σ); TInterval (for a population mean, “one-sample”); 2-SampZInt (rarely used unless you know both σ₁ and σ₂); 2-SampTInt (for the difference between two population means); 1-PropZInt (for a single population proportion); 2-PropZInt (for the difference between two population proportions); and LinRegTInt (see Chapter 13, newer TI-84s only). All except the last of these are covered in this chapter.

Exam Tip: There are three steps to a confidence interval: Check conditions and identify the procedure, compute the interval, interpret the interval in context. The question may not specifically ask for all three steps, but they are always required unless specifically stated otherwise.

example: The following data were collected as part of a study. Construct a 90% confidence interval for the true difference between the means (μ ₁ – μ ₂ ). Does it seem likely the difference in the sample means indicates that there is a difference between the population means? The samples were SRSs from independent, approximately normal populations.

solution: Since we do not know the standard deviation for either population, we need to use a two-sample t interval. The conditions necessary for using this interval are given in the problem: SRSs from independent, approximately normal populations.

The 90% confidence interval is

Using t * for 35.9994 df as reported by the calculator.

(Note: You can calculate the interval on the calculator, and then use invT to find t*. But that isn’t actually necessary. You do need to report the degrees of freedom! Writing all the stuff above shows clear communication, which can benefit you and can save you if you make a calculation error. But writing something incorrect can cost you. At a minimum you must identify the procedure (by name or by formula), report the degrees of freedom, and give the interval.)

We are 90% confident that the difference between the population means lies in the interval from 0.227 to 5.25. If the true difference between the means is zero, we would expect to find 0 in the interval. Because it isn’t, this interval provides evidence of a difference between the population means.

example: Construct a 95% confidence interval for p₁ – p₂ given that n₁ = 180, n₂ = 250, p̂₁ = 0.31, p̂₂ = 0.25. Assume that these are data from SRSs independently selected from two populations.

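solution: 180(0.31) = 55.8, 180(1 – 0.31) = 124.2, 250(0.25) = 62.5, and 250(0.75) = 187.5 are all greater than or equal to 5 so, with what is given in the problem, we have the conditions needed to construct a two-proportion z interval.

$(0.31 - 0.25) \pm 1.96\sqrt{\dfrac{(0.31)(0.69)}{180} + \dfrac{(0.25)(0.75)}{250}} = 0.06 \pm 0.086 = (-0.026, 0.146)$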

We are 95% confident that the proportion of successes in population 1 is between 2.6 percentage points lower and 14.6 percentage points higher than that of population 2. (But be sure to include context when describing the populations and what proportion you are estimating.)
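The two-proportion interval follows the same (estimate) ± (critical value)(standard error) pattern, which the following sketch makes explicit. The helper name two_prop_z_interval is hypothetical, and the code assumes the conditions have already been checked as in the solution above.

```python
import math
from statistics import NormalDist

def two_prop_z_interval(p1, n1, p2, n2, level):
    """z interval for p1 - p2 from two independent samples."""
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    standard_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    estimate = p1 - p2
    margin = z_star * standard_error
    return estimate - margin, estimate + margin

low, high = two_prop_z_interval(0.31, 180, 0.25, 250, 0.95)
print(f"({low:.3f}, {high:.3f})")   # (-0.026, 0.146)
```

Since 0 lies inside this interval, the data are consistent with no difference between the two population proportions.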

Sample Size

It is always desirable to select as large a sample as possible when doing research because sample means of larger samples are less variable than sample means of small samples. However, it is often expensive or difficult to draw larger samples, so we try to find the optimum sample size: large enough to accomplish our goals, small enough that we can afford it or manage it. We will look at techniques in this section for selecting sample sizes in the case of a large-sample confidence interval for a single population mean and for a single population proportion.

Sample Size for Estimating a Population Mean (Large Sample)

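The large sample confidence interval for a population mean is given by $\bar{x} \pm z^* \dfrac{\sigma}{\sqrt{n}}$. The margin of error is given by $z^* \dfrac{\sigma}{\sqrt{n}}$. Let M be the desired maximum margin of error. Then $z^* \dfrac{\sigma}{\sqrt{n}} \le M$. Solving for n, we have $n \ge \left(\dfrac{z^* \sigma}{M}\right)^2$. Using this “recipe,” we can calculate the minimum n needed for a fixed confidence level and a fixed maximum margin of error.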

One obvious problem with using this expression as a way to figure n is that we will not know σ , so we need to estimate it in some way. In an exam question, you will almost certainly be provided with an estimate of σ to use in the calculation.

In reality, a researcher would probably be able to utilize some historical knowledge about the standard deviation for the type of data they are examining, as shown in the following example:

example: A machine for inflating tires, when properly calibrated, inflates tires to 32 lbs, but it is known that the machine varies with a standard deviation of about 0.8 lbs. How large a sample is needed in order to be 99% confident that the mean inflation pressure is within a margin of error of M = 0.10 lbs?

solution: $n \ge \left(\dfrac{z^* \sigma}{M}\right)^2 = \left(\dfrac{2.576(0.8)}{0.10}\right)^2 = 424.69$. Since n must be an integer and n ≥ 424.69, choose n = 425. You would need a sample of at least 425 tires.
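The sample-size recipe for a mean is easy to automate. In this sketch, sample_size_for_mean is a hypothetical helper name; it rounds up with math.ceil because n must be an integer at least as large as the computed bound.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, level):
    """Smallest n with z* · sigma / sqrt(n) <= margin."""
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    return math.ceil((z_star * sigma / margin) ** 2)

# Tire example: sigma about 0.8 lbs, M = 0.10 lbs, 99% confidence.
print(sample_size_for_mean(sigma=0.8, margin=0.10, level=0.99))  # 425
```

Note that halving the margin of error roughly quadruples the required sample size, because M appears squared in the denominator.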

In this course you will not need to find a sample size for constructing a confidence interval involving t. This is because you need to know the sample size before you can determine t*, since there is a different t distribution for each different number of degrees of freedom. For example, for a 95% confidence interval for the mean of a normal distribution, you know that z* = 1.96 no matter what sample size you are dealing with, but you can’t determine t* without already knowing n.

Sample Size for Estimating a Population Proportion

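The confidence interval for a population proportion is given by:

$\hat{p} \pm z^* \sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}}$

The margin of error is

$z^* \sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}}$

Let M be the desired maximum margin of error. Then,

$z^* \sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}} \le M$

Solving for n,

$n \ge \left(\dfrac{z^*}{M}\right)^2 \hat{p}(1 - \hat{p})$

But we do not have a value of p̂ until we collect data, so we need a way to estimate p̂. Let P* = estimated value of p̂. Then

$n \ge \left(\dfrac{z^*}{M}\right)^2 P^*(1 - P^*)$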

There are two ways to choose a value of P *:

Use a previously determined value of p̂. That is, you may already have an idea, based on historical data, about what the value should be close to.
Use P* = 0.5. A result from calculus tells us that the expression

$\left(\dfrac{z^*}{M}\right)^2 P^*(1 - P^*)$

achieves its maximum value when P* = 0.5. Thus, n will be at its maximum if P* = 0.5. If P* = 0.5, the formula for n can more easily be expressed as

$n \ge \left(\dfrac{z^*}{2M}\right)^2$

It is in your interest to choose the smallest value of n that will match your goals, so any value of P* other than 0.5 would be preferable if you have some justification for it.

example: Historically, about 60% of a company’s products are purchased by people who have purchased products from the company previously. The company is preparing to introduce a new product and wants to generate a 95% confidence interval for the proportion of its current customers who will purchase the new product. They want to be accurate within 3%. How many customers do they need to sample?

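solution: Based on historical data, choose P* = 0.6. Then

$n \ge \left(\dfrac{1.96}{0.03}\right)^2 (0.6)(0.4) = 1024.4$

The company needs to sample 1025 customers. Had it not had the historical data, it would have had to use P* = 0.5.

If P* = 0.5, $n \ge \left(\dfrac{1.96}{2(0.03)}\right)^2 = 1067.1$. You need a sample of at least 1068 customers.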

By using P * = 0.6, the company was able to sample 43 fewer customers.
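Both versions of the proportion calculation can be reproduced with one small function. As before, this is only a sketch: sample_size_for_proportion is a hypothetical helper name, and the default p_star = 0.5 reflects the worst-case choice described above.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(margin, level, p_star=0.5):
    """Smallest n with z* · sqrt(P*(1 - P*) / n) <= margin."""
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    return math.ceil((z_star / margin) ** 2 * p_star * (1 - p_star))

n_historical = sample_size_for_proportion(0.03, 0.95, p_star=0.6)
n_worst_case = sample_size_for_proportion(0.03, 0.95)   # P* = 0.5

print(n_historical)                  # 1025
print(n_worst_case)                  # 1068
print(n_worst_case - n_historical)   # 43
```

The function is symmetric in P* and 1 – P*, so an estimate of 0.4 would give the same sample size as 0.6.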

Statistical Significance and P -Value

Statistical Significance

In the first two sections of this chapter, we used confidence intervals to make estimates about population values. In one of the examples, we went further and stated that because 0 was not in the confidence interval, 0 was not a likely population value from which to have drawn the sample that generated the interval. Now, of course, 0 could be the true population value and our interval just happened to miss it. As we progress through techniques of inference (making predictions about a population from data), we often are interested in sample values that do not seem likely under a particular assumption.

We begin by making an assumption about the population. This assumption is called the null hypothesis. A finding or an observation is said to be statistically significant if it is unlikely to have occurred by chance if that null hypothesis is true. That is, if a sample is not one we would expect, it could be because of sampling variability (in repeated sampling from the same population, we will get different sample results even though the population value is fixed), or it could be because the sample came from a different population than we thought. If the result is so far from what we expected that we think something other than chance is operating, then the result is statistically significant.

example: Todd claims that he can throw a football 50 yards. If he throws the ball 50 times and averages 49.5 yards, we have no reason to doubt his claim. If he only averages 30 yards, the finding is statistically significant in that he is unlikely to have a sample average this low if his claim were true.

In the above example, most people would agree that 49.5 was consistent with Todd’s claim (that is, it was a likely average if the true value is 50) and that 30 is inconsistent with the claim (it is statistically significant). It’s a bit more complicated to decide where between 30 and 49.5 the cutoff is between “reasonably likely” and “unlikely.”

There are some general agreements about how unlikely a finding needs to be, assuming the null hypothesis is true, in order to be significant. Typical significance levels, symbolized by the Greek letter α, are probabilities of 0.1, 0.05, and 0.01. If a finding has a lower probability of occurring than the significance level, then the finding is statistically significant.

example: The school statistics teacher determined that the probability that Todd would only average 30 yards per throw if he really could throw 50 yards is 0.002. This value is so low that it seems unlikely to have occurred by chance, and so we say that the finding is significant. It is lower than any of the commonly accepted significance levels.

P -Value

We said that a finding is statistically significant, or significant, if it is unlikely to have occurred by chance. The P-value is what tells us just how unlikely a finding actually is under the model based on the null hypothesis. The P-value is the probability, based on our model, of getting by chance alone a sample statistic as extreme as, or more extreme than, the one we obtained. This requires that we have some expectation about what we ought to get. In other words, the P-value is the probability of getting a statistic at least as far removed from expected as we got. A decision about significance can then be made by comparing the obtained P-value with a stated value of α.

example: Suppose it turns out that Todd’s 50 throws are approximately normally distributed with mean 47.5 yards and standard deviation 8 yards. His claim is that he can average 50 yards per throw. What is the probability of getting an observed mean this far below the expected 50 yards by chance alone (that is, what is the P-value) if his true average is 50 yards (assume the population standard deviation is 8 yards)? Is this finding significant at α = 0.05? At α = 0.01?

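solution: We are assuming the population is normally distributed with mean 50 and standard deviation 8, so the sampling distribution of x̄ for samples of size 50 is normal with mean 50 and standard deviation 8/√50 ≈ 1.13. The situation is pictured below:

$P(\bar{x} < 47.5) = P\left(z < \dfrac{47.5 - 50}{8/\sqrt{50}}\right) = P(z < -2.21) = 0.014$

This is the P-value: it’s the probability of getting a sample mean as far below 50 as we did by chance alone, assuming the model based on a mean of 50 yards is correct. This finding is significant at the 0.05 level but not (quite) at the 0.01 level.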
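The P-value for Todd's throws can be verified with the standard normal distribution. In this sketch, lower_tail_p_value is a hypothetical helper name, and the population standard deviation of 8 yards is treated as known, as the example assumes.

```python
import math
from statistics import NormalDist

def lower_tail_p_value(xbar, mu0, sigma, n):
    """z statistic and P(sample mean <= xbar) if the true mean is mu0."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    return z, NormalDist().cdf(z)

z, p_value = lower_tail_p_value(xbar=47.5, mu0=50, sigma=8, n=50)
print(f"z = {z:.2f}, P-value = {p_value:.3f}")   # z = -2.21, P-value = 0.014

for alpha in (0.05, 0.01):
    verdict = "significant" if p_value < alpha else "not significant"
    print(f"alpha = {alpha}: {verdict}")
```

The comparison loop mirrors the conclusion in the text: the result is significant at the 0.05 level but not at the 0.01 level.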

The Hypothesis-Testing Procedure

So far we have used confidence intervals to estimate the value of a population parameter (μ, p, μ₁ – μ₂, p₁ – p₂). In the coming chapters, we test whether the parameter has a particular value or not. More accurately, we might ask if we have convincing evidence against the hypothesis that p₁ – p₂ = 0 or that μ = 3, for example. That is, we will test the hypothesis that, say, p₁ – p₂ = 0. In the hypothesis-testing procedure, a researcher does not look for evidence to support this hypothesis, but instead looks for evidence against the hypothesis. The process looks like this.

State the null and alternative hypotheses in the context of the problem. The first hypothesis, the null hypothesis , is the hypothesis we are actually testing. The null hypothesis usually states that there is nothing going on: the claim is correct or that there is no distinction between groups. It is symbolized by H ₀ . An example of a typical null hypothesis would be H ₀ : μ ₁ – μ ₂ = 0 or H ₀ : μ ₁ = μ ₂ . This is the hypothesis that μ ₁ and μ ₂ are the same, or that populations 1 and 2 have the same mean. Note that μ ₁ and μ ₂ must be identified in context (for example, μ ₁ = the mean score for all people in the population before training).

The second hypothesis, the alternative hypothesis, is the theory that the researcher wants to confirm by rejecting the null hypothesis. The alternative hypothesis is symbolized by H _A or H _a. There are three possible forms for the alternative hypothesis: ≠, >, or <. If the null is H ₀ : μ ₁ – μ ₂ = 0, then H _A could be:

H _A: μ ₁ – μ ₂ ≠ 0 (this is called a two-sided alternative )

H _A: μ ₁ – μ ₂ > 0 (this is a one-sided alternative )

H _A: μ ₁ – μ ₂ < 0 (also a one-sided alternative ).

(In the case of the one-sided alternative H _A: μ ₁ – μ ₂ > 0, the null hypothesis is sometimes written: H ₀ : μ ₁ – μ ₂ ≤ 0. This actually makes pretty good sense: if the researcher is wrong in a belief that the difference is greater than 0, then any finding less than or equal to 0 fails to provide evidence in favor of the alternative.)

Identify which procedure you intend to use and show that the conditions for its use are present. We identified the conditions for constructing a confidence interval in the first two sections of this chapter. We will identify the conditions needed to do hypothesis testing in the following chapters. For the most part, they are similar to those you have already studied.

If you are going to state a significance level α, it can be done here.

Compute the value of the test statistic and the P-value.
Give a conclusion, linked to your computations, in the context of the problem.

Exam Tip: The four steps above have been incorporated into AP exam scoring for any question involving a hypothesis test. Note that the third item (compute the value of the test statistic and the P -value), the mechanics in the problem, is only one part of a complete solution. All four steps must be present in order to receive a 4 (“complete response”) on the problem.

If you stated a significance level in the second step of the process, the conclusion can be based on a comparison of the P-value with α. If you didn’t state a significance level, you can argue your conclusion based on the value of the P-value alone: if it is small, you have evidence against the null; if it is not small, you do not have evidence against the null.

Many statisticians will argue that you are better off arguing directly from the P-value and not using a significance level. One reason for this is the arbitrariness of the significance level. That is, if α = 0.05, you would reject the null hypothesis for a P-value of 0.04999 but not for a P-value of 0.05001 when, in reality, there is no practical difference between them.

The conclusion can be (1) that we reject H ₀ (because of a sufficiently small P-value) or (2) that we do not reject H ₀ (because the P-value is too large). We do NOT accept the null: we either reject it or fail to reject it. If we reject H ₀ , we can say in context that we have convincing evidence in favor of H _A.

example: Consider, one last time, Todd and his claim that he can throw a ball 50 yards. His average toss, based on 50 throws, was 47.5 yards, and we assumed the population standard deviation was the same as the sample standard deviation, 8 yards. A test of the hypothesis that Todd can throw the ball 50 yards on average against the alternative that he can't throw that far might look something like the following (we will fill in many of the details, especially those in the third part of the process, in the following chapters):

Let μ be the true average distance Todd can throw a football. H ₀ : μ = 50 (or H ₀ : μ ≥ 50, since the alternative is one sided) and H _A: μ < 50
Since we know σ , we will use a z -test. We assume the 50 throws are an SRS of all his throws, and the central limit theorem tells us that the sampling distribution of x̄ is approximately normal. We will use a significance level of α = 0.05.
In the previous section, we determined that the P -value for this situation (the probability of getting an average as far away from our expected value as we got) is 0.014.
Since the P -value < α (0.014 < 0.05), we can reject H ₀ . We have good evidence that the true mean distance Todd can throw a football is actually less than 50 yards (note that we aren't claiming anything about how far Todd can actually throw the ball on average, just that it's likely to be less than 50 yards).
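As a rough check on the third and fourth steps, the z statistic and one-sided P -value for Todd's throws can be computed with Python's standard library. This is only a sketch: the function name one_sided_z_test is made up for illustration, and it assumes, as the example does, that σ = 8 is known.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_z_test(xbar, mu0, sigma, n):
    """Return (z, P-value) for H0: mu = mu0 against HA: mu < mu0 with sigma known."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    p_value = NormalDist().cdf(z)  # area to the left of z under the standard normal curve
    return z, p_value

# Values from the Todd example: 50 throws, mean 47.5 yards, sigma = 8 yards
z, p_value = one_sided_z_test(xbar=47.5, mu0=50, sigma=8, n=50)
alpha = 0.05
print(f"z = {z:.2f}, P-value = {p_value:.3f}, reject H0: {p_value < alpha}")
```

The computed P -value rounds to the 0.014 used above, so the decision to reject H ₀ at α = 0.05 is the same.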

Type I and Type II Errors and the Power of a Test

When we do a hypothesis test as described in the previous section, we never really know if we have made the correct decision or not. We can try to minimize our chances of being wrong, but there are trade-offs involved. If we are given a hypothesis, it may be true or it may be false. We can decide to reject the hypothesis or not to reject it. This leads to four possible outcomes:

Two of the cells in the table are errors and two are not. Filling those in, we have

Note that, in statistics, the term “error” does not mean somebody did something wrong. It refers to variability. An “error” occurs because the sampling variability caused a sample statistic that led us to the wrong decision. The errors have names that are rather unspectacular: If the (null) hypothesis is true, and we mistakenly reject it, it is a Type I error . If the hypothesis is false, and we mistakenly fail to reject it, it is a Type II error . We note that the probability of a Type I error is equal to α, the significance level. (This is because a P -value < α causes us to reject H ₀ . If H ₀ is true, and we decide to reject it because we got an unusual sample, we have made a Type I error). We call the probability of a Type II error β . Filling in the table with this information, we have:

The cell in the lower right-hand corner is important. The probability of correctly rejecting a false hypothesis (in favor of the alternative) is called the power of the test . The power of the test equals 1 – β . Finally, then, our decision table looks like this:

                   Hypothesis is true               Hypothesis is false
Do not reject      Correct decision                 Type II error (probability β)
Reject             Type I error (probability α)     Correct decision (probability 1 – β, the power)

Exam Tip: You will not need to know how to actually calculate P(Type II error) or the power of the test on the AP exam. You will need to understand the concept of each, what affects each, and how they are related.

example: Sticky Fingers is arrested for shoplifting. The judge, in her instructions to the jury, says that Sticky is innocent until proven guilty. That is, the jury's hypothesis is that Sticky is innocent. Identify Type I and Type II errors in this situation and explain the consequence of each.

solution: Our hypothesis is that Sticky is innocent. A Type I error involves mistakenly rejecting a true hypothesis. In this case, Sticky is innocent, but because we reject innocence, he is found guilty. The risk in a Type I error is that Sticky goes to jail for a crime he didn't commit.

A Type II error involves failing to reject a false hypothesis. If the hypothesis is false, then Sticky is guilty, but because we think he's innocent, we acquit him. The risk in a Type II error is that Sticky goes free even though he is guilty.

In life we often have to choose between possible errors. In the example above, the choice was between sending an innocent person to jail (a Type I error) or setting a criminal free (a Type II error). Which of these is the more serious error is not a statistical question; it's a social one.

We can decrease the chance of Type I error by adjusting α. By making α very small, we could virtually ensure that we would never mistakenly reject a true hypothesis. However, this would result in a large probability of a Type II error because we are making it hard to reject the null under any circumstance, even when it is false.

The probability of making a Type II error is smaller and, hence, the power of the test greater if:

The sample size is increased.
The standard deviation of our sample data is decreased (this is not always under the control of the researcher but, for example, if more precise measurements are possible, the variability in the data could be reduced).
We increase the significance level (α). (This could be seen as dishonest – manipulating the significance level to get the result you want.)
The effect size (the difference between the hypothesized parameter and the true value) is larger. A bigger difference is easier to detect!
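
The four factors above can be illustrated numerically. The sketch below computes the power of a one-sided z -test (H _A: μ < μ ₀) with σ treated as known. It borrows μ ₀ = 50, σ = 8, n = 50, and α = 0.05 from the Todd example, but the "true" mean of 48 and every alternative scenario are made-up values chosen only for illustration, and the function name is hypothetical. You will not be asked to do this calculation on the AP exam.

```python
from math import sqrt
from statistics import NormalDist

def power_lower_tail_z(mu0, mu_true, sigma, n, alpha):
    """Power of a z-test of H0: mu = mu0 against HA: mu < mu0 when the true mean is mu_true."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)
    cutoff = mu0 + std_normal.inv_cdf(alpha) * se  # reject H0 when the sample mean falls below this
    return std_normal.cdf((cutoff - mu_true) / se)  # chance the sample mean lands below the cutoff

# Baseline: hypothesized mean 50, hypothetical true mean 48
base = power_lower_tail_z(mu0=50, mu_true=48, sigma=8, n=50, alpha=0.05)

# Change one factor at a time (all values are illustrative assumptions)
scenarios = {
    "baseline": base,
    "larger sample (n = 100)": power_lower_tail_z(50, 48, 8, 100, 0.05),
    "less variability (sigma = 6)": power_lower_tail_z(50, 48, 6, 50, 0.05),
    "larger alpha (0.10)": power_lower_tail_z(50, 48, 8, 50, 0.10),
    "larger effect (true mean 47)": power_lower_tail_z(50, 47, 8, 50, 0.05),
}
for label, power in scenarios.items():
    print(f"{label}: power = {power:.3f}, beta = {1 - power:.3f}")
```

Each change raises the power above the baseline, and β = 1 – power falls by the same amount.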

example: A package delivery company claims that it is on time 90% of the time. Some of its clients aren't so sure, thinking that there are often delays in delivery beyond the time promised. The company states that it will change its delivery procedures if it is wrong in its claim. Suppose that, in fact, there are more delays than claimed by the company. Which of the following is equivalent to the power of the test?

(a) The probability that the company will not change its delivery procedures

(b) The P -value > α

(d) The probability that the company will change its delivery procedures

(e) The probability that the company will fail to reject H ₀

solution: The power of the test is the probability of rejecting a false null hypothesis in favor of an alternative—in this case, the hypothesis that the company is on time 90% of the time is false. If we correctly reject this hypothesis, the company will change its delivery procedures. Hence, (d) is the correct answer.

Rapid Review

True–False. A 95% confidence interval for a population proportion is given as (0.37, 0.52). This means that the probability is 0.95 that this interval contains the true proportion.

Answer: False. Because this interval has been created, there is no repeatable random event. That's why we say “We are 95% confident.” It avoids improper use of the word probability. (The probability is 0.95 that the process used to generate this interval will capture the true proportion.)

The hypothesis that the Giants would win the World Series in 2002 was held by many of their fans. What type of error has been made by a very serious fan who refuses to accept the fact that the Giants actually lost the series to the Angels?

Answer: The hypothesis is false but the fan has failed to reject it. That is a Type II error.

What is the critical value of t for a 99% confidence interval based on a sample size of 26?

Answer: From the table of t distribution critical values, t * = 2.787 with 25 df. Using a TI-84 with the invT function, the answer is given by invT(0.995,25). The 99% interval leaves 0.5% = 0.005 in each tail so that the area to the left of t * is 0.99 + 0.005 = 0.995.

What is the critical value of z for a 98% confidence interval for a population whose standard deviation we know?

Answer: This time we have to use the table of standard normal probabilities, Table A. If C = 0.98, 0.98 of the area lies between z * and –z *. So, because of the symmetry of the distribution, 0.01 lies above z *, which is the same as saying that 0.99 lies to the left of z *. The nearest table entry to 0.99 is 0.9901, which corresponds to z * = 2.33. Using the invNorm function on the TI-83/84, the answer is given by invNorm(0.99).
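
If you have Python rather than a TI calculator at hand, the standard library's NormalDist.inv_cdf plays the role of invNorm. The helper below is a small sketch (the name z_critical is made up) that turns a confidence level into the area to the left of z * and looks up the critical value.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Upper critical value z* that leaves (1 - confidence)/2 in each tail."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)  # same idea as invNorm(1 - tail) on the TI-83/84

z98 = z_critical(0.98)
z95 = z_critical(0.95)
print(f"98%: z* = {z98:.2f}, 95%: z* = {z95:.2f}")
```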

A hypothesis test is conducted with α = 0.01. The P -value is determined to be 0.037. Because the P -value > α, are we justified in rejecting the null hypothesis?

Answer: No. We could only reject the null if the P -value were less than the significance level. It is small probabilities that provide evidence against the null.

Mary comes running into your office and excitedly tells you that she got a statistically significant finding from the data on her most recent research project. What is she talking about?

Answer: Mary means that the finding she got had such a small probability of occurring by chance that she has concluded it probably wasn't just chance variation but a real difference from expected.

You want to create a 95% confidence interval for a population proportion with a margin of error of no more than 0.05. How large a sample do you need?

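Answer: Because there is no indication in the problem that we know about what to expect for the population proportion, we will use P * = 0.5. Then,

$$n \ge \left(\frac{z^*}{M}\right)^2 P^*(1 - P^*) = \left(\frac{1.96}{0.05}\right)^2 (0.5)(0.5) = 384.16.$$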

You would need a minimum of 385 subjects for your sample. (Always round up for the minimum sample size required.)
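
The sample-size calculation is easy to script. The sketch below assumes the usual formula n ≥ (z */M)² P *(1 – P *) and always rounds up; the function name and parameter names are chosen here for illustration only.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence, p_guess=0.5):
    """Smallest n giving a margin of error of at most `margin` for a one-proportion z interval."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z_star / margin) ** 2 * p_guess * (1 - p_guess))  # always round up

n_no_estimate = sample_size_for_proportion(0.05, 0.95)         # no prior estimate, so P* = 0.5
n_with_estimate = sample_size_for_proportion(0.05, 0.95, 0.7)  # prior estimate of 0.7
print(n_no_estimate, n_with_estimate)
```

Using P * = 0.5 gives the largest (most conservative) sample size; a prior estimate farther from 0.5 lowers the requirement.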

Which of the following statements is correct?
I. The t distribution has more area in its tails than the z distribution (normal).
II. When constructing a confidence interval for a population mean, you would always use z * rather than t * if you have a sample size of at least 30 (n ≥ 30).

III. When constructing a two-sample t interval, the “conservative” method of choosing degrees of freedom (df = min {n ₁ – 1, n ₂ – 1}) will result in a wider confidence interval than other methods.

Answer:

I is correct. A t distribution, because it must estimate the population standard deviation, has more variability than the normal distribution.
II is not correct. It is definitely not correct that you would always use z * rather than t * in this situation. A more interesting question is could you use z * rather than t *? The answer to that question is a qualified “yes.” The difference between z * and t * is small for large sample sizes (e.g., for a 95% confidence interval based on a sample size of 50, z * = 1.96 and t * = 2.01) and, while a t interval would have a somewhat larger margin of error, the intervals constructed would capture roughly the same range of values. In fact, many traditional statistics books teach this as the proper method. Now, having said that, the best advice is to always use t when dealing with a one-sample situation when σ is unknown (confidence interval or hypothesis test) and use z only when you know, or have a very good estimate of, the population standard deviation.
III is correct. The conservative method (df = min{n₁ – 1, n ₂ – 1}) will give a larger value of t *, which, in turn, will create a larger margin of error, which will result in a wider confidence interval than other methods for a given confidence level.

Practice Problems

Multiple-Choice

You are going to create a 95% confidence interval for a population proportion and want the margin of error to be no more than 0.05. Historical data indicate that the population proportion has remained constant at about 0.7. What is the minimum size random sample you need to construct this interval?
(a) 385
(b) 322
(c) 274
(d) 275
(e) 323
Which of the following will increase the power of a test?
(a) Increase n.
(b) Increase α.
(c) Reduce the amount of variability in the sample.
(d) Consider an alternative hypothesis further from the null.
(e) All of these will increase the power of the test.
Under a null hypothesis, a sample value yields a P -value of 0.015. Which of the following statements is (are) true?
I. This finding is statistically significant at the 0.05 level of significance.
II. This finding is statistically significant at the 0.01 level of significance.

III. The probability of getting a sample value as extreme as this one by chance alone if the null hypothesis is true is 0.015.

(a) I and III only
(b) I only
(c) III only
(d) II and III only
(e) I, II, and III
You are going to construct a 90% t confidence interval for a population mean based on a sample size of 16. What is the critical value of t (t *) you will use in constructing this interval?
(a) 1.341
(b) 1.753
(c) 1.746
(d) 2.131
(e) 1.337
A 95% confidence interval for the difference between two population proportions is found to be (0.07, 0.19). Which of the following statements is (are) true?
I. It is unlikely that the two populations have the same proportions.
II. We are 95% confident that the true difference between the population proportions is between 0.07 and 0.19.

III. The probability is 0.95 that the true difference between the population proportions is between 0.07 and 0.19.

(a) I only
(b) II only
(c) I and II only
(d) I and III only
(e) II and III only
A 99% confidence interval for the true mean weight loss (in pounds) for people on the SkinnyQuick diet plan is found to be (1.3, 5.2). Which of the following is (are) correct?
I. The probability is 0.99 that the mean weight loss is between 1.3 lbs and 5.2 lbs.
II. The probability is 0.99 that intervals constructed by this process will capture the true population mean.

III. We are 99% confident that the true mean weight loss for this program is between 1.3 lbs and 5.2 lbs.

IV. This interval provides evidence that the SkinnyQuick plan is effective in reducing the mean weight of people on the plan.
(a) I and II only
(b) II only
(c) II and III only
(d) II, III, and IV only
(e) All of these statements are correct.
In a test of the null hypothesis H ₀ : p = 0.35 with α = 0.01, against the alternative hypothesis H _A: p < 0.35, a large random sample produced a z -score of –2.05. Based on this, which of the following conclusions can be drawn?
(a) It is likely that p < 0.35.
(b) p < 0.35 only 2% of the time.
(c) If the z -score were positive instead of negative, we would be able to reject the null hypothesis.
(d) We do not have sufficient evidence to claim that p < 0.35.
(e) 1% of the time we will reject the alternative hypothesis in error.
A 99% confidence interval for the weights of a random sample of high school wrestlers is reported as (125, 160). Which of the following statements about this interval is true?
(a) At least 99% of the weights of high school wrestlers are in the interval (125, 160).
(b) The probability is 0.99 that the true mean weight of high school wrestlers is in the interval (125, 160).
(c) 99% of all samples of this size will yield a confidence interval of (125, 160).
(d) The procedure used to generate this confidence interval will capture the true mean weight of high school wrestlers 99% of the time.
(e) The probability is 0.99 that a randomly selected wrestler will weigh between 125 and 160 lbs.
This year's statistics class was small (only 15 students). This group averaged 74.5 on the final exam with a sample standard deviation of 3.2. Assuming that this group is a random sample of all students who have taken statistics and the scores in the final exam for all students are approximately normally distributed, which of the following is an approximate 96% confidence interval for the true population mean of all statistics students?
(a) 74.5 ± 7.245
(b) 74.5 ± 7.197
(c) 74.5 ± 1.871
(d) 74.5 ± 1.858
(e) 74.5 ± 1.772
A paint manufacturer advertises that one gallon of its paint will cover 400 sq ft of interior wall. Some local painters suspect the average coverage is considerably less and decide to conduct an experiment to find out. If μ represents the true average number of square feet covered by the paint, which of the following are the correct null and alternative hypotheses to be tested?
(a) H ₀ : μ = 400, H _A: μ > 400
(b) H ₀ : μ ≥ 400, H _A: μ ≠ 400
(c) H ₀ : μ = 400, H _A: μ ≠ 400
(d) H ₀ : μ ≠ 400, H _A: μ < 400
(e) H ₀ : μ ≥ 400, H _A: μ < 400

Free-Response

You attend a large university with approximately 15,000 students. You want to construct a 90% confidence interval estimate, within 5%, for the proportion of students who favor outlawing country music. How large a sample do you need?
The local farmers association in Cass County wants to estimate the mean number of bushels of corn produced per acre in the county. A random sample of 13 1-acre plots produced the following results (in number of bushels per acre): 98, 103, 95, 99, 92, 106, 101, 91, 99, 101, 97, 95, 98. Construct a 95% confidence interval for the mean number of bushels per acre in the entire county. The local association has been advertising that the mean yield per acre is 100 bushels. Do you think it is justified in this claim?
Two groups of 40 randomly selected students were selected to be part of a study on dropout rates. Members of one group were enrolled in a counseling program designed to give them skills needed to succeed in school, and the other group received no special counseling. Fifteen of the students who received counseling dropped out of school, and 23 of the students who did not receive counseling dropped out. Construct a 90% confidence interval for the true difference between the dropout rates of the two groups. Interpret your answer in the context of the problem.
A hotel chain claims that the average stay for its business clients is 5 days. One hotel believes that the true stay may actually be fewer than 5 days. A study conducted by the hotel of 100 randomly selected clients yields a mean of 4.55 days with a standard deviation of 3.1 days. What is the probability of getting a finding as extreme, or more extreme than 4.55, if the true mean is really 5 days? That is, what is the P -value of this finding?
One researcher wants to construct a 99% confidence interval as part of a study. A colleague says such a high level isn't necessary and that a 95% confidence level will suffice. In what ways will these intervals differ?
A 95% confidence interval for the true difference between the mean ages of male and female statistics teachers is constructed based on a sample of 95 males and 62 females. Consider each of the following intervals that might have been constructed:
I. (−4.5, 3.2)
II. (2.1, 3.9)

III. (−5.2, −1.7)

For each of these intervals,

(a) Interpret the interval, and

(b) Describe the conclusion about the difference between the mean ages that might be drawn from the interval.

A 99% confidence interval for a population mean is to be constructed. A sample of size 20 will be used for the study. Assuming that the population from which the sample is drawn is approximately normal, what is the upper critical value needed to construct the interval?
A university is worried that it might not have sufficient housing for its students for the next academic year. It's very expensive to build additional housing, so it is operating under the assumption (hypothesis) that the housing it has is sufficient, and it will spend the money to build additional housing only if it is convinced it is necessary (that is, it rejects its hypothesis).

(a) For the university's assumption, what is the risk involved in making a Type I error?

(b) For the university's assumption, what is the risk involved in making a Type II error?

A flu vaccine is being tested for effectiveness. To test this, 350 randomly selected people are given the vaccine and observed to see if they develop the flu during the flu season. At the end of the season, 55 of the 350 did get the flu. Construct and interpret a 95% confidence interval for the true proportion of people who will get the flu despite getting the vaccine.
A research study gives a 95% confidence interval for the proportion of subjects helped by a new anti-inflammatory drug as (0.56, 0.65).

(a) Interpret this interval in the context of the problem.

(b) What is the meaning of “95%” confidence interval as stated in the problem?

A study was conducted to see if attitudes toward travel have changed over the past year. In the prior year, 25% of American families took at least one vacation away from home. In a random sample of 100 families this year, 29 families took a vacation away from home. What is the P -value of getting a finding this different from expected?

(Note: The standard deviation of p̂, written s_p̂, is computed somewhat differently for a hypothesis test about a population proportion than for a confidence interval to estimate a population proportion. Specifically, for a confidence interval,

$$s_{\hat{p}} = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

and, for a hypothesis test,

$$s_{\hat{p}} = \sqrt{\frac{p_0(1 - p_0)}{n}}$$

where p ₀ is the hypothesized value of p in H ₀ : p = p ₀ (p ₀ = 0.25 in this exercise). We do more with this in the next chapter, but you should use

$$s_{\hat{p}} = \sqrt{\frac{(0.25)(0.75)}{100}} = 0.0433$$

for this problem.)

A study was conducted to determine if male and female 10th graders differ in performance in mathematics. Twenty-three randomly selected males and 26 randomly selected females were each given a 50-question multiple-choice test as part of the study. The scores were approximately normally distributed. The results of the study were as follows:

Construct a 99% confidence interval for the true difference between the mean score for males and the mean score for females. Does the interval suggest that there is a difference between the true means for males and females?

Under H ₀ : μ = 35, H _A: μ > 35, a decision rule is decided upon that rejects H ₀ for x̄ > 36.5. For the sample, s_x̄ = 0.99. If, in reality, μ = 38, what is the power of the test?
You want to estimate the proportion of Californians who want to outlaw cigarette smoking in all public places. Generally speaking, by how much must you increase the sample size to cut the margin of error in half?
The Mathematics Department wants to estimate within five students, and with 95% confidence, how many students will enroll in Statistics next year. They plan to ask a sample of eligible students whether or not they plan to enroll in Statistics. Over the past 5 years, the course has had between 19 and 79 students enrolled. How many students should they sample? (Note: assuming a reasonably symmetric distribution, we can estimate the standard deviation by Range/4.)
A hypothesis test is conducted with α = 0.05 to determine the true difference between the proportion of male and female students enrolling in Statistics (H ₀ : p ₁ – p ₂ = 0). The P -value of p̂ ₁ – p̂ ₂ is determined to be 0.03. Is this finding statistically significant? Explain what is meant by a statistically significant finding in the context of the problem.
Based on the 2000 census, the population of the United States was about 281.4 million people, and the population of Nevada was about 2 million. We are interested in generating a 95% confidence interval, with a margin of error of 3%, to estimate the proportion of people who will vote in the next presidential election. How much larger a sample will we need to generate this interval for the United States than for the state of Nevada?
Professor Olsen has taught statistics for 41 years and has kept the scores of every test he has ever given. Every test has been worth 100 points. He is interested in the average test score over the years. He doesn't want to put all of the scores (there are thousands of them) into a computer to figure out the exact average, so he asks his daughter, Anna, to randomly select 50 of the tests and use those to come up with an estimate of the population average. Anna has been studying statistics at college and decides to create a 98% confidence interval for the true average test score. The mean test score for the 50 randomly selected tests is 73.5 with a standard deviation of 7.1. What does she tell her father?
A certain type of pen is claimed to operate for a mean of 190 hours. A random sample of 49 pens is tested, and the mean operating time is found to be 188 hours with a standard deviation of 6 hours.

(a) Construct a 95% confidence interval for the true mean operating time of this type of pen. Does the company's claim seem justified?

(b) Describe the steps involved in conducting a hypothesis test, at the 0.05 level of significance, that the true mean differs from 190 hours. Do not actually carry out the complete test, but do state the null and alternative hypotheses.

A young researcher thinks there is a difference between the mean ages at which males and females win Oscars for best actor or actress. The student found the mean age for all best actor winners and all best actress winners and constructed a 95% confidence interval for the mean difference between their ages. Is this an appropriate use of a confidence interval? Why or why not?

Cumulative Review Problems

Use a normal approximation to the binomial to determine the probability of getting between 470 and 530 heads in 1000 flips of a fair coin.
A survey of the number of televisions per household found the following probability distribution:

What is the mean number of television sets per household?

A bag of marbles contains four red marbles and five blue marbles. A marble is drawn, its color is observed, and it is returned to the bag.

(a) What is the probability that the first red marble is drawn on trial 3?

(b) What is the average wait until a red marble is drawn?

A study is conducted to determine which of two competing weight-loss programs is the most effective. Random samples of 50 people from each program are evaluated for losing and maintaining weight loss over a 1-year period. The average number of pounds lost per person over the year is used as a basis for comparison.

(a) Why is this an observational study and not an experiment?

(b) Describe an experiment that could be used to compare the two programs. Assume that you have available 100 overweight volunteers who are not presently in any program.

The correlation between the first and second statistics tests in a class is 0.78.

(a) Interpret this value.

(b) What proportion of the variation in the scores on the second test can be explained by the scores on the first test?

Solutions to Practice Problems

Multiple-Choice

The correct answer is (e).

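P * = 0.7, M = 0.05, z * = 1.96 (for C = 0.95) ⇒

$$n \ge \left(\frac{z^*}{M}\right)^2 P^*(1 - P^*) = \left(\frac{1.96}{0.05}\right)^2 (0.7)(0.3) = 322.7.$$

You need a sample of at least n = 323.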

The correct answer is (e).
The correct answer is (a). It is not significant at the .01 level because .015 is greater than .01.
The correct answer is (b). n = 16 ⇒ df = 16 – 1 = 15. Using a table of t distribution critical values, look in the row for 15 degrees of freedom and the column with 0.05 at the top (or 90% at the bottom). On a TI-84 with the invT function, the solution is given by invT(0.95,15).
The correct answer is (c). Because 0 is not in the interval (0.07, 0.19), it is unlikely to be the true difference between the proportions. III is just plain wrong! We cannot make a probability statement about an interval we have already constructed. All we can say is that the process used to generate this interval has a 0.95 chance of producing an interval that does contain the true difference between the population proportions.
The correct answer is (d). I is not correct since you cannot make a probability statement about a found interval: the true mean is either in the interval (P = 1) or it isn't (P = 0). II is correct and is just a restatement of “Intervals constructed by this procedure will capture the true mean 99% of the time.” III is true; it's our standard way of interpreting a confidence interval. IV is true. Since the interval constructed does not contain 0, it's unlikely that this interval came from a population whose true mean is 0. Since all the values are positive, the interval does provide statistical evidence (but not proof) that the program is effective at promoting weight loss. It does not give evidence that the amount lost is of practical importance.
The correct answer is (d). To reject the null at the 0.01 level of significance, we would need to have z < –2.33.
The correct answer is (d). A confidence level is a statement about the procedure used to generate the interval, not about any one interval. It's difficult to use the word “probability” when interpreting a confidence interval and impossible when describing an interval that has already been constructed. However, you could say, “The probability is 0.99 that an interval constructed in this manner will contain the true population mean.”
The correct answer is (c). For df = 15 – 1 = 14, t * = 2.264 for a 96% confidence interval (from Table B; if you have a TI-84 with the invT function, invT(0.98,14) = 2.264).

The interval is

$$74.5 \pm 2.264\left(\frac{3.2}{\sqrt{15}}\right) = 74.5 \pm 1.871.$$

The correct answer is (e). Because we are concerned that the actual amount of coverage might be less than 400 sq ft, the only options for the alternative hypothesis are (d) and (e) (the alternative hypothesis in (a) is in the wrong direction, and the alternatives in (b) and (c) are two sided). The null hypothesis given in (d) is not a form we would use for a null (the only choices are =, ≤, or ≥). We might see H ₀ : μ = 400 rather than H ₀ : μ ≥ 400. Both are correct statements of a null hypothesis against the alternative H _A : μ < 400.

Free-Response

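C = 0.90 ⇒ z * = 1.645, M = 0.05, and, with no prior estimate of the proportion, P * = 0.5:

$$n \ge \left(\frac{z^*}{M}\right)^2 P^*(1 - P^*) = \left(\frac{1.645}{0.05}\right)^2 (0.5)(0.5) = 270.6.$$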

You would need to survey at least 271 students.

The population standard deviation is unknown, and the sample size is small (13), so we need to use a t procedure. The problem tells us that the sample is random. A histogram of the data shows no significant departure from normality:

Now, x̄ = 98.1, s = 4.21, df = 13 – 1 = 12 ⇒ t * = 2.179. The 95% confidence interval is

$$98.1 \pm 2.179\left(\frac{4.21}{\sqrt{13}}\right) = 98.1 \pm 2.54 = (95.56, 100.64).$$

Because 100 is contained in this interval, we do not have strong evidence that the mean number of bushels per acre differs from 100, even though the sample mean is only 98.1.
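
The summary statistics for the corn data can be checked with Python's statistics module. The standard library has no t distribution, so this sketch takes t * = 2.179 (df = 12) from the table, as the solution does. Because it uses unrounded values of x̄ and s, the endpoints may differ slightly in the second decimal place from a hand calculation with rounded values.

```python
from math import sqrt
from statistics import mean, stdev

# Bushels per acre from the 13 sampled plots in the problem
yields = [98, 103, 95, 99, 92, 106, 101, 91, 99, 101, 97, 95, 98]
t_star = 2.179  # critical value for 95% confidence with df = 12, read from a t table

n = len(yields)
xbar = mean(yields)
s = stdev(yields)                 # sample standard deviation (divides by n - 1)
margin = t_star * s / sqrt(n)
low, high = xbar - margin, xbar + margin

print(f"xbar = {xbar:.1f}, s = {s:.2f}, interval = ({low:.2f}, {high:.2f})")
print("100 inside interval:", low < 100 < high)
```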

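This is a two-proportion situation. We are told that the groups were randomly selected, but we need to check that the samples are sufficiently large. Let p̂ ₁ = 23/40 = 0.575 be the dropout rate in the sample that received no counseling and p̂ ₂ = 15/40 = 0.375 be the dropout rate in the sample that received counseling. Then

$$n_1\hat{p}_1 = 23,\quad n_1(1 - \hat{p}_1) = 17,\quad n_2\hat{p}_2 = 15,\quad n_2(1 - \hat{p}_2) = 25.$$

Since all values are greater than or equal to 5, we are justified in constructing a two-proportion z interval. For a 90% z confidence interval, z * = 1.645.

Thus,

$$(0.575 - 0.375) \pm 1.645\sqrt{\frac{(0.575)(0.425)}{40} + \frac{(0.375)(0.625)}{40}} = 0.20 \pm 0.18 = (0.02, 0.38).$$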

We are 90% confident that the true difference between the dropout rates is between 0.02 and 0.38. Since 0 is not in this interval, we have evidence that the counseling program was effective at reducing the number of dropouts.
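
A two-proportion z interval like this one can be sketched in a few lines of Python. The function name is made up for illustration, and the difference is taken as (no counseling) minus (counseling), which is the order that produces the positive interval reported above.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_interval(x1, n1, x2, n2, confidence):
    """Confidence interval for p1 - p2 from two independent samples (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

# 23 of 40 dropped out without counseling; 15 of 40 dropped out with counseling
low, high = two_proportion_interval(23, 40, 15, 40, 0.90)
print(f"({low:.2f}, {high:.2f})")
```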

In this problem, H ₀ : μ = 5 and H _A: μ < 5, so we are only interested in the area to the left of our finding of x̄ = 4.55 since the hotel believes that the average stay is less than 5 days. We are interested in the area shaded in the graph:

Since we do not know σ , but do have a large sample size, we will use a t procedure: $t = \frac{4.55 - 5}{3.1/\sqrt{100}} = -1.45$, df = 100 – 1 = 99. Using df = 80 from Table B (rounding down from 99), we have 0.05 < P -value < 0.10. Using a TI-83/84 with df = 99, the P -value = tcdf(-100,-1.45,99) = 0.075.

The 99% confidence interval will be more likely to contain the population value being estimated, but will be wider than a 95% confidence interval.
I. (a) We are 95% confident that the true difference between the mean age of male statistics teachers and female statistics teachers is between –4.5 years and 3.2 years.

(b) Since 0 is contained in this interval, we do not have evidence of a statistically significant difference between the mean ages of male and female statistics teachers.

II. (a) We are 95% confident that the true difference between the mean age of male statistics teachers and female statistics teachers is between 2.1 years and 3.9 years.

(b) Since 0 is not in this interval, we do have evidence of a real difference between the mean ages of male and female statistics teachers. In fact, since the interval contains only positive values, we have evidence that the mean age of male statistics teachers is greater than the mean age of female statistics teachers.

III. (a) We are 95% confident that the true difference between the mean age of male statistics teachers and female statistics teachers is between –5.2 years and –1.7 years.

(b) Since 0 is not in this interval, we have evidence of a real difference between the mean ages of male and female statistics teachers. In fact, since the interval contains only negative values, we have evidence that the mean age of male statistics teachers is less than the mean age of female statistics teachers.

t procedures are appropriate because the population is approximately normal. n = 20 ⇒ df = 20 – 1 = 19 ⇒ t * = 2.861 for C = 0.99.
(a) A Type I error is made when we mistakenly reject a true null hypothesis. In this situation, that means that we would mistakenly reject the true hypothesis that the available housing is sufficient. The risk would be that a lot of money would be spent on building additional housing when it wasn't necessary.

(b) A Type II error is made when we mistakenly fail to reject a false hypothesis. In this situation that means we would fail to reject the false hypothesis that the available housing is sufficient. The risk is that the university would have insufficient housing for its students.

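The conditions are present to construct a one-proportion z interval. With p̂ = 55/350 = 0.157 and z * = 1.96,

$$0.157 \pm 1.96\sqrt{\frac{(0.157)(0.843)}{350}} = 0.157 \pm 0.038 = (0.119, 0.195).$$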

We are 95% confident that the true proportion of people who will get the flu despite getting the vaccine is between 11.9% and 19.5%. To say that we are 95% confident means that the process used to construct this interval will, in fact, capture the true population proportion 95% of the time (that is, our “confidence” is in the procedure, not in the interval we found!).
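
For a quick check of the flu-vaccine interval, here is a minimal sketch of a one-proportion z interval. The function name is hypothetical, and the code assumes the large-sample conditions have already been verified by hand.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_interval(successes, n, confidence):
    """One-proportion z interval: p_hat +/- z* sqrt(p_hat (1 - p_hat) / n)."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# 55 of the 350 vaccinated people got the flu
low, high = one_proportion_interval(55, 350, 0.95)
print(f"({low:.3f}, {high:.3f})")
```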

(a) We are 95% confident that the true proportion of subjects helped by a new anti-inflammatory drug is between 0.56 and 0.65.

(b) The process used to construct this interval will capture the true population proportion, on average, 95 times out of 100.

We have z = 0.92. Because the hypothesis is two sided, we are concerned about the probability of being in either tail of the curve even though the finding was larger than expected.

Then, the P -value = 2(0.1788) = 0.3576. Using the TI-83/84, the P -value = 2 × normalcdf(0.92, 100).
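The same two-sided calculation can be reproduced away from the calculator. This is a minimal sketch using Python's standard library; the variable names are illustrative.

```python
from statistics import NormalDist

z = 0.92  # test statistic used in the calculator command above

# Area in the upper tail beyond z, then doubled for a two-sided alternative.
upper_tail = 1 - NormalDist().cdf(z)
p_value = 2 * upper_tail

print(round(upper_tail, 4), round(p_value, 4))
```

Doubling the tail area is what makes the test two sided: a result this far below the hypothesized value would count as equally strong evidence against the null.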

The problem states that the samples were randomly selected and that the scores were approximately normally distributed, so we can construct a two-sample t interval using the conservative approach. For C = 0.99, we have df = min{23 – 1, 26 – 1} = 22 ⇒ t * = 2.819.

Using a TI-84 with the invT function, t * = invT(0.995, 22).
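For readers without a TI-84, the critical value can be approximated numerically. The sketch below is an assumption-laden stand-in for invT, not the calculator's actual algorithm: it integrates the t density with Simpson's rule and inverts the result by bisection, using only the standard library. The function names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x) via Simpson's rule on [0, |x|]; steps must be even."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area between 0 and |x|
    return 0.5 + area if x > 0 else 0.5 - area


def inv_t(left_area, df):
    """Value with the given area to its left, found by bisection."""
    lo, hi = -100.0, 100.0
    for _ in range(80):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < left_area:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# A 99% interval leaves 0.5% in each tail, so the left area is 0.995.
t_star_22 = inv_t(0.995, 22)
print(round(t_star_22, 3))
```

The left-tail area passed in is (1 + C)/2, which is why a 99% interval uses 0.995 and a 95% interval uses 0.975.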

0 is a likely value for the true difference between the means because it is in the interval. Hence, we do not have evidence that there is a difference between the true means for males and females.

Using the STAT TESTS 2-SampTInt function on the TI-83/84, we get an interval of (-5.04, 7.24), df = 44.968. Remember that the larger number of degrees of freedom used by the calculator results in a somewhat narrower interval than the conservative method (df = min{n ₁ – 1, n ₂ – 1}) for computing degrees of freedom.

The power of the test is the probability that we correctly reject a false null hypothesis. In this case, we reject the null if x̄ > 36.5 when μ = 38. We are given s = 0.99. Thus

If the true mean is really 38, we are almost certain to reject the false null hypothesis.

For a given margin of error using P * = 0.5:

To reduce the margin of error by a factor of 0.5, we have

We need to quadruple the sample size to reduce the margin of error by a factor of 1/2.
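The quadrupling result can be checked numerically. This sketch assumes the usual one-proportion margin of error, M = z*·√(P*(1 – P*)/n), with P* = 0.5; the sample sizes 100 and 400 are hypothetical values picked only to show the ratio.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(n, conf=0.95, p_star=0.5):
    """Margin of error for a one-proportion z interval (assumed formula)."""
    z_star = NormalDist().inv_cdf((1 + conf) / 2)
    return z_star * sqrt(p_star * (1 - p_star) / n)


m_small = margin_of_error(100)  # hypothetical starting sample size
m_large = margin_of_error(400)  # four times as many observations

print(round(m_small, 4), round(m_large, 4), round(m_large / m_small, 4))
```

Because n sits under a square root, multiplying n by 4 divides the margin by 2, whatever the confidence level.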

Hence,

The department should ask at least 35 students about their intentions.

The finding is statistically significant because the P -value is less than the significance level. In this situation, it is unlikely that we would have obtained a difference between the sample statistics as far from 0 as the one we got by chance alone if, in fact, the true difference between the population values were 0.
Trick question! The sample size needed for a 95% confidence interval (or any C -level interval for that matter) is not a function of population size. The sample size needed is given by

n is a function of z * (which is determined by the confidence level), M (the desired margin of error), and P * (the estimated value of the proportion). The only requirement is that the population size be at least 20 times as large as the sample size.
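A short sketch makes the point concrete. It assumes the standard sample-size formula for a proportion, n ≥ (z*/M)²·P*(1 – P*), and the 3% margin in the usage line is a hypothetical value. Note that population size is not an input at all.

```python
from math import ceil
from statistics import NormalDist


def sample_size(margin, conf=0.95, p_star=0.5):
    """Smallest n giving the requested margin of error (assumed formula)."""
    z_star = NormalDist().inv_cdf((1 + conf) / 2)
    # Round up: a fractional respondent is not possible.
    return ceil((z_star / margin) ** 2 * p_star * (1 - p_star))


# Hypothetical request: 95% confidence, margin of error of 3 percentage points.
n_needed = sample_size(0.03)
print(n_needed)
```

The same n applies to a city or a country, provided the population is large enough relative to the sample.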

Because n = 50, we could use a large-sample confidence interval. For C = 0.98, z * = 2.33 (that's the upper critical z -value if we put 1% in each tail).

Anna tells her father that he can be 98% confident that the true average test score for his 41 years of teaching is between 71.16 and 75.84.

Since we do not know σ, however, a t interval is more appropriate. The TI-83/84 calculator returns a t -interval of (71.085, 75.915). Here, t * = 2.4049, so that the resulting interval is slightly wider, as expected, than the interval obtained by assuming an approximately normal distribution.

The problem states that we have an SRS and n = 49. Since we do not know σ, but have a large sample size, we are justified in using t procedures.

(a) C = 0.95 ⇒ t * = 2.021 (from Table B with df = 40; from a TI-84 with the invT function, the exact value of t * = 2.0106 with df = 48). Thus, . Using the TI-83/84, we get (186.28, 189.72) based on 48 degrees of freedom. Since 190 is not in this interval, it is not a likely population value from which we would have gotten this interval. There is some doubt about the company's claim.

(b) Let μ = the true mean operating time of the company's pens.

H₀ : μ = 190.
H_A: μ ≠ 190.

(The wording of the question tells us H_A is two sided.)

We will use a one-sample t -test. Justify the conditions needed to use this procedure.
Determine the test statistic (t) and use it to identify the P -value.
Compare the P -value with α. Give a conclusion in the context of the problem.

It is not appropriate because confidence intervals use sample data to make estimates about unknown population values. In this case, the actual difference in the ages of actors and actresses is known, and the true difference can be calculated.

Solutions to Cumulative Review Problems

Let X = the number of heads. Then X has B (1000, 0.5) because the coin is fair. This binomial can be approximated by a normal distribution with mean = 1000(0.5) = 500 and standard deviation √(1000(0.5)(0.5)) ≈ 15.81.

On the calculator, normalcdf(–1.9, 1.9).
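The pieces of this normal approximation can be verified in a few lines. The sketch below uses only values stated in the solution (n = 1000, p = 0.5, and the z bounds –1.9 and 1.9); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p = 1000, 0.5               # fair coin, 1000 flips

mean = n * p                   # center of the approximating normal curve
sd = sqrt(n * p * (1 - p))     # binomial standard deviation

# Equivalent of normalcdf(-1.9, 1.9) on the standard normal curve.
std_normal = NormalDist()
prob = std_normal.cdf(1.9) - std_normal.cdf(-1.9)

print(mean, round(sd, 2), round(prob, 4))
```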

μ_X = 0(0.03) + 1(0.37) + 2(0.46) + 3(0.10) + 4(0.04) = 1.75.
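The expected value is a probability-weighted sum, which is easy to check by machine. This sketch assumes P(X = 0) = 0.03, the value that makes the probabilities add to 1; the list names are illustrative.

```python
values = [0, 1, 2, 3, 4]
probs = [0.03, 0.37, 0.46, 0.10, 0.04]  # P(X = 0) = 0.03 is an assumption

# A valid probability distribution must sum to 1.
total_prob = sum(probs)

# Expected value: multiply each outcome by its probability and add.
mu_x = sum(x * p for x, p in zip(values, probs))

print(round(total_prob, 2), round(mu_x, 2))
```

The outcome 0 contributes nothing to the sum, so the mean is 1.75 regardless of the probability attached to it.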
(a) P (draw a red marble) = 4/9

(b) Average wait

(a) It is an observational study because the researchers are simply observing the results between two different groups. To do an experiment, the researcher must manipulate the variable of interest (different weight-loss programs) in order to compare their effects.

(b) Inform the volunteers that they are going to be enrolled in a weight-loss program and their progress monitored. As they enroll, randomly assign them to one of two groups, say Group A and Group B (without the subjects' knowledge that there are really two different programs). Group A gets one program and Group B the other. After a period of time, compare the average number of pounds lost for the two programs.

(a) There is a moderate to strong positive linear relationship between the scores on the first and second tests.

(b) 0.78² = 0.61. About 61% of the variation in scores on the second test can be explained by the regression on the scores from the first test.

CHAPTER 11

Confidence Intervals and Introduction to Inference

A recent poll estimated that about 18% of the U.S. population live in a multigenerational family household. The 95% confidence interval is from 15% to 21%. Which of the following correctly interprets the meaning of “95% confident”?

(A) If many samples were taken, about 95% of the confidence intervals would contain the actual proportion of the U.S. population that live in a multigenerational family household.

(B) If many samples were taken, about 95% of the samples would have between 15% and 21% of those sampled living in a multigenerational family household.

(C) There is a 95% probability that any particular sample would have between 15% and 21% of those sampled living in a multigenerational family household.

(D) Ninety-five percent of people sampled live in a multigenerational family household.

(E) Between 15% and 21% of the U.S. population live in a multigenerational family household about 95% of the time.

A quality control manager for a cookie company selected a random sample of 20 chocolate chip cookies from all chocolate chip cookies produced on a particular day. She counted the number of chocolate chips in each cookie; the average number of chips was 26.2, and the standard deviation was 2.43. The distribution of number of chocolate chips was reasonably symmetric and mound-shaped. The 92% confidence interval for the mean number of chips for all cookies produced that day is

(A)

(B)

(C)

(D)

(E)

Why is t used instead of z as the critical value in a confidence interval for a mean?

(A) Because means are continuous, not discrete like proportions.

(B) Because means are greater than proportions.

(D) Because we are using x̄ to estimate μ .

(E) Because all confidence intervals require t .

A polling company reported that 38% of Americans approve of a particular policy, with a margin of error of 4 percentage points and 95% confidence. What is the best interpretation of that margin of error?

(A) In 95% of similar polls, the sample proportion would fall within 4 percentage points of 38%.

(B) In 95% of similar polls, the sample proportion would fall within 4 percentage points of the proportion of all Americans who approve of the policy.

(D) The survey result is wrong 4% of the time, with 95% confidence.

(E) The survey is correct 95% of the time and incorrect 4% of the time.

A particular skin condition clears up on its own after about a month in about 40% of cases. A new topical medication has been developed to increase the proportion of patients who are cured after about a month, but the new medication would cost money. The null hypothesis is that the proportion of patients cured with the new medication is the same as the proportion cured with a placebo, and the alternative hypothesis is that the proportion of patients cured with the new medication is greater than the proportion cured with the placebo. In this situation, the consequence of a Type II error would be

(A) people paying more for a medication that does not help them.

(B) people not having access to a medication that would help them.

(D) the medication being given to the wrong patients.

(E) the proportion of patients cured with the new medication turning out to be less than the proportion cured with the placebo.

Answers