Design of a Study: Sampling, Surveys, and Experiments - Review the Knowledge You Need to Score High

5 Steps to a 5 AP Statistics 2017 (2016)

STEP 4

Review the Knowledge You Need to Score High

CHAPTER 8 Design of a Study: Sampling, Surveys, and Experiments

IN THIS CHAPTER

Summary: This chapter stands pretty much alone in this course. All other chapters are concerned with how to analyze and make inferences from data that have been collected. In this chapter, we discuss how to collect the data. Doing an analysis of bad or poorly collected data has no practical application. Here you'll learn about surveys and sampling, observational studies, experiments, and the types of bias that can creep into each of them. Once you understand all this, you'll have greater confidence that your analysis actually represents something more than meaningless number crunching.

Key Ideas

Samples and Sampling

Surveys

Sampling Bias

Experiments and Observational Studies

Statistical Significance

Completely Randomized Design

Matched Pairs Design

Blocking

Samples

In the previous two chapters we concentrated on the analysis of data at hand; we didn't worry much about how the data came into our possession. The last part of this book deals with statistical inference: making statements about a population based on samples drawn from the population. In both data analysis and inference, we would like to believe that our analyses, or inferences, are meaningful. If we make a claim about a population based on a sample, we want that claim to be true. Our ability to do meaningful analyses and make reliable inferences is a function of the data we collect. To the extent that the sample data we deal with are representative of the population of interest, we are on solid ground. No interpretation of data that are poorly collected or systematically biased will be meaningful. We need to understand how to gather quality data before proceeding on to inference. In this chapter, we study techniques for gathering data so that we have reasonable confidence that they are representative of our population of interest.

Census

We usually want to know something about the entire population of interest. The way to find that out for sure is to conduct a census, a procedure by which every member of a population is selected for study. Doing a census, especially when the population of interest is quite large, is often impractical, too time-consuming, or too expensive. Interestingly enough, relatively small samples can give quite good estimates of population values if the samples are selected properly. For example, it can be shown that approximately 1500 randomly selected voters can give reliable information about the entire voting population of the United States.

The goal of sampling is to produce a representative sample, one that has the essential characteristics of the population being studied and is free of any type of bias. We can never be certain that our sample has the characteristics of the population from which it was drawn. Our best chance of making a sample representative is to use some sort of random process in selecting it. It is important to note that “bias” does not mean the same thing as “nonrepresentative.” Bias refers to a method that produces estimates that are either too high on average, or too low on average. Nonrepresentative refers to a particular sample that differs from the population.

Probability Sample

A list of all members of the population from which we can draw a sample is called a sampling frame. We would like the sampling frame to be the same set of individuals we are studying. Unfortunately, this is often not the case. (Think, for example, about how the people listed in a phonebook are not the same group as all adult residents of a city!) A probability sample is one in which each member of the population has a known probability of being in the sample. Each member of the population may or may not have an equal chance of being selected. Probability samples are used to avoid the bias that can arise in a nonprobability sample (such as when a researcher selects the subjects she will use). Probability samples use some sort of random mechanism to choose the members of the sample. The following list includes some types of probability samples.

random sample: Each member of the population is equally likely to be included.
simple random sample (SRS): A sample of a given size is chosen in such a way that every possible sample of that size is equally likely to be chosen. Note that a sample can be a random sample and not be a simple random sample (SRS). For example, suppose you want a sample of 64 NFL football players. One way to produce a random sample would be to randomly select two players from each of the 32 teams. This is a random sample but not a simple random sample because not every possible sample of size 64 can be chosen (a sample containing three players from the same team, for instance, can never occur).
systematic sample: The first member of the sample is chosen according to some random procedure, and then the rest are chosen according to some well-defined pattern. For example, if you wanted 100 people in your sample to be chosen from a list of 10,000 people, you could randomly select one of the first 100 people and then select every 100th name on the list after that.
stratified random sample: This is a sample in which the population is first divided into distinct homogeneous subgroups called strata, and then a random sample is chosen from each subgroup. For example, you might divide the population of voters into groups by political party and then select an SRS of 250 from each group.
cluster sample: The population is first divided into sections or “clusters.” Then we randomly select an entire cluster, or clusters, and include all of the members of the cluster(s) in the sample.
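The four selection methods in the list above can be sketched in a few lines of Python. This is only an illustration, not a procedure from the chapter: the function names, the stand-in ID numbers, the party labels, and the precinct clusters are all hypothetical, and the fixed seed is there only so the sketch is repeatable.

```python
import random

def simple_random_sample(population, n, rng):
    """SRS: every possible group of n members is equally likely."""
    return rng.sample(population, n)

def systematic_sample(population, step, rng):
    """Random start among the first `step` members, then every step-th member."""
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a separate SRS of n_per_stratum from every stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep every member of each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(2017)        # fixed seed only so the sketch is repeatable
people = list(range(1, 10001))   # stand-in ID numbers for a list of 10,000 people

srs = simple_random_sample(people, 100, rng)
systematic = systematic_sample(people, 100, rng)

# Hypothetical strata (by party) and clusters (by precinct) built from the same IDs.
strata = {"party_A": people[:4000], "party_B": people[4000:7000], "party_C": people[7000:]}
stratified = stratified_sample(strata, 250, rng)

clusters = {f"precinct_{i}": people[i * 500:(i + 1) * 500] for i in range(20)}
clustered = cluster_sample(clusters, 2, rng)
```

Note the structural difference between the last two functions: the stratified sample takes some members from every group, while the cluster sample takes every member from some groups.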

example: You are going to conduct a survey of your senior class concerning plans for graduation. You want a 10% sample of the class. Describe a procedure by which you could use a systematic sample to obtain your sample and explain why this sample isn't a simple random sample. Is this a random sample?

solution: One way would be to obtain an alphabetical list of all the seniors. Use a random number generator (such as a table of random digits or a scientific calculator with a random digits function) to select one of the first 10 names on the list. Then proceed to select every 10th name on the list after the first.

Note that this is not an SRS because not every possible sample of 10% of the senior class is equally likely. For example, people next to each other in the list can't both be in the sample. Theoretically, the first 10% of the list could be the sample if it were an SRS. This clearly isn't possible with the systematic procedure.

Before the first name has been randomly selected, every member of the population has an equal chance to be selected for the sample. Hence, this is a random sample, although it is not a simple random sample.
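Both claims in this solution can be checked by listing every sample the systematic procedure is able to produce. The sketch below assumes a class of 400 seniors, a number chosen only for illustration since the example does not give a class size.

```python
import math

class_size = 400   # hypothetical; the example does not state the class size
step = 10          # every 10th name gives a 10% sample
roster = list(range(class_size))   # positions on the alphabetical list

# One possible systematic sample for each equally likely starting position.
possible_samples = [roster[start::step] for start in range(step)]

# Count how many of the possible samples contain each student.
appearances = [sum(student in sample for sample in possible_samples)
               for student in roster]

# Check whether any possible sample contains two neighboring names.
has_adjacent = any(b - a == 1
                   for sample in possible_samples
                   for a, b in zip(sample, sample[1:]))

# Number of different samples of the same size that an SRS could produce.
srs_count = math.comb(class_size, class_size // step)

print(len(possible_samples), "possible systematic samples")
print("every student is in exactly one of them:", all(a == 1 for a in appearances))
print("some sample contains neighbors:", has_adjacent)
print("an SRS could produce more samples than that:", srs_count > len(possible_samples))
```

Each student appears in exactly one of the ten equally likely samples, so each has a 1-in-10 chance of selection (a random sample), yet only ten samples are possible at all (not an SRS).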

example: A large urban school district wants to determine the opinions of its elementary school teachers concerning a proposed curriculum change. The district administration randomly selects one school from all the elementary schools in the district and surveys each teacher in that school. What kind of sample is this?

solution: This is a cluster sample. The individual schools represent previously defined groups (clusters) from which we have randomly selected one (it could have been more) for inclusion in our sample.

example: You are sampling from a population with mixed ethnicity. The population is 45% Caucasian, 25% Asian American, 15% Latino, and 15% African American. How would a stratified random sample of 200 people be constructed?

solution: You want your sample to mirror the population in terms of its ethnic distribution. Accordingly, from the Caucasians, you would draw an SRS of 90 (that's 45%), an SRS of 50 (25%) from the Asian Americans, an SRS of 30 (15%) from the Latinos, and an SRS of 30 (15%) from the African Americans.
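The arithmetic in this solution is a proportional allocation: each stratum receives the same share of the sample that it has in the population. A minimal sketch follows; the function draw_stratified and its rosters argument are hypothetical names, and any real roster of population members would have to be supplied by the reader.

```python
import random

percent_by_group = {
    "Caucasian": 45,
    "Asian American": 25,
    "Latino": 15,
    "African American": 15,
}
sample_size = 200

# Proportional allocation: stratum sample size = total sample size x stratum share.
allocation = {group: sample_size * pct // 100
              for group, pct in percent_by_group.items()}

def draw_stratified(rosters, allocation, rng):
    """Draw an SRS of the allocated size from each stratum.

    `rosters` maps each group name to a list of its members (hypothetical input).
    """
    return {group: rng.sample(rosters[group], allocation[group])
            for group in allocation}

print(allocation)
print("total:", sum(allocation.values()))
```

Integer percentages are used so that the allocation is computed exactly, without floating-point rounding.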

Of course, not all samples are probability samples. At times, people try to obtain samples by processes that are nonrandom but still hope, through design or faith, that the resulting sample is representative. The danger in all nonprobability samples is that some (unknown) bias may affect the degree to which the sample is representative. That isn't to say that random samples can't be nonrepresentative, just that we have a better chance of avoiding bias. (Remember the difference!) Some types of nonrandom sampling techniques that tend to be biased are:

self-selected sample or voluntary response sample: People choose whether or not to participate in the survey. A radio call-in show is a typical voluntary response sample.
convenience sampling: The pollster obtains the sample any way he can, usually with the ease of obtaining the sample in mind. For example, handing out questionnaires to every member of a given class at school would be a convenience sample. The key issue here is that the surveyor makes the decision whom to include in the sample.
quota sampling: The pollster attempts to generate a representative sample by choosing sample members based on matching individual characteristics to known characteristics of the population. This is similar to a stratified random sample, only the process for selecting the sample is nonrandom.

Sampling Bias

We are trying to avoid bias in our sampling techniques. A biased method chooses samples that produce estimates that are, on average, either too high or too low; that is, the results systematically favor one outcome over another.

Undercoverage

One type of sampling bias results from undercoverage. This happens when some part of the population being sampled is somehow excluded. This can happen when the sampling frame (the list from which the sample will be drawn) isn't the same as the target population. It can also occur when part of the sample selected fails to respond for some reason.

example: A pollster conducts a telephone survey to gather opinions of the general population about welfare. Persons too poor to be able to afford a telephone are certainly interested in this issue, but will be systematically excluded from the sample. The resulting sample will be biased because of the exclusion of this group.

Voluntary Response Bias

Voluntary response bias occurs with self-selected samples. Persons who feel most strongly about an issue are most likely to respond. Nonresponse bias, the possible biases of those who choose not to respond, is a related issue.

example: You decide to find out how your neighbors feel about the neighbor who seems to be running a car repair shop on his front lawn. You place a questionnaire in every mailbox within sight of the offending home and ask the people to fill it out and return it to you. About half of the neighbors return the survey, and 95% of those who do say that they find the situation intolerable. We have no way of knowing the feelings of the 50% who didn't return the survey; they may be perfectly happy with the “bad” neighbor. Those who have the strongest opinions are those most likely to return your survey, and they may not represent the opinions of all. Most likely they do not.

example: In response to a question once posed in Ann Landers's advice column, some 70% of respondents (almost 10,000 readers) wrote that they would choose not to have children if they had the choice to do it over again. This is most likely representative only of those parents who were having a really bad day with their children when they decided to respond to the question. In fact, a properly designed opinion poll a few months later found that more than 90% of parents said they would have children if they had the chance to do it all over again.

Wording Bias

Wording bias occurs when the wording of the question itself influences the response in a systematic way. A number of studies have demonstrated that welfare gathers more support from a random sample of the public when it is described as “helping people until they can better help themselves” than when it is described as “allowing people to stay on the dole.”

example: Compare the probable responses to the following ways of phrasing a question.

(i) “Do you support a woman's right to make medical decisions concerning her own body?”

(ii) “Do you support a woman's right to kill an unborn child?”

It's likely that (i) is designed to show that people are in favor of the right to choose abortion and that (ii) is designed to show that people are opposed to the right to choose abortion. The authors of both questions would probably argue that both responses reflect society's attitudes toward abortion.

example: Two different Gallup Polls were conducted in Dec. 2003. Both involved people's opinion about the U.S. space program. Here is one part of each poll.

Poll A: Would you favor or oppose a new U.S. space program that would send astronauts to the moon? Favor: 53%; oppose: 45%.

Poll B: Would you favor or oppose U.S. government spending billions of dollars to send astronauts to the moon? Favor: 31%; oppose: 67%.

(source: http://www.stat.ucdavis.edu/~jie/stat13.winter2007/lec18.pdf)

Response Bias

Response bias arises in a variety of ways. The respondent may not give truthful responses to a question (perhaps she or he is ashamed of the truth); the respondent may fail to understand the question (you ask if a person is educated but fail to distinguish between levels of education); the respondent desires to please the interviewer (questions concerning race relations may well solicit different answers depending on the race of the interviewer); the ordering of the question may influence the response (“Do you prefer A to B?” may get different responses than “Do you prefer B to A?”).

example: What form of bias do you suspect in the following situation? You are a school principal and want to know students' level of satisfaction with the counseling services at your school. You direct one of the school counselors to ask her next 25 counselees how favorably they view the counseling services at the school.

solution: A number of things would be wrong with the data you get from such a survey. First, the sample is nonrandom: it is a sample of convenience obtained by selecting 25 consecutive counselees. They may or may not be representative of students who use the counseling service. You don't know.

Second, you are asking people who are seeing their counselor about their opinion of counseling. You will probably get a more favorable view of the counseling services than you would if you surveyed the general population of the school (would students really unhappy with the counseling services voluntarily be seeing their counselor?). Also, because the counselor is administering the questionnaire, the respondents would have a tendency to want to please the interviewer. The sample certainly suffers from undercoverage—only a small subset of the general population is actually being interviewed. What do those not being interviewed think of the counseling?

Experiments and Observational Studies

Statistical Significance

One of the desirable outcomes of a study is to help us determine cause and effect. We do this by looking for differences between groups in an experiment that are so great that we cannot reasonably attribute the difference to chance. We say that a difference between what we would expect to find if there were no treatment and what we actually found is statistically significant if the difference is too great to attribute to chance. We discuss numerical methods of determining significance in Chapters 11–14.

An experiment is a study in which the researcher imposes some sort of treatment on the experimental units (which can be human, in which case they are usually called subjects). In an experiment, the idea is to determine the extent to which treatments, the explanatory variable(s), affect outcomes, the response variable(s). For example, a researcher might vary the rewards to different work group members to see how that affects the group's ability to perform a particular task.

An observational study, on the other hand, simply observes and records behavior but does not attempt to impose a treatment in order to manipulate the response. Therefore, in an observational study, we cannot address cause and effect because we have not imposed a treatment.

Exam Tip: The distinction between an experiment and an observational study is an important one. There is a reasonable chance that you will be asked to show you understand this distinction on the exam. Be sure this section makes sense to you.

example: A group of 60 exercisers are classified as “walkers” or “runners.” A longitudinal study (one conducted over time) is conducted to see if there are differences between the groups in terms of their scores on a wellness index. This is an observational study because, although the two groups differ in an important respect, the researcher is not manipulating any treatment. “Walkers” and “runners” are simply observed and measured. Note that the groups in this study are self-selected. That is, they were already in their groups before the study began. The researchers just noted their group membership and proceeded to make observations. There may be significant differences between the groups in addition to the variable under study.

example: A group of 60 volunteers who do not exercise are randomly assigned to one of the two fitness programs. One group of 30 is enrolled in a daily walking program, and the other group is put into a running program. After a period of time, the two groups are compared based on their scores on a wellness index. This is an experiment because the researcher has imposed the treatment (walking or running) and then measured the effects of the treatment on a defined response.

It may be, even in a controlled experiment, that the measured response is a function of variables present in addition to the treatment variable. A confounding variable is one that has an effect on the outcomes of the study but whose effects cannot be separated from those of the treatment variable. A lurking variable is one that has an effect on the outcomes of the study but whose influence was not part of the investigation. A lurking variable can be a confounding variable.

example: A study is conducted to see if Yummy Kibble dog food results in shinier coats on golden retrievers. It's possible that the dogs with shinier coats have them because they have owners who are more conscientious in terms of grooming their pets. Both the dog food and the conscientious owners could contribute to the shinier coats. The variables are confounded because their effects cannot be separated.

A well-designed study attempts to anticipate confounding variables in advance and control for them. Statistical control refers to a researcher holding constant variables not under study that might have an influence on the outcomes.

example: You are going to study the effectiveness of SAT preparation courses on SAT score. You know that better students tend to do well on SAT tests. You could control for the possible confounding effect of academic quality by running your study with groups of “A” students, “B” students, etc.

Control is often considered to be one of the three basic principles of experimental design. The other two basic principles are randomization and replication.

The purpose of randomization is to equalize groups so that the effects of lurking variables are equalized among groups. Randomization involves the use of chance (like a coin flip) to assign subjects to treatment and control groups. The hope is that the groups being studied will differ systematically only in the effects of the treatment variable. Although individuals within the groups may vary, the idea is to make the groups as alike as possible except for the treatment variable. Note that it isn't possible to produce, with certainty, groups free of any lurking variables. It is possible, through the use of randomization, to increase the probability of producing groups that are alike. The idea is to control for the effects of variables you aren't aware of but that might affect the response.

Replication involves repeating the experiment on enough subjects (or units) to reduce the effects of chance variation on the outcomes. For example, we know that the number of boys and girls born in a year are approximately equal. A small hospital with only 10 births a year is much more likely to vary dramatically from 50% each than a large hospital with 500 births a year.
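The hospital comparison can be made concrete with a short calculation. The sketch below assumes that each birth is independently a boy with probability 0.5 (a simplification) and uses a cutoff of 10 percentage points as the meaning of "vary dramatically"; both choices, and the function name, are assumptions made for illustration.

```python
from math import comb

def prob_far_from_half(n_births, tolerance_pct=10):
    """Chance that the share of boys misses 50% by more than tolerance_pct points.

    Assumes each birth is independently a boy with probability 0.5.
    Integer arithmetic is used in the comparison to avoid rounding problems.
    """
    ways = 0
    for boys in range(n_births + 1):
        # Same test as |boys/n - 0.5| > tolerance, with both sides scaled by 100 * n.
        if abs(100 * boys - 50 * n_births) > tolerance_pct * n_births:
            ways += comb(n_births, boys)
    return ways / 2 ** n_births

small_hospital = prob_far_from_half(10)    # 3 or fewer boys, or 7 or more
large_hospital = prob_far_from_half(500)   # fewer than 200 boys, or more than 300

print("10 births: ", small_hospital)   # 352/1024 = 0.34375
print("500 births:", large_hospital)
```

Under these assumptions the small hospital misses the 40% to 60% range in roughly one year out of three, while for the large hospital the same event is extremely unlikely.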

Completely Randomized Design

A completely randomized design for a study involves three essential elements: random allocation of subjects to treatment and control groups; administration of different treatments to each randomized group (in this sense we are calling a control group a “treatment”); and some sort of comparison of the outcomes from the various groups. A standard diagram of this situation is the following:

There may be several different treatment groups (different levels of a new drug, for example), in which case the diagram could be modified. The control group can either be an older treatment (like a medication currently on the market) or a placebo, a dummy treatment. A diagram for an experiment with more than two treatment groups might look something like this:

Remember that each group must have enough subjects so that the replication condition is met. The purpose of the placebo is to separate genuine treatment effects from possible subject responses due to simply being part of an experiment. Placebos are not necessary if a new treatment is being compared to a treatment whose effects have been previously experimentally established. In that case, the old treatment can serve as the control. A new cream to reduce acne (the treatment), for example, might be compared to an already-on-the-market cream (the control) whose effectiveness has long been established.

example: Three hundred graduate students in psychology (by the way, a huge percentage of subjects in published studies are psychology graduate students) volunteer to be subjects in an experiment whose purpose is to determine what dosage level of a new drug has the most positive effect on a performance test. There are three levels of the drug to be tested: 200 mg, 500 mg, and 750 mg. Design a completely randomized study to test the effectiveness of the various drug levels.

solution: There are three levels of the drug to be tested: 200 mg, 500 mg, and 750 mg. A placebo control can be included although, strictly speaking, it isn't necessary as our purpose is to compare the three dosage levels. We need to randomly allocate the 300 students to each of four groups of 75 each: one group will receive the 200 mg dosage; one will receive the 500 mg dosage; one will receive the 750 mg dosage; and one will receive a placebo (if included). No group will know which treatment its members are receiving (all the pills look the same), nor will the test personnel who come in contact with them know which subjects received which pill (see the definition of “double-blind” given below). Each group will complete the performance test and the results of the various groups will be compared. This design can be diagrammed as follows:
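The random allocation step in this solution can be sketched in Python as a shuffle followed by a split into equal groups. The student labels, the function name, and the seed are hypothetical; only the group count and sizes come from the solution.

```python
import random

def completely_randomized(subjects, treatments, rng):
    """Shuffle all subjects, then cut the shuffled list into equal-sized groups."""
    shuffled = subjects[:]          # copy so the original roster is left untouched
    rng.shuffle(shuffled)
    group_size = len(shuffled) // len(treatments)
    return {treatment: shuffled[i * group_size:(i + 1) * group_size]
            for i, treatment in enumerate(treatments)}

rng = random.Random(8)   # fixed seed only so the sketch is repeatable
students = [f"student_{i:03d}" for i in range(1, 301)]   # hypothetical labels
treatments = ["200 mg", "500 mg", "750 mg", "placebo"]

groups = completely_randomized(students, treatments, rng)
for treatment, members in groups.items():
    print(treatment, len(members))
```

Because the whole roster is shuffled before it is cut, every student has the same chance of landing in each group, and chance alone decides who receives which dosage.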

Double-Blind Experiments

In the example above, it was explained that neither the subjects nor the researchers knew who was receiving which dosage, or the placebo. A study is said to be double-blind when neither the subjects (or experimental units) nor the evaluators know which group(s) is/are receiving each treatment or control. The reason for this is that, on the part of subjects, simply knowing that they are part of a study may affect the way they respond, and, on the part of the evaluators, knowing which group is receiving which treatment can influence the way in which they evaluate the outcomes. Our worry is that the individual treatment and control groups will differ by something other than the treatment unless the study is double-blind. A double-blind study further controls for the placebo effect.

Randomization

There are two main procedures for performing a randomization. They are:

Tables of random digits. Most textbooks contain tables of random digits. These are usually tables where the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9 appear in random order (well, as random as most things get, anyhow). That means that, as you move through the table, each digit should appear with probability 1/10, and each entry is independent of the others (knowing what came before doesn't help you make predictions about what comes next).
Calculator “rand” functions. The TI-83/84 calculator has several random functions: rand, randInt (which will generate a list of random integers in a specified range), randNorm (which will generate random values from a normal distribution with mean μ and standard deviation σ), and randBin (which will generate random values from a binomial distribution with n trials and fixed probability p; see Chapter 10). If you wanted to generate a list of 50 random digits similar to the random digit table described above, you could enter randInt(0,9) and press ENTER 50 times. A more efficient way would be to enter randInt(0,9,50). If you wanted these 50 random integers stored in a list (say L1), you would enter randInt(0,9,50) → L1 (remembering that the → is obtained by pressing STO).

Digression: Although the calculator is an electronic device, it is just like a random digit table in that, if two different people enter the list in the same place, they will get the same sequence of numbers. You “enter” the list on the calculator by “seeding” the calculator as follows: (Some number) → rand (you get to the rand function by entering MATH PRB rand). If different people using the same model of calculator entered, say, 18 → rand, then MATH PRB rand, and began to press ENTER repeatedly, they would all generate exactly the same list of random digits.
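The same seeding behavior can be seen in software. The sketch below uses Python's standard random module as a stand-in for the calculator; it is an analogy only, and the digits Python produces from seed 18 will not match the ones a TI-83/84 produces.

```python
import random

# Two generators started from the same seed, like two calculators after 18 → rand.
first_person = random.Random(18)
second_person = random.Random(18)

# Python counterpart of randInt(0,9,50): fifty random digits from 0 through 9.
digits_a = [first_person.randint(0, 9) for _ in range(50)]
digits_b = [second_person.randint(0, 9) for _ in range(50)]

print(digits_a[:10])
print("same sequence:", digits_a == digits_b)
```

Reproducibility of this kind is useful when a randomization needs to be checked or repeated later, which is why the other sketches in this chapter fix a seed.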

We will use tables of random digits and/or the calculator in Chapter 9 when we discuss simulation.

Randomized Block Design

Earlier we discussed the need for control in a study and identified randomization as the main method to control for lurking variables, that is, variables that might influence the outcomes in some way but are not considered in the design of the study (usually because we aren't aware of them). Another type of control involves variables we think might influence the outcome of a study. Suppose we suspect, as in our previous example, that the performance test varies by gender as well as by dosage level of the test drug. That is, we suspect that gender is a confounding variable (its effects cannot be separated from the effects of the drug). To control for the effects of gender on the performance test, we utilize what is known as a block design. A block design involves doing a completely randomized experiment within each block. In this case, that means that each level of the drug would be tested within the group of females and within the group of males. To simplify the example, suppose that we were only testing one level (say 500 mg) of the drug versus a placebo. The experimental design, blocked by gender, could then be diagrammed as follows.

Randomization and block designs each serve a purpose. It's been said that you block to control for the variables you know about and randomize to control for the ones you don't. Note that your interest here is in studying the effect of the treatment within the population of males and within the population of females, not to compare the effects on men and women. Thus, there would be no additional comparison between the blocks; that's a different study.
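A block design of the kind just described can be sketched as two steps: sort the subjects into blocks, then run a separate random allocation inside each block. In the sketch below the 160/140 split between females and males is made up, since the text gives no split, and the function name and subject labels are hypothetical.

```python
import random

def randomized_block_design(subjects, block_of, treatments, rng):
    """Group subjects into blocks, then randomize treatments within each block."""
    blocks = {}
    for subject in subjects:
        blocks.setdefault(block_of(subject), []).append(subject)

    design = {}
    for block, members in blocks.items():
        shuffled = members[:]
        rng.shuffle(shuffled)
        # Deal the shuffled members out to the treatments in turn.
        design[block] = {treatment: shuffled[i::len(treatments)]
                         for i, treatment in enumerate(treatments)}
    return design

rng = random.Random(184)   # fixed seed only so the sketch is repeatable
# Hypothetical volunteers as (label, gender) pairs; the 160/140 split is made up.
volunteers = ([(f"F{i}", "female") for i in range(160)]
              + [(f"M{i}", "male") for i in range(140)])

design = randomized_block_design(volunteers, lambda v: v[1],
                                 ["500 mg", "placebo"], rng)
for block, arms in design.items():
    print(block, {treatment: len(members) for treatment, members in arms.items()})
```

Gender is handled by the blocking step and never by chance, while everything else about the subjects is left to the shuffle inside each block.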

Matched Pairs Design

A particular block design of interest is the matched pairs design. One possible matched pairs design involves before and after measurements on the same subjects. In this case, each subject becomes a block on which the experiment is conducted. Another type of matched pairs involves pairing the subjects in some way (matching on, say, height, race, age, etc.).

example: A study is instituted to determine the effectiveness of training teachers to teach AP Statistics. A pretest is administered to each of 23 prospective teachers who subsequently undergo a training program. When the program is finished, the teachers are given a posttest. A score for each teacher is arrived at by subtracting their pretest score from their posttest score. This is a matched pairs design because two scores are paired for each teacher.

example: One of the questions on the 1997 AP Exam in Statistics asked students to design a study to compare the effects of differing formulations of fish food on fish growth. Students were given a room with eight fish tanks. The room had a heater at the back of the room, a door at the front center of the room, and windows at the front sides of the room. The most correct design involved blocking so that the two tanks nearest the heater in the back of the room were in a single block, the two away from the heater in a second block, the two in the front near the door in a third, and the two in the front near the windows in a fourth. This matching had the effect of controlling for known environmental variations in the room caused by the heater, the door, and the windows. Within each block, one tank was randomly assigned to receive one type of fish food and the other tank received the other. The blocking controlled for the known effects of the environment in which the experiment was conducted. The randomization controlled for unknown influences that might be present in the various tank locations.

You will need to recognize paired data, as distinct from two independent sets of data, later on when we study inference. Even though two sets of data are generated in a matched-pairs study, it is the differences between the matched values that form the one-sample data used for statistical analysis.
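The reduction from two sets of scores to one set of differences can be sketched as follows. The scores are invented purely for illustration and are not data from the teacher training example; only the subtraction (posttest minus pretest) comes from the text.

```python
from statistics import mean

# Hypothetical (pretest, posttest) scores for five teachers; not real study data.
paired_scores = [(52, 61), (47, 55), (60, 58), (39, 50), (55, 63)]

# Each matched pair collapses to a single number: posttest minus pretest.
differences = [post - pre for pre, post in paired_scores]
mean_difference = mean(differences)

print(differences)
print(mean_difference)
```

The analysis then proceeds on the single list of differences, which is why paired data are treated as one sample rather than as two independent samples.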

Exam Tip: You need to be clear on the distinction between the purposes for blocking and randomizing. If you are asked to describe an experiment involving blocking, be sure to remember to randomize treatments within blocks.

Rapid Review

You are doing a research project on attitudes toward fast food and decide to use as your sample the first 25 people to enter the door at the local FatBurgers restaurant. Which of the following is (are) true of this sample?
(a) It is a systematic sample.
(b) It is a convenience sample.
(c) It is a random sample.
(d) It is a simple random sample.
(e) It is a self-selected sample.

Answer: Only (b) is correct. (a), (c), and (d) are all probability samples, which rely on some random process to select the sample, and there is nothing random about the selection process in this situation. (e) is incorrect because, although the sample members voluntarily entered FatBurgers, they haven't volunteered to respond to a survey.

How does an experiment differ from an observational study?

Answer: In an experiment, the researcher imposes some treatment on the subjects (or experimental units) in order to observe a response. In an observational study, the researcher simply observes, compares, and measures, but does not impose a treatment.

What are the three key components of an experiment? Explain each.

Answer: The three components are randomization, control, and replication. You randomize to be sure that the response does not systematically favor one outcome over another. The idea is to equalize groups as much as possible so that differences in response are attributable to the treatment variable alone. Control is designed to hold confounding variables constant (such as the placebo effect). Replication ensures that the experiment is conducted on sufficient numbers of subjects to minimize the effects of chance variation.

Your local pro football team has just suffered a humiliating defeat at the hands of its archrival. A local radio sports talk show conducts a call-in poll on whether or not the coach should be fired. What is the poll likely to find?

Answer: The poll is likely to find that, overwhelmingly, respondents think the coach should be fired. This is a voluntary response poll, and we know that such a poll is most likely to draw a response from those who feel most strongly about the issue being polled. Fans who bother to vote in a call-in poll such as this are most likely upset at their team's loss and are looking for someone to blame, and the poll gives them the opportunity. There is, of course, a chance that the coach may be very popular and draw support, but the point to remember is that this is a self-selecting nonrandom sample, and will probably exhibit voluntary response bias.

It is known that exercise and diet both influence weight loss. Your task is to conduct a study of the effects of diet on weight loss. Explain the concept of blocking as it relates to this study.

Answer: If you did a completely randomized design for this study using diet as the treatment variable, it's very possible that your results would be confounded by the effects of exercise. Because you are aware of this, you would like to control for the effects of exercise. Hence, you block by exercise level. You might define, say, three blocks by level of exercise (very active, active, not very active) and do a completely randomized study within each of the blocks. Because exercise level is held constant, you can be confident that differences between treatment and control groups within each block are attributable to diet, not exercise.

Explain the concept of a double-blind study and why it is important.

Answer: A study is double-blind if neither the subjects of the study nor the researchers are aware of who is in the treatment group and who is in the control group. This controls for the well-known tendency of people to respond, often subconsciously, in the way they think they should.

You are interested in studying the effects of preparation programs on SAT performance. Briefly describe a matched-pairs design and a completely randomized design for this study.

Answer: Matched pairs: Choose, say, 100 students who have not participated in an SAT prep course. Have them take the SAT. Then have these students take a preparation course and retake the SAT. Do a statistical analysis of the difference between the pre- and post-preparation scores for each student. (Note that this design doesn't deal with the influence of retaking the SAT independent of any preparation course, which could be a confounding variable.)

Completely randomized design: Select 100 students and randomly assign them to two groups, one of which takes the SAT with no preparation course and one of which has a preparation course before taking the SAT. Statistically, compare the average performance of each group.

Practice Problems

Multiple-Choice

Data were collected in 20 cities on the percentage of women in the workforce. Data were collected in 1990 and again in 1994. Gains, or losses, in this percentage were the measurement upon which the study's conclusions were to be based. What kind of design was this?
I. A matched-pairs design
II. An observational study
III. An experiment using a block design

(a) I only

(b) II only

(d) I and III only

(e) I and II only

You want to do a survey of members of the senior class at your school and want to select a simple random sample. You intend to include 40 students in your sample. Which of the following approaches will generate a simple random sample?

(a) Write the name of each student in the senior class on a slip of paper and put the papers in a container. Then randomly select 40 slips of paper from the container.

(b) Assuming that students are randomly assigned to classes, select two classes at random and include those students in your sample.

(c) From a list of all seniors, select one of the first 10 names at random. Then select every nth name on the list until you have 40 people selected.

(d) Select the first 40 seniors to pass through the cafeteria door at lunch.

(e) Randomly select 10 students from each of the four senior calculus classes.

Which of the following is (are) important in designing an experiment?
I. Control of all variables that might have an influence on the response variable
II. Randomization of subjects to treatment groups
III. Use of a large number of subjects to control for small-sample variability

(a) I only

(b) I and II only

(d) I, II, and III

(e) II only

Your company has developed a new treatment for acne. You think men and women might react differently to the medication, so you separate them into two groups. Then the men are randomly assigned to two groups and the women are randomly assigned to two groups. One of the two groups is given the medication and the other is given a placebo. The basic design of this study is

(a) completely randomized

(b) blocked by gender

(d) randomized, blocked by gender and type of medication

(e) a matched pairs design

A double-blind design is important in an experiment because

(a) There is a natural tendency for subjects in an experiment to want to please the researcher.

(b) It helps control for the placebo effect.

(c) Evaluators of the responses in a study can influence the outcomes if they know which subjects are in the treatment group and which are in the control group.

(d) Subjects in a study might react differently if they knew they were receiving an active treatment or a placebo.

(e) All of the above are reasons why an experiment should be double-blind.

Which of the following is not an example of a probability sample?

(a) You are going to sample 10% of a group of students. You randomly select one of the first 10 students on an alphabetical list and then select every 10th student after that on the list.

(b) You are a sports-talk radio host interested in opinions about whether or not Pete Rose should be elected to the Baseball Hall of Fame, even though he has admitted to betting on his own teams. You ask listeners to call in and vote.

(c) A random sample of drivers is selected to receive a questionnaire about the manners of Department of Motor Vehicle employees.

(d) In order to determine attitudes about the Medicare Drug Plan, a random sample is drawn so that each age group (65–70, 70–75, 75–80, 80–85) is represented in proportion to its percentage in the population.

(e) In choosing respondents for a survey about a proposed recycling program in a large city, interviewers choose homes to survey based on rolling a die. If the die shows a 1, the house is selected. If the die shows a 2–6, the interviewer moves to the next house.

Which of the following is true of an experiment but not of an observational study?

(a) A cause-and-effect relationship can be more easily inferred.

(b) The cost of conducting it is excessive.

(d) By law, the subjects need to be informed that they are part of a study.

(e) Possible confounding variables are more difficult to control.

A study showed that persons who ate two carrots a day had significantly better eyesight than those who ate less than one carrot a week. Which of the following statements is (are) correct?
I. This study provides evidence that eating carrots contributes to better eyesight.
II. The general health consciousness of people who eat carrots could be a confounding variable.
III. This is an observational study and not an experiment.

(a) I only

(b) III only

(d) II and III only

(e) I, II, and III

Which of the following situations is a cluster sample?

(a) Survey five friends concerning their opinions of the local hockey team.

(b) Take a random sample of five voting precincts in a large metropolitan area and do an exit poll at each voting site.

(d) From a list of all students in your school, randomly select 20 to answer a survey about Internet use.

(e) Identify four different ethnic groups at your school. From each group, choose enough respondents so that the final sample contains roughly the same proportion of each group as the school population.

Free-Response

You are interested in the extent to which ingesting vitamin C inhibits getting a cold. You identify 300 volunteers, 150 of whom have been taking more than 1000 mg of vitamin C a day for the past month, and 150 of whom have not taken vitamin C at all during the past month. You record the number of colds during the following month for each group and find that the vitamin C group had significantly fewer colds. Is this an experiment or an observational study? Explain. What do we mean in this case when we say that the finding was significant?
Design an experiment that employs a completely randomized design to study the question of whether or not taking large doses of vitamin C is effective in reducing the number of colds.
A survey of physicians found that some doctors gave a placebo rather than an actual medication to patients who experienced pain symptoms for which no physical reason could be found. If the pain symptoms were reduced, the doctors concluded that there was no real physical basis for the complaints. Do the doctors understand the placebo effect ? Explain.
Explain how you would use a table of random digits to help obtain a systematic sample of 10% of the names on an alphabetical list of voters in a community. Is this a random sample? Is it a simple random sample?
The Literary Digest Magazine , in 1936, predicted that Alf Landon would defeat Franklin Roosevelt in the presidential election that year. The prediction was based on questionnaires mailed to 10 million of its subscribers and to names drawn from other public lists. Those receiving the questionnaires were encouraged to mail back their ballot preference. The prediction was off by 19 percentage points. The magazine received back some 2.3 million ballots from the 10 million sent out. What are some of the things that might have caused the magazine to be so wrong (the same techniques had produced accurate predictions for several previous elections)? (Hint: Think about what was going on in the world in 1936.)
Interviewers, after the 9/11 attacks, asked a group of Arab Americans if they trust the administration to make efforts to counter anti-Arab activities. If the interviewer was of Arab descent, 42% responded “yes,” and if the interviewer was of non-Arab descent, 55% responded “yes.” What seems to be going on here?
There are three classes of statistics at your school, each with 30 students. You want to select a simple random sample of 15 students from the 90 students as part of an opinion-gathering project for your social studies class. Describe a procedure for doing this.
Question #1 stated, in part: “You are interested in the extent to which ingesting vitamin C inhibits getting a cold. You identify 300 volunteers, 150 of whom have been taking more than 1000 mg of vitamin C a day for the past month, and 150 of whom have not taken vitamin C at all during the past month. You record the number of colds during the following month for each group and find that the vitamin C group had significantly fewer colds.” Explain the concept of confounding in the context of this problem and give an example of how it might have affected the finding that the vitamin C group had fewer colds.
A shopping mall wants to know about the attitudes of all shoppers who visit the mall. On a Wednesday morning, the mall places 10 interviewers at a variety of places in the mall and asks questions of shoppers as they pass by. Comment on any bias that might be inherent in this approach.
Question #2 asked you to design a completely randomized experiment for the situation presented in question #1. That is, to design an experiment that uses treatment and control groups to see if the groups differed in terms of the number of colds suffered by users of 1000 mg a day of vitamin C and those that didn”t use vitamin C. Question #8 asked you about possible confounding variables in this study. Given that you believe that both general health habits and use of vitamin C might explain a reduced number of colds, design an experiment to determine the effectiveness of vitamin C taking into account general health habits. You may assume your volunteers vary in their history of vitamin C use.
You have developed a weight-loss treatment that involves a combination of exercise and diet pills. The treatment has been effective with subjects who have used a regular dose of the pill of 200 mg, when exercise level is held constant. There is some indication that higher doses of the pill will promote even better results, but you are worried about side effects if the dosage becomes too great. Assume you have 400 overweight volunteers for your study, who have all been on the same exercise program, but who have not been taking any kind of diet pill. Design a study to evaluate the relative effects of a 200 mg, 400 mg, 600 mg, and 800 mg daily dosage of the pill.
You are going to study the effectiveness of three different SAT preparation courses. You obtain 60 high school juniors as volunteers to participate in your study. You want to assign each of the 60 students, at random, to one of the three programs. Describe a procedure for assigning students to the programs if

(a) you want there to be an equal number of students taking each course.

(b) you want each student to be assigned independently to a group. That is, each student should have the same probability of being in any of the three groups.

A researcher wants to obtain a sample of 100 teachers who teach in high schools at various economic levels. The researcher has access to a list of teachers in several schools for each of the levels. She has identified four such economic levels (A, B, C, and D) that comprise 10%, 15%, 45%, and 30% of the schools in which the teachers work. Describe what is meant by a stratified random sample in this situation, and discuss how she might obtain it.
You are testing for sweetness in five varieties of strawberry. You have 10 plots available for testing. The 10 plots are arranged in two side-by-side groups of five. A river runs along the edge of one of the groups of five plots something like the diagram shown below (the available plots are numbered 1–10).

You decide to control for the possible confounding effect of the river by planting one of each type of strawberry in plots 1–5 and one of each type in plots 6–10 (that is, you block to control for the river). Then, within each block, you randomly assign one type of strawberry to each of the five plots within the block. What is the purpose of randomization in this situation?

Look at problem #14 again. It is the following year, and you now have only two types of strawberries to test. Faced with the same physical conditions you had in problem 14, and given that you are concerned that differing soil conditions (as well as proximity to the river) might affect sweetness, how might you block the experiment to produce the most reliable results?
A group of volunteers, who had never been in any kind of therapy, were randomly separated into two groups, one of which received an experimental therapy to improve self-concept. The other group, the control group, received traditional therapy. The subjects were not informed of which therapy they were receiving. Psychologists who specialize in self-concept issues evaluated both groups after training for self-concept, and the self-concept scores for the two groups were compared. Could this experiment have been double-blind? Explain. If it wasn't double-blind, what might have been the impact on the results?
You want to determine how students in your school feel about a new dress code for school dances. One group in the student council, call them group A, wants to word the question as follows: “As one way to help improve student behavior at school sponsored events, do you feel that there should be a dress code for school dances?” Another group, group B, prefers, “Should the school administration be allowed to restrict student rights by imposing a dress code for school dances?” Which group do you think favors a dress code, and which opposes it? Explain.
A study of reactions to different types of billboard advertising is to be carried out. Two different types of ads (call them Type I and Type II) for each product will be featured on numerous billboards. The organizer of the campaign is concerned that communities representing different economic strata will react differently to the ads. The three communities where billboards will be placed have been identified as Upper Middle, Middle, and Lower Middle. Four billboards are available in each of the three communities. Design a study to compare the effectiveness of the two types of advertising taking into account the communities involved.
In 1976, Shere Hite published a book entitled The Hite Report on Female Sexuality. The conclusions reported in the book were based on 3000 returned surveys from some 100,000 sent out to, and distributed by, various women's groups. The results were that women were highly critical of men. In what way might the author's findings have been biased?
You have 26 women available for a study: Annie, Betty, Clara, Darlene, Edie, Fay, Grace, Helen, Ina, Jane, Koko, Laura, Mary, Nancy, Ophelia, Patty, Quincy, Robin, Suzy, Tina, Ulla, Vivien, Wanda, Xena, Yolanda, and Zoe. The women need to be divided into four groups for the purpose of the study. Explain how you could use a table of random digits to make the needed assignments.

Cumulative Review Problems

The five-number summary for a set of data is [52, 55, 60, 63, 85]. Is the mean most likely to be less than or greater than the median?
Pamela selects a random sample of 15 of her classmates and computes the mean and standard deviation of their pulse rates. She then uses these values to predict the mean and standard deviation of the pulse rates for the entire school. Which of these measures are parameters and which are statistics?
Consider the following set of values for a dataset: 15, 18, 23, 25, 25, 27, 28, 29, 35, 46, 55. Does this dataset have any outliers if we use an outlier rule that

(a) is based on the median?

(b) is based on the mean?

For the dataset of problem #3 above, what is z₅₅?
A study examining factors that contribute to a strong college GPA finds that 62% of the variation in college GPA can be explained by SAT score. What name is given to this statistic, and what is the correlation (r) between SAT score and college GPA?

Solutions to Practice Problems

Multiple-Choice

The correct answer is (e). The data are paired because there are two measurements for each city, so the data are not independent. There is no treatment being applied, so this is an observational study. Matched pairs is one type of block design, but this is NOT an experiment, so III is false.
The answer is (a). In order for this to be an SRS, all samples of size 40 must be equally likely. None of the other choices does this [and choice (d) isn't even random]. Note that (a), (b), and (c) are probability samples.
The correct answer is (d). These three items represent the three essential parts of an experiment: control, randomization, and replication.
The correct answer is (b). You block men and women into different groups because you are concerned that differential reactions to the medication may confound the results. It is not completely randomized because it is blocked.
The correct answer is (e).
The correct answer is (b). This is an example of a voluntary response and is likely to be biased. Those who feel strongly about the issue are most likely to respond. The other choices all rely on some probability technique to draw a sample. In addition, responses (c) and (e) meet the criteria for a simple random sample (SRS).
The correct answer is (a). If done properly, an experiment permits you to control the variables that might influence the results. Accordingly, you can argue that the only variable that influences the results is the treatment variable.
The correct answer is (d). I isn't true because this is an observational study and, thus, shows a relationship but not necessarily a cause-and-effect one.
The correct answer is (b). (a) is a convenience sample. (c) is a systematic sample. (d) is a simple random sample. (e) is a stratified random sample.

Free-Response

It's an observational study because the researcher didn't provide a treatment, but simply observed different outcomes from two groups with at least one different characteristic. Participants selected themselves into either the vitamin C group or the non-vitamin C group. To say that the finding was significant in this case means that the difference between the number of colds in the vitamin C group and in the non-vitamin C group was too great to attribute to chance; it appears that something besides random variation may have accounted for the difference.
Identify 300 volunteers for the study, preferably none of whom have been taking vitamin C. Randomly split the group into two groups of 150 participants each. One group can be randomly selected to receive a set dosage of vitamin C each day for a month and the other group to receive a placebo. Neither the subjects nor those who administer the medication will know which subjects received the vitamin C and which received the placebo (that is, the study should be double-blind ). During the month following the giving of pills, you can count the number of colds within each group. Your measurement of interest is the difference in the number of colds between the two groups. Also, placebo effects often diminish over time.
The doctors probably did not understand the placebo effect. We know that, sometimes, a real effect can occur even from a placebo. If people believe they are receiving a real treatment, they will often show a change. But without a control group, we have no way of knowing if the improvement would not have been even more significant with a real treatment. The difference between the placebo score and the treatment score is what is important, not one or the other.
If you want 10% of the names on the list, you need every 10th name for your sample. Number the first ten names on the list 0, 1, 2, …, 9. Pick a random place to enter the table of random digits and note the first number. The first person in your sample is the person among the first 10 on the list that corresponds to the number chosen. Then pick every 10th name on the list after that name. This is a random sample to the extent that, before the first name was selected, every member of the population had an equal chance to be chosen. It is not a simple random sample because not all possible samples of 10% of the population are equally likely. Adjacent names on the list, for example, could not both be part of the sample.
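The procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the voter list is made up, `systematic_sample` is a hypothetical helper name, and a software random number generator stands in for the table of random digits.

```python
import random

def systematic_sample(names, step=10, rng=random):
    """Pick a random start among the first `step` names, then take every `step`-th name."""
    start = rng.randrange(step)   # 0..step-1, like reading one random digit 0-9
    return names[start::step]

# Hypothetical alphabetical list of 200 placeholder voter names.
voters = [f"voter_{i:03d}" for i in range(200)]

# The seed is arbitrary; it only makes the run repeatable.
sample = systematic_sample(voters, step=10, rng=random.Random(1))
print(len(sample), sample[:3])
```

Whatever the starting point, the chosen positions are always 10 apart, which is why two adjacent names can never both appear and the result is not an SRS.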
This is an instance of voluntary response bias . This poll was taken during the depths of the Depression, and people felt strongly about national leadership. Those who wanted a change were more likely to respond than those who were more or less satisfied with the current administration. Also, at the height of the Depression, people who subscribed to magazines and were on public lists were more likely to be wealthy and, hence, Republican (Landon was a Republican and Roosevelt was a Democrat).
Almost certainly, respondents are responding in a way they feel will please the interviewer. This is a form of response bias—in this circumstance, people may not give a truthful answer.
Many different solutions are possible. One way would be to put the names of all 90 students on slips of paper and put the slips of paper into a box. Then draw out 15 slips of paper at random. The names on the paper are your sample. Another way would be to identify each student by a two-digit number 01, 02, …, 90 and use a table of random digits to select 15 numbers. Or you could use the randInt function on your calculator to select 15 numbers between 1 and 90 inclusive. What you cannot do, if you want it to be an SRS, is to employ a procedure that selects five students randomly from each of the three classes.
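If a computer is available instead of a calculator, the same idea might look like the following Python sketch. The student numbering 1 to 90 and the assumption that students 1–30, 31–60, and 61–90 form the three classes are illustrative choices, not part of the problem.

```python
import random

rng = random.Random(2017)          # arbitrary seed, only for repeatability
student_ids = list(range(1, 91))   # students labelled 1..90 (01..90 in a digit table)

# random.sample draws without replacement, so every set of 15 students is equally likely.
srs = rng.sample(student_ids, 15)

# NOT an SRS: taking five from each class can never produce, say, 15 students from one class.
classes = [student_ids[0:30], student_ids[30:60], student_ids[60:90]]
five_per_class = [s for group in classes for s in rng.sample(group, 5)]

print(sorted(srs))
print(sorted(five_per_class))
```

The second procedure is a stratified sample: each student still has the same chance of selection, but not every group of 15 is possible.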
Because the two groups were not selected randomly, it is possible that the smaller number of colds in the vitamin C group could be the result of some variable whose effects cannot be separated from the effects of the vitamin C. That would make this other variable a confounding variable. A possible confounding variable in this case might be that the group who take vitamin C might be, as a group, more health conscious than those who do not take vitamin C. This could account for the difference in the number of colds but could not be separated from the effects of taking vitamin C.
The study suffers from undercoverage of the population of interest, which was declared to be all shoppers at the mall. By restricting their interview time to a Wednesday morning, they effectively exclude most people who work on Wednesday. They essentially have a sample of the opinions of nonworking shoppers. There may be other problems with randomness, but without more specific information about how they gathered their sample, talking about it would only be speculation.
We could first administer a questionnaire to all 300 volunteers to determine differing levels of health consciousness. For simplicity, let's just say that the two groups identified are "health conscious" and "not health conscious." Then you would block by "health conscious" and "not health conscious" and run the experiment within each block. A diagram of this experiment (not reproduced here) would show each block randomly split into a vitamin C group and a placebo group, with the number of colds compared within each block.
Because exercise level seems to be more or less constant among the volunteers, there is no need to block for its effect. Furthermore, because the effects of a 200 mg dosage are known, there is no need to have a placebo (although you could)—the 200 mg dosage will serve as the control. Randomly divide your 400 volunteers into four groups of 100 each. Randomly assign each group to one of the four treatment levels: 200 mg, 400 mg, 600 mg, or 800 mg. The study can be and should be double-blind. After a period of time, compare the weight loss results for the four groups.
(a) Many answers are possible. One solution involves putting the names of all 60 students on slips of paper, then randomly selecting the papers. The first student goes into program 1, the next into program 2, etc. until all 60 students have been assigned.

(b) Use a random number generator to select integers from 1 to 3 (like randInt(1,3) on the TI-83/84), or use a table of random numbers, assigning each of the programs a range of values (such as 1–3, 4–6, 7–9, and ignore 0). Pick any student and generate a random number from 1 to 3. The student enters the program that corresponds to the number. In this way, the probability of a student ending up in any one group is 1/3, and the selections are independent. It would be unlikely to have the three groups come out completely even in terms of the numbers in each, but we would expect it to be close.
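The contrast between (a) and (b) may be easier to see side by side. The following Python sketch is one possible way to carry out both procedures; the student names are placeholders, and `rng.randint(1, 3)` plays the role of the calculator's randInt(1,3).

```python
import random
from collections import Counter

rng = random.Random(0)   # arbitrary seed, only for repeatability
students = [f"student_{i:02d}" for i in range(1, 61)]   # placeholder names

# (a) Equal group sizes: shuffle the "slips of paper", then deal them out 1, 2, 3, 1, 2, 3, ...
shuffled = students[:]
rng.shuffle(shuffled)
equal_groups = {s: (i % 3) + 1 for i, s in enumerate(shuffled)}

# (b) Independent assignment: each student gets a separate draw from 1..3.
independent_groups = {s: rng.randint(1, 3) for s in students}

print(Counter(equal_groups.values()))         # always 20 students per program
print(Counter(independent_groups.values()))   # sizes vary from run to run
```

In (a) the assignments are not independent: once 20 students are in program 1, nobody else can join it. In (b) they are independent, at the cost of unequal group sizes.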

In this situation, a stratified random sample would be a sample in which the proportion of teachers from each of the four levels is the same as that of the population from which the sample was drawn. That is, in the sample of 100 teachers, 10 should be from level A, 15 from level B, 45 from level C, and 30 from level D. For level A, she could accomplish this by taking an SRS of 10 teachers from a list of all teachers who teach at that level. SRSs of 15, 45, and 30 would then be obtained from each of the other lists.
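As a rough illustration, the proportional allocation and the within-level SRSs could be carried out as in the Python sketch below. The teacher lists and their sizes (200 per level) are invented for the example; only the percentages and the sample size come from the problem.

```python
import random

rng = random.Random(42)   # arbitrary seed, only for repeatability
percent_by_level = {"A": 10, "B": 15, "C": 45, "D": 30}   # from the problem statement
sample_size = 100

# Hypothetical teacher lists, one per economic level; the list sizes are made up.
teacher_lists = {
    level: [f"{level}_teacher_{i}" for i in range(200)]
    for level in percent_by_level
}

stratified = {}
for level, pct in percent_by_level.items():
    n = sample_size * pct // 100                               # 10, 15, 45, 30
    stratified[level] = rng.sample(teacher_lists[level], n)    # SRS within the stratum

print({level: len(chosen) for level, chosen in stratified.items()})
```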
Remember that you block to control for the variables that might affect the outcome that you know about, and you randomize to control for the effect of those you don't know about. In this case, then, you randomize to control for any unknown systematic differences between the plots that might influence sweetness. An example might be that the plots on the northern end of the rows (plots 1 and 6) have naturally richer soil than those plots on the south side.
The idea is to get plots that are most similar in order to run the experiment. One possibility would be to match the plots the following way: close to the river north (6 and 7); close to the river south (9 and 10); away from the river north (1 and 2); and away from the river south (4 and 5). This pairing controls for both the effects of the river and possible north–south differences that might affect sweetness. Within each pair, you would randomly select one plot to plant one variety of strawberry, planting the other variety in the other plot.

This arrangement leaves plots 3 and 8 unassigned. One possibility is simply to leave them empty. Another possibility is to randomly assign each of them to one of the pairs they adjoin. That is, plot 3 could be randomly assigned to join either plot 2 or plot 4. Similarly, plot 8 would join either plot 7 or plot 9.

The study could have been double-blind. The question indicates that the subjects did not know which treatment they were receiving. If the psychologists did not know which therapy the subjects had received before being evaluated, then the basic requirement of a double-blind study was met: neither the subjects nor the evaluators who come in contact with them are aware of who is in the treatment and who is in the control group.

If the study wasn't double-blind, it would be because the psychologists were aware of which subjects had which therapy. In this case, the attitudes of the psychologists toward the different therapies might influence their evaluations, probably because they might read more improvement into a therapy of which they approve.

Group A favors a dress code, group B does not. Both groups are hoping to bias the response in favor of their position by the way they have worded the question.
You probably want to block by community since it is felt that economic status influences attitudes toward advertising. That is, you will have three blocks: Upper Middle, Middle, and Lower Middle. Within each, you have four billboards. Randomly select two of the billboards within each block to receive the Type I ads, and put the Type II ads on the other two. After a few weeks, compare the differences in reaction to each type of advertising within each block.
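The key step is that the randomization happens separately inside each block. A minimal Python sketch of that step follows; the billboard labels are hypothetical.

```python
import random

rng = random.Random(7)   # arbitrary seed, only for repeatability
blocks = ["Upper Middle", "Middle", "Lower Middle"]

assignment = {}
for community in blocks:
    billboards = [f"{community} billboard {k}" for k in range(1, 5)]   # 4 per community
    type_one = set(rng.sample(billboards, 2))   # randomize within the block
    for b in billboards:
        assignment[b] = "Type I" if b in type_one else "Type II"

for b, ad in assignment.items():
    print(b, "->", ad)
```

Every community ends up with two billboards of each ad type, so differences between communities cannot be mistaken for differences between the ads.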
With only 3000 of 100,000 surveys returned, voluntary response bias is most likely operating. That is, the 3000 women represented those who felt strongly enough (negatively) about men and were the most likely to respond. We have no way of knowing if the 3% who returned the survey were representative of the 100,000 who received it, but they most likely were not.
Assign each of the 26 women a two-digit number, say 01, 02, …, 26. Then enter the table at a random location and note two-digit numbers. Ignore numbers outside of the 01–26 range and any repeats. The first number chosen assigns the corresponding woman to the first group, the second to the second group, etc. until all 26 have been assigned. This method roughly equalizes the numbers in the groups (not quite because 4 doesn't go evenly into 26), but does not assign them independently.

If you wanted to assign the women independently, you would consider only the digits 1, 2, 3, or 4, which correspond to the four groups. As one of the women steps forward, one of the random digits is identified, and that woman goes into the group that corresponds to the chosen number. Proceed in this fashion until all 26 women are assigned a group. This procedure yields independent assignments to groups, but the groups most likely will be somewhat unequal in size. In fact, with only 26 women, group sizes might be quite unequal (a TI-83/84 simulation of this produced 4 1s, 11 2s, 4 3s, and 7 4s).

Solutions to Cumulative Review Problems

The dataset has an outlier at 85. Because the mean is not resistant to extreme values, it tends to be pulled in the direction of an outlier. Hence, we would expect the mean to be larger than the median.
Parameters are values that describe populations, and statistics are values that describe samples. Hence, the mean and standard deviation of the pulse rates of Pamela’s sample are statistics, and the predicted mean and standard deviation for the entire school are parameters.
Putting the numbers in the calculator and doing 1-Var Stats, we find that x̄ = 29.64, s = 11.78, Q1 = 23, Med = 27, and Q3 = 35.

(a) The interquartile range (IQR) = 35 – 23 = 12, 1.5(IQR) = 1.5(12) = 18. So the boundaries beyond which we find outliers are Q1 – 1.5(IQR) = 23 – 18 = 5 and Q3 + 1.5(IQR) = 35 + 18 = 53. Because 55 is beyond the boundary value of 53, it is an outlier, and it is the only outlier.

(b) The usual rule for outliers based on the mean is x̄ ± 3s. Here x̄ ± 3s = 29.64 ± 3(11.78) = (–5.7, 64.98). Using this rule there are no outliers, since there are no values less than –5.7 or greater than 64.98. Sometimes x̄ ± 2s is used to determine outliers. In this case, x̄ ± 2s = 29.64 ± 2(11.78) = (6.08, 53.2). Using this rule, 55 would be an outlier.
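As a check on the arithmetic in parts (a) and (b), the boundaries can be recomputed from the summary statistics alone. The sketch below is a minimal illustration; the helper names `iqr_fences`, `sd_bounds`, and `is_outside` are invented here and are not calculator or library functions.

```python
def iqr_fences(q1, q3, k=1.5):
    """Boundaries Q1 - k(IQR) and Q3 + k(IQR)."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def sd_bounds(mean, s, k):
    """Boundaries mean - k*s and mean + k*s."""
    return mean - k * s, mean + k * s

def is_outside(value, bounds):
    """True if value lies below the lower bound or above the upper bound."""
    low, high = bounds
    return value < low or value > high

# Summary statistics from the 1-Var Stats output in the solution.
mean, s, q1, q3 = 29.64, 11.78, 23, 35
value = 55

print(iqr_fences(q1, q3), is_outside(value, iqr_fences(q1, q3)))
for k in (3, 2):
    low, high = sd_bounds(mean, s, k)
    print(k, round(low, 2), round(high, 2), is_outside(value, (low, high)))
```

The same value, 55, is flagged by the 1.5(IQR) rule and by the ± 2s rule but not by the ± 3s rule, which is why the rule being used should always be stated.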

For the given data, x̄ = 29.64 and s = 11.78. Hence, for the value 55 identified in problem #3, z = (55 – 29.64)/11.78 ≈ 2.15.

Note that in doing problem #3, we could have computed this z-score and observed that because it is larger than 2, it represents an outlier by the x̄ ± 2s rule that is sometimes used.
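The z-score calculation can be sketched as follows, assuming the value in question is 55, the outlier found in problem #3. The function name `z_score` is illustrative only.

```python
def z_score(x, mean, s):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / s

# Assumed value: 55, the outlier identified in problem #3.
z = z_score(55, 29.64, 11.78)
print(round(z, 2))  # 2.15
print(abs(z) > 2)   # True: beyond two standard deviations from the mean
```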

The problem is referring to the coefficient of determination, the proportion of variation in one variable that can be explained by the regression of that variable on another.

CHAPTER 8

Design of a Study: Sampling, Surveys, and Experiments

Medical researchers are conducting a study on a new blood pressure medication. There are three treatment groups: one receives the new drug, one receives the current most widely used medication, and a control group receives a placebo. The purpose of the control group is

(A) to ensure that the test results are significant.

(B) to eliminate bias.

(D) to see whether either drug is better than doing nothing.

(E) to make sure the comparison of the new drug to the current drug is statistically valid.

Administrators of a large high school want to survey the student body. Rather than use a simple random sample of students, they want to do a stratified random sample, stratifying by either grade or gender. It would be more advantageous to stratify by grade than by gender if

(A) there is a larger difference of opinion between students of different genders than between students of different grades.

(B) there is a larger difference of opinion between students of different grades than between students of different genders.

(D) students of one gender are more likely to be dishonest when answering the question.

(E) a simple random sample of each grade is easier to collect than a simple random sample of each gender.

Which statement correctly describes a difference between simple random sampling and stratified random sampling?

(A) Simple random sampling is less biased than stratified random sampling.

(B) Stratified random sampling is less biased than simple random sampling.

(D) Stratified random sampling has less variability between different samples than simple random sampling if stratifying is done on a variable that is associated with the variable of interest.

(E) Stratified random sampling has less variability between different samples than simple random sampling if stratifying is done on a variable that is not associated with the variable of interest.

A researcher is conducting a study to see if a new drug will reduce blood pressure more than the current drug. One hundred patients recently diagnosed with high blood pressure have volunteered for the study: 42 of the volunteers are female, and 18 of the volunteers are cigarette smokers. The researcher does not think there is much difference in blood pressure between males and females, but she does think there is a great deal of difference in blood pressure between smokers and nonsmokers. If she uses a randomized block design, she should

(A) block on smoking habits because blood pressure is likely to be associated with smoking habits.

(B) block on smoking habits because the groups are very different in size.

(D) block on gender because the two gender groups are closer to the same size.

(E) block on both gender and smoking habits because the more variables used for blocking, the better.

A high school teacher conducts an action research project as a part of his master’s degree program. He wants to see if a particular teaching method improves student learning on a particular topic. He decides to try one method with his morning class and the other with his afternoon class. The two classes are taught in different rooms. Which of these explains why confounding has entered this experiment?

(A) If one class does outperform the other, the teacher can’t tell whether it is the teaching method or the time of day that caused the difference in performance.

(B) If one class does outperform the other, the teacher can’t tell whether it is the teaching method or the way the groups of students were scheduled that caused the difference in performance.

(C) If one class does outperform the other, the teacher can’t tell whether it is the teaching method or the different classrooms that caused the difference in performance.

(D) Choices A, B, and C all describe possible sources of confounding.

(E) None of these choices describes a possible source of confounding.

Answers