SET 2016 - Statistical Education of Teachers -- Middle Grades
Illustrative Example
The following example illustrates the complete statistical problem-solving process at the level expected of a middle-school teacher.

Formulate Questions
Statistical investigations undertaken in elementary school are typically based on questions posed by the teacher that can be addressed using data collected within the classroom. In middle school, the focus expands beyond the classroom, and students begin to formulate their own questions. Because many investigations will be motivated by students’ interests, middle-school teachers must be skilled at constructing and refining statistical questions that can be addressed with data.
For example, suppose a student is planning a project for the school’s statistics poster competition. The student recently read that consumption of bottled water is on the rise and wondered whether people actually prefer bottled water to tap or if they could even tell the difference between the two. When asked for advice about how to conduct a study, the teacher suggested having individuals drink two cups of water—one cup with tap water and one cup with bottled water. For each trial, the bottled water would be the same brand and the tap water would be from the same source. Not knowing which cup contained which type of water, each participant would identify the cup he/she believed to be the bottled water. Thus, a statistical question that could be investigated would be:
Are people more likely than not to correctly identify the cup with bottled water?

Collect Data
Teachers must think carefully about how to collect data to address the above statistical question and how to record the data on participants. As the statistical question requires data on the categorical variable “whether or not the individual correctly identified the cup with bottled water,” each participant should be asked to identify the cup he/she believes to be the bottled water, and, based on the response, the student would record a value of “Correct” (C) or “Incorrect” (I).

In this illustration, the student asked 20 classmates from her school to participate in the study. Each participant was presented with two identical cups, each containing 2 ounces of water. Each participant drank the water from the cup on the right first and then drank the water from the cup on the left. Unknown to the participants, the cup on the right contained tap water for half the participants, and the cup on the right contained bottled water for the other half. Each participant identified which cup of water he/she considered to be the bottled water. Following are the resulting data: C, I, I, C, I, I, C, I, C, C, I, C, I, C, C, I, C, C, C, C.

Analyze Data
Data on a single categorical variable are often summarized in a frequency table and bar graph indicating the number of responses in each category. The frequency table and bar graph for the above data are displayed below:
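Response     Frequency (Count)
Correct      12
Incorrect    8
Total        20
[Bar graph: frequency (count) of Correct and Incorrect responses]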
TinkerPlots: Design a Sampler with two options and run it until the results match the results of 12 correct and 8 incorrect. Then, create a bar graph.
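For readers without TinkerPlots, the same tally can be sketched in a few lines of Python. This is only an illustration, not part of the original example: the variable names are made up here, and the row of # marks is a rough text stand-in for the bar graph.

```python
from collections import Counter

# Responses recorded in the study: "C" = correct, "I" = incorrect.
responses = "C I I C I I C I C C I C I C C I C C C C".split()

counts = Counter(responses)  # frequency of each category
n = len(responses)           # number of participants

# Print one row per category: label, count, proportion, and a text "bar".
for label, name in (("C", "Correct"), ("I", "Incorrect")):
    share = counts[label] / n
    print(f"{name:<10}{counts[label]:>3}  {share:.2f}  {'#' * counts[label]}")
```

The proportion column is an extra that the frequency table above does not show; it gives the observed share of correct responses (12 out of 20).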


Interpret Results
Although more than half the participants in the study correctly identified bottled water, it is still possible that participants could not tell the difference and were simply guessing. If the participants were randomly guessing, the probability of a participant selecting bottled water would be 0.5, and we would expect about 10 of the 20 participants to correctly identify bottled water. However, this doesn’t guarantee that exactly 10 people will be correct, because there would be random variation in the number correct from one group of 20 participants to another.
This is similar to the idea of flipping a fair coin 20 times. Although we expect to get 10 heads, we are not surprised if we get 9 or 11 heads. That is, there is random variation in the number of heads we get if a fair coin is tossed 20 times. Thus, to decide whether people can tell the difference between tap water and bottled water, we must determine whether the observed statistic—“12 out of 20” correctly identifying bottled water—is a likely outcome when students are guessing and their selections are completely random. This is an important question, often asked as part of this component of the statistical problem-solving process: “Is the observed statistic a likely (or unlikely) outcome from random variation if everyone is simply guessing?”

The answer to this question is at the heart of statistical reasoning. If the observed statistic is a likely outcome, then random variation provides a plausible (believable) explanation for the observed value of the statistic and we conclude people may be guessing. If the observed statistic is an unlikely outcome, then this suggests the observed value of the statistic is due to something other than just random variation. In this case, the difference between the observed and expected values of the statistic is said to be statistically significant and we would conclude that people are not guessing. That is, people are more likely than not to correctly identify the cup with bottled water.
One way to address this question is to develop a model (a simulation model or a theoretical probability model) for exploring the long-run behavior of the statistic. For example, a simulation model for exploring the random variation in the statistic “the number who correctly select bottled water when participants are randomly guessing” would be to toss a fair coin 20 times. A coin toss that results in a “head” corresponds to correctly identifying the bottled water. For each trial (20 tosses of the coin), record the number of heads. The dotplot below summarizes the results for the “number of heads” from 100 trials of tossing a coin 20 times. [I created a model with 1,000 sets of 20 coin flips.]

TinkerPlots: Run the fair Sampler 1000 times with the History tool. Turn off the animation for quicker results.
Then, graph the results in a dot plot. Add a divider with three sections to isolate the range of Reasonable Guesses.


Based on the dotplot, 12 or more heads occurred in 19 of the 100 trials. So, if the coin is fair, the probability of getting 12 or more heads is estimated to be 0.19 based on this simulation. Thus, if participants cannot tell the difference and are randomly guessing, 12 out of 20 people correctly identifying bottled water would not be a surprising outcome.
Applets for performing a simulation such as this are widely available. Using an applet, 10,000 repetitions of tossing a fair coin 20 times yielded a dotplot similar to the one shown above for the number of heads from each repetition. In the simulation, only about 4% of the 10,000 repetitions resulted in a number of heads that differed from the expected value by more than 4. So, when flipping a fair coin 20 times, the probability of getting between 6 and 14 heads (inclusive) is estimated to be 0.96. Thus, we can be fairly confident that the statistic (observed number of heads) will be within 4 of the expected number of heads (10). This value (±4), called the margin of error, tells us how much the statistic is likely to differ from the parameter due to random variation.

As with the previous simulation, obtaining 12 heads (12 participants correctly identifying bottled water) is not a surprising outcome. Thus, because 12 out of 20 appears to be a likely outcome when the selection is random, the evidence against guessing is not very strong. Therefore, it is plausible that participants could not tell the difference between bottled water and tap water and were guessing which cup contained the bottled water.
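The coin-toss simulations described above can also be sketched in Python, as an alternative to TinkerPlots or an applet. The function name, the fixed seed, and the choice of 10,000 repetitions are assumptions made for this illustration. Because the estimates depend on the random draws, they will not match the 0.19 and 0.96 reported above exactly.

```python
import random

def simulate_heads(reps, tosses=20, seed=2016):
    """Return the number of heads in each of `reps` sets of fair coin tosses."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    # Each toss is a head with probability 0.5, i.e. a "correct" random guess.
    return [sum(rng.random() < 0.5 for _ in range(tosses)) for _ in range(reps)]

heads = simulate_heads(10_000)

# Share of repetitions at least as extreme as the observed 12 correct.
p_at_least_12 = sum(h >= 12 for h in heads) / len(heads)

# Share of repetitions within 4 of the expected 10 heads (6 to 14 inclusive).
p_within_4 = sum(6 <= h <= 14 for h in heads) / len(heads)

print(f"Estimated P(12 or more heads): {p_at_least_12:.3f}")
print(f"Estimated P(6 to 14 heads):    {p_within_4:.3f}")
```

Calling simulate_heads(100) with different seeds shows how much an estimate based on only 100 trials can vary from one run to the next.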
Note that the statistical preparation of middle-school teachers may include a more structured approach to solving this problem. This approach would consist of translating the statistical question into statements of the null and alternative hypotheses, estimating the p-value from the simulation, and using the p-value to describe the strength of the evidence against the hypothesis that students are guessing. Also, this example could easily be expanded into one appropriate for preparing a high-school teacher. This expansion would include using the binomial probability distribution as a mathematical model for describing the random variation in the number of heads out of 20 tosses and determining the exact p-value associated with the observed statistic.
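As a brief sketch of the high-school extension, the exact p-value can be computed from the binomial distribution with 20 trials and success probability 0.5. The one-sided "12 or more" form is assumed here because it matches the statistical question; the variable names are illustrative only.

```python
from math import comb

n = 20          # number of participants (coin tosses)
observed = 12   # number who correctly identified the bottled water

# Under guessing, every one of the 2**n sequences of outcomes is equally likely,
# so P(X >= 12) is the number of sequences with 12 or more successes over 2**n.
p_value = sum(comb(n, k) for k in range(observed, n + 1)) / 2**n

print(f"Exact P(X >= {observed}) = {p_value:.4f}")  # prints 0.2517
```

An exact p-value of about 0.25 leads to the same conclusion as the simulations: 12 correct out of 20 is not a surprising result if participants are guessing.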