Seeds

Determine if the fertilizer used on about half of the 50 pumpkins makes a statistically significant difference in the mean number of seeds produced per pumpkin. This activity models the concept behind the t-test for comparing means.
**Pumpkin Seeds**



Graph the number of Seeds split by Fertilizer (parallel dot plot) and show the means.
**Step One**

Ho: The mean number of seeds produced by the fertilized pumpkins is the same as that of the non-fertilized pumpkins.
Ha: The mean number of seeds produced by the fertilized pumpkins is greater than that of the non-fertilized pumpkins.
Our claim is the alternative hypothesis.
**Hypotheses**

Snap the ruler tool to each of the means. The alternative hypothesis states that the mean number of seeds is significantly higher for the fertilized pumpkins because the observed difference is in the positive direction.

Pull down a case table and create two new variables as shown below. Use the Options menu to show the formulas in the case table. Use the Tab key to move within the “if” command. Use double quotes to indicate that A and B are text entries.
**Step Two**
**Step Three**
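
The Group variable amounts to a random relabeling of the 50 pumpkins. As a rough sketch outside the tool (plain Python, not the case-table formula itself), assuming exactly 22 pumpkins are labeled "B" and 28 are labeled "A", the idea looks like this. The function name `assign_groups` and the seed value are illustrative only.

```python
import random

def assign_groups(n_total=50, n_treatment=22, rng=None):
    """Return "A"/"B" labels in random order with exactly n_treatment "B" entries.

    Assumption: "B" marks the randomly assigned treatment group and "A" the
    control group; the counts 22 and 28 come from the activity text.
    """
    rng = rng or random.Random()
    labels = ["B"] * n_treatment + ["A"] * (n_total - n_treatment)
    rng.shuffle(labels)  # random order, group sizes stay fixed
    return labels

groups = assign_groups(rng=random.Random(1))  # seed chosen arbitrarily
print(groups.count("B"), groups.count("A"))   # 22 28
```

Shuffling a fixed list of labels keeps the group sizes constant, which is the property the tool's formula has to guarantee as well.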

Graph Seeds split by Group. Designate Group B as the randomly assigned treatment group. Use the ruler tool to find the difference of the means.
**Step Four**

Click on the History tool and then click on the difference of the means (there will be a light grey box around the number). In the example shown below, the difference between the means is -3618 seeds.

Immediately after clicking on the difference, a new case table appears. The history case table will capture the results of repeated random assignments of 22 pumpkins to the treatment group and 28 pumpkins to the control group.

The history table will contain one case. Change the value from 1 to 99 to generate more simulations. Collect a total of 100 simulations. Graph the Diff_Seeds attribute on a new plot.
**Step Five**
**Step Six**
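
For readers who want to see what the History tool is doing, here is a minimal Python sketch of collecting 100 differences of means under random reassignment. The seed counts are made-up placeholder values, not the activity's data set, and the names `diff_of_means` and `collect_history` are hypothetical. The difference is taken as Group B minus Group A, which is an assumption about how the ruler reports it.

```python
import random
from statistics import mean

def diff_of_means(values, labels):
    """Mean of the "B" (treatment) values minus mean of the "A" (control) values."""
    treatment = [v for v, g in zip(values, labels) if g == "B"]
    control = [v for v, g in zip(values, labels) if g == "A"]
    return mean(treatment) - mean(control)

def collect_history(values, n_treatment, n_trials, rng):
    """Reshuffle the group labels n_trials times and record each difference."""
    labels = ["B"] * n_treatment + ["A"] * (len(values) - n_treatment)
    history = []
    for _ in range(n_trials):
        rng.shuffle(labels)  # one new random assignment
        history.append(diff_of_means(values, labels))
    return history

rng = random.Random(2024)  # arbitrary seed so the run is repeatable
# Placeholder data: 50 invented seed counts, NOT the real pumpkin data.
seeds = [rng.randint(20000, 60000) for _ in range(50)]
diff_seeds = collect_history(seeds, n_treatment=22, n_trials=100, rng=rng)
print(len(diff_seeds))  # 100
```

Each entry of `diff_seeds` plays the role of one case in the history table.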

Place a vertical reference line at approximately 4676 seeds (from this example). Use the divider tool to create two sections (change the default from three to two) and use the number tool. Record the number of trials out of 100 that fall above the reference line. This plot shows 9 trials above the line. To graph a new set of 100 trials, open the Options menu and select Delete All History Cases.
**Step Seven**

Collect 100 new cases (differences between the means) and note the number of differences that fall above the "cut score" of 4676 seeds. Note that the reference line is located at 4,670 seeds instead of 4,676 seeds; that is due to the precision of the plot and the use of such large numbers. For the result to be statistically significant, we expect to find, on average, fewer than five cases (differences between the mean numbers of seeds) above the cut score. This is based on an alpha level of 0.05. After repeating the simulation 6 times (100 trials each), I got 9, 12, 7, 9, 8, and 8. This is an average of about 9 out of 100 cases.
**Step Eight**
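
The arithmetic behind that average can be checked with a few lines of Python. The six counts are the ones reported above; the helper `count_above` is a hypothetical name showing how a count above the cut score could be tallied from a list of differences.

```python
def count_above(diffs, cut_score):
    """Number of simulated differences strictly above the cut score."""
    return sum(1 for d in diffs if d > cut_score)

# Counts above the cut score from the six runs of 100 trials reported above.
counts = [9, 12, 7, 9, 8, 8]
average = sum(counts) / len(counts)  # 53 / 6
tail_fraction = average / 100        # estimated upper-tail proportion
alpha = 0.05

print(round(average, 2))      # 8.83
print(tail_fraction > alpha)  # True, so the result is not significant at 0.05
```

Because the estimated tail proportion (about 0.088) is larger than alpha, the observed difference is not unusual enough to reject the null hypothesis.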

Based on this information, I decided to fail to reject the null hypothesis. There is not enough evidence that the mean number of seeds per pumpkin is significantly greater for the fertilized group, because a difference this large arose by chance alone in about 9% of the random assignments. This curve might look familiar to someone who has studied statistics. The distribution of differences is approximately normal and is divided into two regions. This plot shows 1,000 iterations. The percentage in the upper tail is approximately 9%, which is greater than the 5% alpha value commonly used in hypothesis testing. Our goal is a percentage of 5% or less so that we are less likely to make a Type I error.
**Summary**

So here is a question: when you run this simulation a few times, will you get 9%? Why or why not? The next step in the teaching trajectory is to use a more sophisticated program to find the p-value from a t-test (see below). Fathom is one option.

Note that when I run a t-test in Fathom, the p-value is 0.075 or 7.5%. Fathom Dynamic Data Software is currently under revision.
**FYI**

CODAP Document - Under development. Without the UniqueRank command, I need to think more about how to get the correct count in each group.