Median Absolute Deviation Tutorial
This investigation uses TinkerPlots Dynamic Data Exploration (Version 2) to help students understand and calculate the Median Absolute Deviation as a measure of variation. This tutorial includes two main tasks with specific objectives. Fostering conceptual understanding is dependent on completing the entire sequence of tasks in the order presented.
Task A – Build understanding of the data set and point estimates
Discuss the context of the data set and how the data was generated.
Graph the information with parallel dot plots and parallel box plots.
Find measures of center (mean, median) and spread (interquartile range) to compare two populations.
Task B – Build Conceptual Knowledge of the Median Absolute Deviation
Find the difference between the median and a single data point, for each data point.
Find the absolute value of each “difference from the median” using both a table and a graph.
Find the median of the absolute differences in a table and a graph.
Common Core State Standards Addressed
Grade Six Statistics and Probability #6.SP.4
Display numerical data in plots on a number line, including dot plots, histograms, and box plots.
Grade Six Statistics and Probability #6.SP.5
Summarize numerical data sets in relation to their context, such as by:
Reporting the number of observations.
Describing the nature of the attribute under investigation, including how it was measured and its units of measurement.
Giving quantitative measures of center (median and/or mean) and variability (interquartile range and/or mean absolute deviation), as well as describing any overall pattern and any striking deviations from the overall pattern with reference to the context in which the data were gathered.
Relating the choice of measures of center and variability to the shape of the data distribution and the context in which the data were gathered.
#2 Model the situation
#5 Appropriate Tool use
TinkerPlots Dynamic Data Exploration (Version 2)
File | Open Sample Document | Data and Demos | Backpacks.tp
Optional – Scale and student backpacks
Question 1. What is the context of the data set?
The body weights and pack weights of 79 students in a specific school were recorded in a TinkerPlots collection. Each distinct object in a data collection is represented with a data card. The card in figure 1 shows the information (variables) for a student named Jim. The first three categorical variables provide demographic information about the students (name, gender, and grade level). The numeric data was collected with a scale using pounds as the unit of measure. Students stood on a scale to find their body weight and then weighed their backpacks on the same scale. An additional variable was created in TinkerPlots to express the pack weight as a percent of the student’s body weight. It is not clear what sampling method was used, so the results should not be generalized to the population of all elementary students. The results discussed here are specific to the students surveyed in this school, in this town and this part of the country.
Caution – There is no information given that indicates a random sampling of students was performed.
Hint - You may delete the two text boxes to create more workspace.
Figure 1. Sample data card
Question 2. Describe the center of the data set with a single numeric value.
To make this question more interesting, I separated the pack weight data into two parallel dot plots by placing Gender on the y-axis. The mean and median tools were used to show the mean and median pack weight values for males and females. The distribution shown in the dot plots has a left wall, so it is skewed to the right. This means that a few students carry backpacks that are unusually heavy, especially the student with the 39-pound pack.
The real question is: which is the better choice for the center, the mean or the median? I will choose the median because the mean is pulled upward by the packs that are unusually heavy. The median pack weight is 8.5 pounds for this group of males and 7 pounds for this group of females.
Hint – Dot plots are created by first separating all icons horizontally and then stacking them vertically. Be sure to select the PackWeight attribute! To create the parallel plot shown here, drag Gender to the vertical axis.
Hint – Use the mean/median menu to Show Numeric Values.
Figure 2. Measures of center for male and female pack weights
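The pull of one heavy pack on the mean can also be checked outside TinkerPlots. The following is a minimal Python sketch, assuming a small made-up list of pack weights (these are not the values in Backpacks.tp); it compares the mean and median before and after a 39-pound pack is added.

```python
from statistics import mean, median

# Hypothetical pack weights in pounds (NOT the Backpacks.tp data).
packs = [3, 5, 6, 7, 7, 8, 10, 12]
packs_with_heavy = packs + [39]  # add one unusually heavy pack

mean_before, median_before = mean(packs), median(packs)
mean_after, median_after = mean(packs_with_heavy), median(packs_with_heavy)

# The mean jumps by several pounds; the median stays at 7.
print(f"without the 39 lb pack: mean={mean_before:.2f}, median={median_before}")
print(f"with the 39 lb pack:    mean={mean_after:.2f}, median={median_after}")
```

In this made-up example the mean moves from 7.25 to about 10.78 pounds while the median does not move at all, which is the behavior that motivates choosing the median for skewed data.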
Question 3. Describe the variation of the data set with a single numeric value.
This question can be quickly answered by using one of two values, the range or the interquartile range. The range is the difference between the minimum and maximum values.
If we use the range (max – min) to report the variability, we would report 36 pounds for females and 24 pounds for males. The female range is inflated by the single unusual value of 39 pounds, which illustrates why the range is not a reliable estimate of variation. Figure 3 shows the range for the female packs.
Hint – Snap the ruler tool to the individual icons to measure the range. Note the value is recorded in the lower left corner of the window.
Figure 3. Range for female backpack data
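As a rough illustration of how sensitive the range is to one value, here is a short Python sketch. The list of weights is hypothetical (not the Backpacks.tp data), and `value_range` is a helper name introduced only for this example.

```python
# Hypothetical pack weights in pounds (NOT the Backpacks.tp data).
packs = [3, 5, 6, 7, 7, 8, 10, 12, 39]

def value_range(values):
    """Return max - min, the distance the ruler spans in the plot."""
    return max(values) - min(values)

range_all = value_range(packs)
# Drop the single heaviest pack and measure again.
range_without_heaviest = value_range(sorted(packs)[:-1])

print(range_all, range_without_heaviest)
```

Removing one pack changes the range from 36 pounds to 9 pounds in this example, so the range mostly reports on the most extreme student rather than on the group.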
The interquartile range (IQR) is the difference between the third and first quartiles (the 75th percentile and the 25th percentile). Students can find the interquartile range using the box plot and ruler tools in TinkerPlots.
In figure 4, the ruler was attached to the first and third quartiles of the box plot for male pack weight. The interquartile range is 10 pounds for the males. The process was repeated for the females to find an interquartile range of 8 pounds. The lower IQR for the female packs makes sense because a visual inspection shows slightly less variability among the female packs (males have five packs over 20 pounds versus three for the females).
Hint - Hide the icons with the icon menu (in the plot's menu bar).
Figure 4. Interquartile range for male pack weight
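The same calculation can be sketched in Python. This is only an illustration under two assumptions: the weights are made up (not the Backpacks.tp data), and the quartiles use the "inclusive" method of `statistics.quantiles`, which may not match the quartile rule TinkerPlots applies to its box plots.

```python
from statistics import quantiles

# Hypothetical pack weights in pounds (NOT the Backpacks.tp data).
packs = [3, 5, 6, 7, 7, 8, 10, 12, 39]

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
# method="inclusive" is an assumption; other quartile rules can give
# slightly different values for small data sets.
q1, q2, q3 = quantiles(packs, n=4, method="inclusive")
iqr = q3 - q1

print(f"Q1={q1}, Q3={q3}, IQR={iqr}")
```

Here the IQR is 4 pounds, and it would be unaffected if the 39-pound pack were replaced by any other value above the third quartile.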
The median (or mean) absolute deviation formula can be used to represent variation within a given data set. I am going to use the median absolute deviation because of the unusually high value of 39 pounds (a suspected outlier). I will illustrate the calculations with two different representations: a case table with formulas and a plot with the ruler tool.
Hint - I used the female data set for this discussion. Open a new Backpacks.tp file. Delete the male student data and delete the percent weight attribute. SAVE this new file with a new name!
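For reference, the two formulas mentioned above can be written out as follows. The notation is an assumption made for this sketch and does not come from TinkerPlots: x_1, ..., x_n are the pack weights, x-tilde is their median, and x-bar is their mean (the `\operatorname` command requires the amsmath package).

```latex
% Median absolute deviation: the median of the distances from the median.
\[
\mathrm{MAD}_{\text{median}}
  = \operatorname{median}\bigl(\lvert x_1 - \tilde{x}\rvert, \dots, \lvert x_n - \tilde{x}\rvert\bigr),
\qquad
\tilde{x} = \operatorname{median}(x_1, \dots, x_n)
\]

% Mean absolute deviation: the mean of the distances from the mean.
\[
\mathrm{MAD}_{\text{mean}}
  = \frac{1}{n}\sum_{i=1}^{n} \lvert x_i - \bar{x}\rvert,
\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
\]
```

The rest of this tutorial works with the first formula.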
Question 1. What is the deviation from the median for each pack weight?
I need to find the deviation (difference) between each pack weight and the median. Figure 5 shows a graphic representation of the difference between Merinda’s 19-pound pack and the median of 7 pounds. The figure also shows a Case Table with two additional attributes. The formulas for Difference and Abs_Diff can be added in the Data Card or in the Case Table.
What you might notice is that all of the differences to the right of the median in the dot plot are positive, but the differences to the left are negative. For example, Wendie (row 22) has a 5-pound pack, so the difference from the median is -2 pounds. This is a problem since we want to find how far the points are, on average, from the center (in this case we will use the median as our measure of center).
Our goal is to summarize the size of the differences. If I add up all of the differences, the negative values cancel the positive ones, so the sum understates how far the points are from the median. This is where the term “absolute” comes into the formula of the MAD. If I take the absolute value of the differences, none of the deviations will be negative. Note that the plot's ruler menu has an item that can be used to find the sum of all of the deviations. This will be discussed next.
Figure 5. Deviation from the median
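The two added Case Table attributes can be mimicked in a few lines of Python. This is a sketch only: the names and weights below are invented (not the Backpacks.tp data), and the dictionary keys simply reuse the attribute names Difference and Abs_Diff from the Case Table.

```python
from statistics import median

# Hypothetical (name, pack weight in pounds) rows, NOT the Backpacks.tp data.
rows = [("A", 4), ("B", 5), ("C", 7), ("D", 9), ("E", 19)]

# The median of the pack weights is the center we measure from.
center = median(weight for _, weight in rows)

# Mirror the two added Case Table attributes: Difference and Abs_Diff.
table = [
    {
        "Name": name,
        "PackWeight": weight,
        "Difference": weight - center,     # negative left of the median
        "Abs_Diff": abs(weight - center),  # never negative
    }
    for name, weight in rows
]

for row in table:
    print(row)
```

With a median of 7 pounds, the 5-pound pack gets a Difference of -2 and an Abs_Diff of 2, and the 19-pound pack gets 12 for both.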
Question 2. What is the sum of the deviations from the median for all of the pack weights?
Figure 6a illustrates that the ruler tool actually measures each individual difference (similar to the Difference attribute in the Case Table). To achieve this, do not "stack" the icons in the dot plot. In figure 6b, the ruler tool menu was used to find the sum of the deviations or differences for a stacked dot plot (left plot) and the sum of the absolute values of the differences (right plot). Note that the formulas shown in the lower left corner of the plots were found by clicking on the button. Recall from the discussion above that in order to find a meaningful sum of the differences, none of them can be negative, because our goal is to find out how far the points are, on average, from the median of 7 pounds.
Figure 6a. Sum of deviations from the median in an unstacked dot plot
Figure 6b. Sum of deviations in the stacked dot plot
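To see why the signed sum is misleading, consider this small Python sketch. The weights are hypothetical and deliberately symmetric around the median (they are not the Backpacks.tp data), so the cancellation is complete.

```python
from statistics import median

# Hypothetical pack weights in pounds, symmetric around 7
# (NOT the Backpacks.tp data).
packs = [4, 5, 7, 9, 10]
center = median(packs)

differences = [w - center for w in packs]
sum_of_differences = sum(differences)                      # signed values cancel
sum_of_abs_differences = sum(abs(d) for d in differences)  # total distance

print(sum_of_differences, sum_of_abs_differences)
```

The signed sum is 0 even though the packs are clearly spread out, while the sum of the absolute differences is 10 pounds of total distance from the median.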
Figure 7 shows the options used to create the second plot in figure 6b. In question three, I will use the option titled “Median of Differences” to automatically calculate the MAD.
Figure 7. Ruler Menu
Question 3. What is the median absolute deviation for the pack weights?
To illustrate the answer to this question for the female pack weights, I am going to return to the Case Table and plot the Abs_Diff attribute (see Figure 8 - top plot). The goal is to find the median of the differences, which can be done with the median tool. Note that the same value, 3 pounds, is calculated when both the absolute value option and Median of Differences are selected in the ruler tool's option menu (Figure 8 - bottom plot). Note the syntax of the formula in the lower left corner of the plot: "Median of | Diff | of 39 cases = 3".
Figure 8. Median of the deviations for the female pack weights from the plot (top) and ruler tool (bottom).
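The whole calculation (median, absolute differences, median again) fits in a short Python function. This is a minimal sketch with hypothetical weights, not the Backpacks.tp data, and `median_absolute_deviation` is a name chosen for this example rather than a TinkerPlots function.

```python
from statistics import median

def median_absolute_deviation(values):
    """Median of the absolute differences from the median."""
    values = list(values)
    center = median(values)
    return median(abs(v - center) for v in values)

# Hypothetical pack weights in pounds (NOT the Backpacks.tp data).
packs = [3, 5, 6, 7, 7, 8, 10, 12, 39]

mad_all = median_absolute_deviation(packs)
mad_without_heaviest = median_absolute_deviation(packs[:-1])

print(mad_all, mad_without_heaviest)
```

In this example the MAD is 2 pounds with the 39-pound pack and 1.5 pounds without it, so a single extreme value barely changes the result.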
In figure 9, the ruler tool was used to find the MAD for both male and female backpack data. The calculation shows that the variability for female packs (3 pounds) is slightly less than for male packs (3.5 pounds). A more interesting comparison can be made when I investigate the MAD for each grade level, as shown in figure 10. The MADs for grades five and seven are three times the MADs for grades one and three.
Figure 9. Median Absolute Deviation for female and male pack weight data.
Figure 10. Median Absolute Deviation for grade level data.
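A group-by-group comparison like the one in figures 9 and 10 can be sketched in Python as well. The (grade, weight) pairs below are invented for illustration and are not the Backpacks.tp data, so the resulting values will not match the figures.

```python
from collections import defaultdict
from statistics import median

def median_absolute_deviation(values):
    """Median of the absolute differences from the median."""
    values = list(values)
    center = median(values)
    return median(abs(v - center) for v in values)

# Hypothetical (grade, pack weight in pounds) pairs, NOT the Backpacks.tp data.
records = [(1, 3), (1, 4), (1, 5),
           (3, 5), (3, 6), (3, 8),
           (5, 6), (5, 10), (5, 15)]

# Split the weights by grade, like dragging Grade to an axis of the plot.
weights_by_grade = defaultdict(list)
for grade, weight in records:
    weights_by_grade[grade].append(weight)

mad_by_grade = {
    grade: median_absolute_deviation(weights)
    for grade, weights in sorted(weights_by_grade.items())
}
print(mad_by_grade)
```

For these made-up values the MAD is 1 pound in grades one and three and 4 pounds in grade five.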
This concludes the Median Absolute Deviation Tutorial. Please contact me if you have suggested improvements for this page of the Wiki, questions about the technology or pedagogical approach, or additional activities with different data sets that you would like to share.