Unit 6: Statistics
Learning Outcomes #38-48
Learning Outcome #38: I can recognize the difference between a statistical and a non-statistical question
-
📺 Watch this video on Statistical Questions
-
✏ Complete these practice exercises on Statistical Questions
Learning Outcome #39: I can describe the distribution of a set of data by correctly using the terms variability, center, cluster, and outlier
-
📺 Watch this video on Shapes of Distributions
-
✏ Complete these practice exercises on Shape of Distributions
-
📺 Watch this video on Clusters, Gaps, Peaks, and Outliers
-
✏ Complete these practice exercises on Practice: Clusters, Gaps, Peaks, and Outliers
Learning Outcome #40: I can find and describe measures of central tendency (mean, median, mode) to analyze data
-
📺 Watch this video on Statistics Intro: Mean, Median, & Mode
-
📺 Watch this video on Mean, Median, Mode Example
-
✏ Complete these practice exercises on Calculating the Mean
-
✏ Complete these practice exercises on Calculating the Median
-
✏ Complete these practice exercises on Calculating the Mean: Data Displays
-
✏ Complete these practice exercises on Calculating the Median: Data Displays
Learning Outcome #41: I can make and interpret a dot plot to analyze data
-
📺 Watch this video on Frequency Tables and Dot Plots
-
✏ Complete these practice exercises on Creating Frequency Tables
-
✏ Complete these practice exercises on Creating Dot Plots
-
✏ Complete these practice exercises on Reading Dot Plots and Frequency Tables
Learning Outcome #42: I can make and interpret a histogram to analyze data
-
📺 Watch this video on Creating a Histogram
-
📺 Watch this video on Interpreting a Histogram
-
✏ Complete these practice exercises on Create Histograms
-
📺 Watch this video on Read Histograms
Learning Outcome #43: I can make and interpret a box plot to analyze data
-
📺 Watch this video on Reading Box Plots
-
✏ Complete these practice exercises on Reading Box Plots
-
📺 Watch this video on Constructing a Box Plot
-
📺 Watch this video on Creating a Box Plot (odd number of data points)
-
📺 Watch this video on Creating a Box Plot (even number of data points)
-
✏ Complete these practice exercises on Creating Box Plots
-
📺 Watch this video on Interpreting Box Plots
-
✏ Complete these practice exercises on Interpreting Box Plots
Learning Outcome #44: I can find the number of observations in a set of data
-
📺 Watch this video on Report Number of Observations in a Data Set
-
✏ Complete these practice exercises on Data Set Warm Up
Learning Outcome #45: I can find and apply the mean absolute deviation (MAD) to explain the level of variability in a data set
-
📺 Watch this video on Mean Absolute Deviation
-
📺 Watch this video on Mean Absolute Deviation Example
-
✏ Complete these practice exercises on Mean Absolute Deviation (MAD)
Learning Outcome #46: I can find the interquartile range (IQR) to explain the level of variability in a data set
-
📺 Watch this video on Interquartile Range (IQR)
-
✏ Complete these practice exercises on Interquartile Range (IQR)
Learning Outcome #47: I can use the IQR and/or MAD to describe the distribution of a set of data
-
The Interquartile Range (IQR) measures the range of the middle 50% of the data in a distribution. In other words, it measures the spread of the middle part of a data set. If the IQR is large, we can infer that the data set has a lot of variability, or equivalently, little consistency. If the IQR is small, we can infer that the data set has little variability, or a lot of consistency.
To oversimplify:
Large IQR --> high variability
Small IQR --> low variability
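The short Python sketch below illustrates the idea. It assumes the "median of each half" method for finding the quartiles (other methods exist and can give slightly different answers), and the function name and the two data sets are made up for illustration.

```python
from statistics import median

def iqr(values):
    """Interquartile range using the 'median of each half' method (an assumption)."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two data points")
    half = len(data) // 2
    lower = data[:half]    # lower half (the middle value is left out if the count is odd)
    upper = data[-half:]   # upper half (the middle value is left out if the count is odd)
    return median(upper) - median(lower)

# Hypothetical data sets, each with 8 values
consistent = [10, 11, 11, 12, 12, 13, 13, 14]
spread_out = [2, 5, 9, 12, 12, 16, 20, 24]

print(iqr(consistent))  # Q1 = 11, Q3 = 13, so the IQR is 2
print(iqr(spread_out))  # Q1 = 7, Q3 = 18, so the IQR is 11
```

The second data set has the larger IQR, so its middle 50% is more spread out and we would describe it as having more variability.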
-
The Mean Absolute Deviation (MAD) measures the average distance from each data point to the mean. If the MAD is large, then the average distance to the mean is large and we can say that the data set has a lot of variability. If the MAD is small, then the average distance to the mean is small and we can say that the data set has only a little variability.
To oversimplify:
Large MAD --> high variability
Small MAD --> low variability
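As a rough illustration, the Python sketch below computes the MAD in the steps described above: find the mean, find each data point's distance to the mean, then average those distances. The function name and the two data sets are hypothetical.

```python
from statistics import mean

def mean_absolute_deviation(values):
    """Average distance from each data point to the mean."""
    center = mean(values)
    distances = [abs(x - center) for x in values]  # distance is never negative
    return mean(distances)

# Hypothetical data sets; both have a mean of 5
tight = [4, 5, 5, 6]
wide = [1, 3, 7, 9]

print(mean_absolute_deviation(tight))  # distances 1, 0, 0, 1 give a MAD of 0.5
print(mean_absolute_deviation(wide))   # distances 4, 2, 2, 4 give a MAD of 3
```

Both data sets have the same center, but the larger MAD of the second one shows that its values sit farther from the mean on average.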
Learning Outcome #48: I can choose an appropriate measure of center and variability and explain why my measurement is most appropriate
-
📺 Watch this video on Impact on Median and Mean: Removing an Outlier
-
📺 Watch this video on Impact on Median and Mean: Increasing an Outlier
-
✏ Complete these practice exercises on Effects of Shifting, Adding, and Removing a Data Point
-
✏ Complete these practice exercises on Choosing the "Best" Measure of Center
-
📺 Watch this video on Interpreting Box Plots
-
✏ Complete these practice exercises on Interpreting Box Plots
Important Notes on Mean vs Median
-
If a data set is symmetrical, we often use the mean (average) to describe the typical value or center of the data set. The mean shares the total evenly among all the data points, so if most of the values are similar to each other, the mean is usually a number that is representative of most of the data.
-
If a data set is very skewed or has a few outliers, we often use the median to describe the typical value or center of the data set. The median is less affected by extreme values than the mean.
-
If a data set has a long right tail, then the mean will be larger than the median because a few large values increase the sum of the data and "pull up" the mean.
-
If a data set has a long left tail, then the mean will be smaller than the median because a few small values decrease the sum of the data and "pull down" the mean.
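The notes above can be checked with a small Python sketch. The three data sets are made up for illustration; each one changes only a single value so the effect of a tail on the mean and the median is easy to see.

```python
from statistics import mean, median

# Hypothetical data sets
symmetric = [2, 3, 4, 5, 6]        # evenly balanced around 4
right_tail = [2, 3, 4, 5, 26]      # one unusually large value
left_tail = [1, 17, 18, 19, 20]    # one unusually small value

for name, data in [("symmetric", symmetric),
                   ("right tail", right_tail),
                   ("left tail", left_tail)]:
    print(name, "mean:", mean(data), "median:", median(data))

# symmetric:  mean 4,  median 4   (mean and median agree)
# right tail: mean 8,  median 4   (the large value pulls the mean up)
# left tail:  mean 15, median 18  (the small value pulls the mean down)
```

In the two skewed sets the median stays close to most of the values, which is why it is often the better choice of center when a data set has a tail or outliers.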
