Common Core Standards: Math
2. Decide if a specified model is consistent with results from a given data-generating process, e.g., using simulation. For example, a model says a spinning coin falls heads up with probability 0.5. Would a result of 5 tails in a row cause you to question this model?
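The standard mentions simulation, so here is a minimal sketch of what that could look like for the coin example. It assumes the model is a coin with heads probability 0.5 and estimates how often five flips all come up tails. The function name, trial count, and seed are illustrative choices, not part of the standard.

```python
import random

def simulate_all_tails(num_flips=5, p_heads=0.5, trials=100_000, seed=0):
    """Estimate how often the model produces `num_flips` tails in a row."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    all_tails = 0
    for _ in range(trials):
        # One trial: flip num_flips times; a flip is tails when the draw is >= p_heads.
        if all(rng.random() >= p_heads for _ in range(num_flips)):
            all_tails += 1
    return all_tails / trials

estimate = simulate_all_tails()
exact = 0.5 ** 5  # 1/32 = 0.03125 under the model
print(f"simulated: {estimate:.4f}, exact: {exact:.4f}")
```

Under the model, five tails in a row should show up in roughly 3% of trials. That's unusual enough to raise an eyebrow, though whether it's enough to question the model depends on the cutoff we pick.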
Students should understand that sometimes, weird things happen in statistics. Not spooky weird, but an almost unnatural weird. Like flipping a coin ten times and ending up with ten heads. Odd.
Long ago, statisticians would seek the help of mental health professionals to figure out if they were crazy when these strange events kept happening. The mental health industry has been devastated ever since statisticians came up with standardized mathematical ways that help explain these phenomena. (The correlation between statisticians and the mental health industry has yet to be statistically confirmed.)
Students shouldn't freak out when things don't go exactly according to plan. There are ways to test whether results fit nicely into a statistical model or not. In statistics, the options for numerical tests are as numerous and appealing as germs on a hotel room comforter. By the way, good luck sleeping on your next vacation.
Students should be familiar with Goodness of Fit tests (which aren't the same tests used in JC Penney changing rooms). Students should also know that these tests help measure whether or not a statistical model fits certain observations.
The Chi-Squared Goodness of Fit Test (called the Chi-Squared test, for short) starts from the assumption that any discrepancy between our data and the model is caused by chance rather than by a faulty model. We can use the Chi-Squared test provided we have a large enough population, an appropriate random sample, and all that other good stuff that comes with proper statistical studies.
Students should know how to calculate the value of χ², where

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

In the formula, O is our observed frequency value and E is our expected frequency value for each category, and the sum runs over all the categories.
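As a rough sketch, the calculation can be written in a few lines of Python, taking χ² to be the usual goodness-of-fit statistic: the sum over all categories of (O − E)² / E. The coin counts below are made up for illustration (100 flips, 60 heads), and the helper name `chi_squared` is our own.

```python
def chi_squared(observed, expected):
    """Sum (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need one value per category")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: 100 flips of a coin the model says is fair.
observed = [60, 40]  # heads, tails actually seen
expected = [50, 50]  # heads, tails predicted by the model

stat = chi_squared(observed, expected)
print(stat)  # 4.0, since (10^2 / 50) + (10^2 / 50) = 2 + 2
```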
Students also need to find the degrees of freedom, which equals the number of categories in our sample minus 1. So if we have 4 different kinds of fruit, that means we have 3 degrees of freedom. Simple enough.
As a side note, statisticians love tables. Not round tables or square tables or three-legged tables. We mean tables of values. Never-ending columns and rows of numbers upon numbers. Whatever floats their boat, right?
Students don't have to like these tables, but they should know how to use them. By that we mean compare our χ² value to the number corresponding to the degrees of freedom and significance level p = 0.05 on the table. If χ² is larger, then our data doesn't quite match the model. If χ² is less than the critical value (the one given by the table), the model works well enough.
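To see the whole comparison in one place, here is a hedged sketch in Python. The small dictionary stands in for the first few rows of a standard χ² table at the 0.05 significance level. The fruit and coin counts are invented for illustration, and the function names are our own.

```python
# Critical values at the 0.05 significance level, keyed by degrees of freedom
# (the first five rows of a standard chi-squared table).
CRITICAL_VALUES_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

def chi_squared(observed, expected):
    """Sum (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def fits_model(observed, expected):
    """Return (chi-squared, degrees of freedom, critical value, fits?)."""
    df = len(observed) - 1            # number of categories minus 1
    stat = chi_squared(observed, expected)
    critical = CRITICAL_VALUES_05[df]
    return stat, df, critical, stat < critical

# Hypothetical data: 4 kinds of fruit, model says each is equally likely.
fruit = fits_model([30, 20, 25, 25], [25, 25, 25, 25])
# Hypothetical data: 100 coin flips, model says heads and tails are equally likely.
coin = fits_model([60, 40], [50, 50])

print(fruit)  # (2.0, 3, 7.815, True)
print(coin)   # (4.0, 1, 3.841, False)
```

With these made-up numbers, the fruit counts sit comfortably below the critical value, so the model works well enough. The coin's 60 heads push χ² just past the critical value for 1 degree of freedom, so we'd have reason to question the fair-coin model.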