Sampling Distribution

Strap in. This one gets a little hairy.

A sampling distribution is the distribution (typically shown as a histogram or dot plot) of a statistic (like a mean, proportion, or standard deviation) calculated from repeated same-sized samples drawn from the same population. We use it to help estimate the population's true mean, proportion, standard deviation, or other value.

So, an anonymous donor dropped off a huge box of Skittles on our doorstep. When we opened the box, it looked like there were more green Skittles in it than the 20% the company states. We have two options: count all the Skittles to determine the exact percentage of greens (nope!) or take small samples from the box (after shaking it vigorously each time) and use those samples to estimate the percentage of greens (yup!).

So we take a sample, find the percentage of greens, and plot that percentage. Then we take another same-sized sample (after shaking...always shaking in between) and plot its percentage, too. Then we do it again and again and again, like, an infinite number of times. The resulting distribution is a sampling distribution (a distribution of a statistic calculated from same-sized samples from the same population) that we can analyze to estimate the true percentage of greens in the box.
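
For the hands-on crowd, here is a minimal Python sketch of that shake-sample-plot routine. Everything in it is an assumption made for illustration: we pretend the box is really 25% green (true_share), that each sample holds 50 Skittles, and that 2,000 samples is close enough to "infinite" for our purposes. The function name simulate_sampling_distribution is hypothetical, and treating each Skittle as an independent draw is a simplification that is only reasonable for a very large, well-shaken box.

```python
import random
import statistics


def simulate_sampling_distribution(true_share, sample_size, num_samples, seed=0):
    """Return the proportion of greens found in each of many same-sized samples.

    Assumption: the box is big enough that every Skittle drawn is green
    with probability true_share, independently of the others.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    proportions = []
    for _ in range(num_samples):
        # Scoop one sample and count the greens in it.
        greens = sum(1 for _ in range(sample_size) if rng.random() < true_share)
        proportions.append(greens / sample_size)
    return proportions


# Hypothetical numbers: a box that is secretly 25% green.
props = simulate_sampling_distribution(true_share=0.25, sample_size=50, num_samples=2000)

center = statistics.mean(props)   # where the sampling distribution piles up
spread = statistics.stdev(props)  # how much sample proportions bounce around

print(f"center of sampling distribution: {center:.3f}")
print(f"spread of sampling distribution: {spread:.3f}")
```

The list props is the sampling distribution itself: plot it as a histogram and the pile should sit near the assumed 25%, with most individual samples landing a few percentage points to either side.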

This raises the question, though: why not just count everything in the box? Sometimes we can’t count everything in the box, like when we want to estimate the mean number of children per household in America. We can’t contact every family in America, so we take a sample instead, use some fancy mathematics that describes the sampling distribution for us, and use that to estimate the mean number of kiddos people have.
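
One common version of that fancy mathematics is the standard error: the sample's standard deviation divided by the square root of the sample size, which approximates the spread of the sampling distribution of the mean. The Python sketch below shows the idea. The household numbers are made up purely for illustration (they are not real survey data), the function name estimate_mean is hypothetical, and the 1.96 multiplier is a rough normal approximation that gets shaky with samples this small.

```python
import math
import statistics


def estimate_mean(sample):
    """Return the sample mean, its standard error, and a rough 95% interval."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error: estimated spread of the sampling distribution of the mean.
    std_error = statistics.stdev(sample) / math.sqrt(n)
    # Rough 95% interval using the normal approximation (an assumption).
    interval = (mean - 1.96 * std_error, mean + 1.96 * std_error)
    return mean, std_error, interval


# Made-up children-per-household counts for 20 imaginary households.
households = [0, 2, 1, 3, 2, 0, 1, 2, 4, 1, 0, 2, 1, 3, 2, 1, 0, 2, 1, 2]

mean, std_error, (low, high) = estimate_mean(households)
print(f"sample mean: {mean:.2f}")
print(f"standard error: {std_error:.2f}")
print(f"rough 95% interval: {low:.2f} to {high:.2f}")
```

The point of the design is that one sample does double duty: its mean is our best guess, and its standard error tells us how far that guess could plausibly be from the truth, no infinite resampling required.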
