# Advanced Statistics—Semester A

## Let Shmoop be your guide between the Scylla of data and Charybdis of statistics.

*This course has been granted a-g certification, which means it has met the rigorous iNACOL Standards for Quality Online Courses and will now be honored as part of the requirements for admission into the University of California system.*

Statistics gets a bad rap in some circles. People think that stats is all about shysters throwing out a confusing bafflegab of numbers to trick people into agreeing with them. And then they cry out, "Oh no, I've been bafflegabbed!" as their life savings disappear in a puff of white smoke.

As you might expect, we here at Shmoop have a different take on things. Statistics is the study of analyzing data and making inferences about a population. That sounds intimidating, but here's the thing to remember—statistics is just another type of mathematics. And we know all about math. There are rules to math; if you use the right equation at the right time, then you'll get the right answer. And if someone tells you something outrageous, like they've found a negative whole number, then you've caught them bafflegab-handed.

Statistics works the same way. Honest.

In Semester A of this year-long AP® Statistics course, we'll be covering the foundation for all of statistics: data. What they look like, how to analyze them, how to graph them all pretty like, and even how to go out into the wild and gather some data of our own. Without good, solid data, we'd be up Bafflegab Creek without a clue.

Here's a sneak peek at what we'll cover in this semester:

- We'll chat about all the different types of data out there, along with the best ways to display them. You don't graph categorical data with a histogram for the same reasons you don't eat Thanksgiving dinner with a spatula.
- People won't stay awake to hear your conclusions if you insist on reading off all 500 data points you collected. We'll discuss how we can summarize our data using just a handful of numbers.
- Sometimes our data are unruly, so we can use a transformation to get them to settle down.
- Data don't appear out of thin air. Somebody has to go collect them. Once you've identified the population and parameters you're interested in, it's time to conduct a survey, experiment, or observational study to nab a sample.
- Since statistics is about making inferences, we'll need some way to talk about how likely we think different events are. So, we'll wrap up the semester by boning up on probability. Plus, we'll run some simulations, which are a super-handy tool in our stats toolkit.

Of course, it would be hard to learn statistics' rules without a caravan of readings, guided questions, problem sets, and activities, so we've got all that covered, too.

Just so's you know: this is a two-semester course. This is Semester A, and you can find Semester B right here.

### Unit Breakdown

#### Advanced Statistics—Semester A - Visualizing and Describing Data

We'll start our story of statistics off at the beginning—with our data. Whether it's qualitative or quantitative, we'll have it covered. One of the best ways to work with data is visually, so we'll put in plenty of practice time creating all kinds of tables and graphs. Frequency tables, pie charts, bar graphs, histograms, even exotic graphs like time plots, frequency polygons, and ogives.

#### Advanced Statistics—Semester A - Numerical Measures for Quantitative Data

We're pretty busy, so we want to know all the important information about our data in a hurry. That's where measures like the mean, median, mode, interquartile range, variance, and standard deviation come into play. Some of them tell us about where our data are located, some of them show how spread out the values are, and together they give us a pretty good idea of what's going on in a dataset. Adding in box-and-whisker plots and *z*-scores is just the icing on this data cake.

#### Advanced Statistics—Semester A - Comparing Distributions

If you've ever been part of a heated rivalry with one of your classmates, then this unit is for you. We're going to look at how to compare the distributions of two datasets. Who's better at improving students' test scores—Shmoop or our hated rival, Miss Linda Poomhs? Using our various graphing methods, we'll finally settle the score. And then we'll summarize our results using plain English. That way, Linda will know exactly how much better we are than her.
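For a taste of what settling the score might look like, here is a minimal sketch that boils two datasets down to a handful of summary numbers using Python's standard `statistics` module. The score gains are made up for illustration, and the helper name `summarize` is our own invention, not part of the course materials.

```python
import statistics


def summarize(scores):
    """Return a few summary numbers for a list of numeric scores."""
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(scores, n=4)
    return {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "stdev": statistics.stdev(scores),  # sample standard deviation
        "iqr": q3 - q1,                     # interquartile range
    }


# Made-up test-score gains, for illustration only.
shmoop_gains = [12, 15, 9, 14, 11, 13, 16, 10]
poomhs_gains = [8, 10, 7, 25, 6, 9, 8, 7]

shmoop_summary = summarize(shmoop_gains)
poomhs_summary = summarize(poomhs_gains)

for name, summary in [("Shmoop", shmoop_summary), ("Poomhs", poomhs_summary)]:
    print(name, {key: round(value, 2) for key, value in summary.items()})
```

In this invented data, one unusually large value (25) pulls the Poomhs mean above its median, which is exactly the kind of thing a side-by-side comparison should call out in plain English.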

#### Advanced Statistics—Semester A - Bivariate Data: Scatterplots and Correlation

Double the data, double the fun. In this unit, we'll start working with bivariate data. We'll slap our two variables into scatterplots and see what their relationship status is. Instead of checking their Facebook pages, we'll use the correlation coefficient. It's also possible to create a least-squares linear regression, which is a fancy/scary way of saying "a line that fits the data really well." We'll use it to make predictions about the data's behavior (while keeping an eye out for trouble from the residuals and any outliers).
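As a rough sketch of the ideas in this unit, the snippet below computes the correlation coefficient and the least-squares line by hand, then checks the residuals. The data (hours studied versus test score) are made up, and the function name `correlation_and_line` is hypothetical.

```python
import math


def correlation_and_line(xs, ys):
    """Return (r, slope, intercept) for paired data xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)       # correlation coefficient
    slope = sxy / sxx                    # least-squares slope
    intercept = mean_y - slope * mean_x  # line passes through the means
    return r, slope, intercept


# Made-up data, for illustration only.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

r, slope, intercept = correlation_and_line(hours, scores)
# Residual = observed value minus the value the line predicts.
residuals = [y - (intercept + slope * x) for x, y in zip(hours, scores)]

print(f"r = {r:.3f}, predicted score = {intercept:.1f} + {slope:.1f} * hours")
print("residuals:", [round(e, 2) for e in residuals])
```

The residuals from a least-squares line always sum to zero (up to rounding), so a pattern in their signs, rather than their total, is what hints at trouble.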

#### Advanced Statistics—Semester A - Function Models and Transformations

Not all bivariate data fit on a nice, easy-to-use line. Two other common relationships between two variables are exponential and power relationships. Since we went through all that trouble last unit to learn about lines of best fit, we'll just use some clever math tricks to transform our non-linear data into something a bit less bent.
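One common version of that trick, sketched here under the assumption that the data follow y = a · bˣ: take the logarithm of y, fit an ordinary least-squares line to the transformed points, and then undo the logarithm. The helper names `fit_line` and `fit_exponential` are ours, and the data are constructed to be exactly exponential so the result is easy to check.

```python
import math


def fit_line(xs, ys):
    """Least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_exponential(xs, ys):
    """Fit y = a * b**x by fitting a line to (x, ln y)."""
    log_ys = [math.log(y) for y in ys]  # requires every y > 0
    slope, intercept = fit_line(xs, log_ys)
    # ln y = ln a + x * ln b, so exponentiate to recover a and b.
    return math.exp(intercept), math.exp(slope)


# Constructed data: exactly y = 3 * 2**x.
xs = [0, 1, 2, 3, 4]
ys = [3 * 2 ** x for x in xs]

a, b = fit_exponential(xs, ys)
print(f"y is roughly {a:.2f} * {b:.2f}**x")
```

For a power relationship y = a · xᵇ, the same idea applies with logarithms taken of both x and y.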

#### Advanced Statistics—Semester A - Planning and Conducting a Study

It's time to introduce you to your new best friend: randomness. Not because it's "so random lol" and that cracks you up. No, it's because randomly choosing individuals for our sample is what makes it a representative sample, which is the linchpin for doing good statistics. There's more to it than grabbing a blindfold and some darts and going to town, so we'll go in-depth on the different kinds of samples we can take for surveys, experiments, and observational studies.

#### Advanced Statistics—Semester A - Probability and Simulations

What's the probability that we'll round out this semester by talking about probability and simulations? Well, unless the Unit Title Manager (totally a real Shmoop position, apply today) is completely awful at their job, it's probably 100%. We'll use our understanding of sample spaces, dependent and independent events, and simple and compound probabilities to simulate a bunch of events.
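Here is a small sketch of what a simulation can look like: estimating the chance that two fair dice sum to 7 by rolling them many times. The function name `simulate_two_dice` and the fixed seed are our own choices, made so the run is repeatable.

```python
import random


def simulate_two_dice(trials, seed=0):
    """Estimate P(sum of two fair dice == 7) from repeated rolls."""
    rng = random.Random(seed)  # seeded so the result is reproducible
    hits = 0
    for _ in range(trials):
        if rng.randint(1, 6) + rng.randint(1, 6) == 7:
            hits += 1
    return hits / trials


estimate = simulate_two_dice(100_000)
exact = 6 / 36  # six of the 36 equally likely outcomes sum to 7

print(f"simulated: {estimate:.4f}, exact: {exact:.4f}")
```

With this many trials the simulated value should land close to the exact answer, though it will rarely match it perfectly.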

#### Advanced Statistics—Semester A - Formal Probability

Put away your tuxedos and fancy dresses; this unit isn't that kind of formal event. Instead, we're gussying up the place by inviting a bunch of equations over. This unit is all about predicting the probability of different events by using the right formulas. Why run a ton of simulations for the statistical equivalent of tying your shoes? Save your CPU cycles for the hard problems.

#### Advanced Statistics—Semester A - Probability Distributions of Random Variables

Variables are like a box full of mystery. What's going to be inside? Have you tried shaking it? We think that *x* sounds like a 12. Well, random variables up the mystery to another level, because they can take on different values each time we use them. Like when rolling dice, the random variable *X* could be a different number 1 through 6 for each roll. If you're thinking, "Wow, that would be a great way to represent the possible results from sampling a population. And we could graph them, too, to see how likely different outcomes are," then...why are you taking this course again?

#### Advanced Statistics—Semester A - Sampling Distributions

So you've taken a sample, measured some variable you're interested in, and calculated an estimate like a mean or proportion. So...what if you did it again? And again and again and again? If you did it enough times, you could graph all those estimates and create that estimate's sampling distribution. While that might sound like a Guinness Book of World Records waste of time, the concepts behind the sampling distribution are key to understanding why statistics works at all.
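Luckily, a computer doesn't mind doing it again and again. The sketch below builds a made-up population, draws many samples of size 25, and collects the sample means. The names (`sample_means`, `population`) and the choice of a uniform population are assumptions made purely for illustration.

```python
import random
import statistics


def sample_means(population, sample_size, repeats, seed=0):
    """Draw `repeats` random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(repeats)
    ]


# A made-up population of 10,000 values spread evenly between 0 and 100.
population_rng = random.Random(1)
population = [population_rng.uniform(0, 100) for _ in range(10_000)]

means = sample_means(population, sample_size=25, repeats=2_000)

# Spread of the simulated sample means versus the textbook prediction
# sigma / sqrt(n) for samples of size n.
spread_of_means = statistics.stdev(means)
predicted = statistics.pstdev(population) / 25 ** 0.5

print(f"mean of sample means: {statistics.mean(means):.2f}")
print(f"spread of means: {spread_of_means:.2f} (predicted {predicted:.2f})")
```

The sample means cluster around the population mean, and they vary far less than individual values do, which is the heart of why estimating from a sample works.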

#### Advanced Statistics—Semester A - Estimation and Confidence Intervals

If you've ever wondered how much wood a woodchuck could chuck if a woodchuck could chuck wood, then this is the unit for you. Using a confidence interval, we can estimate, with 95% confidence, the average amount of wood chucking your average woodchuck gets up to. Turns out that it's 1.3 ± 2.6 pieces of wood. In this unit, we'll focus on estimating values from one sample at a time.
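As a hedged sketch of the idea, the snippet below builds an approximate 95% confidence interval for a mean using the normal (*z*) critical value from Python's `statistics.NormalDist`. For a small sample like this one, a *t*-based interval would usually be the better choice; the *z* version is used here only because it needs nothing beyond the standard library. The wood counts are invented and the function name `mean_confidence_interval` is hypothetical.

```python
import statistics
from statistics import NormalDist


def mean_confidence_interval(sample, confidence=0.95):
    """Approximate z-based confidence interval for the population mean."""
    n = len(sample)
    center = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / n ** 0.5
    # Critical value: about 1.96 when confidence is 0.95.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return center - margin, center + margin


# Invented data: pieces of wood chucked by 12 hypothetical woodchucks.
wood_chucked = [2, 0, 1, 3, 1, 2, 0, 1, 4, 1, 2, 1]

low, high = mean_confidence_interval(wood_chucked)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```

The interval is the sample mean plus or minus a margin of error, and that margin shrinks as the sample gets larger or less variable.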

#### Advanced Statistics—Semester A - Hypothesis Testing: The Basics

Hypothesis tests are the jelly to the confidence interval's peanut butter. Instead of using our data to pin down an estimate for some value, we ask the data a question. Is the parameter larger than 0? Smaller than 27? Something, anything other than -14? Then we let the data loose until it comes back with an answer. In this unit, we'll cover how to set up our null and alternative hypotheses, how to interpret our results, and how to minimize the chances of making any Errors. Yes, that capitalization is intentional.

#### Advanced Statistics—Semester A - Hypothesis Testing for Comparing Samples and Categorical Data

A lot of interesting questions involve making a comparison between two things. Which tastes better, vanilla or chocolate ice cream? Who's stronger, Batman or Superman? Who has longer lines, Walmart on Black Friday or the DMV on a typical Tuesday? This unit will cover how to run a hypothesis test when we have two samples and need to know which one is harder, better, faster, stronger. We'll also dip our toes into the chi-squared tests, which help us sort through categorical data.

#### Advanced Statistics—Semester A - More Advanced Statistical Tests and Other Topics

We'll wrap up the course with a grab-bag smattering of different topics. We've got the return of linear regression—now with more hypothesis testing and confidence intervals. We'll also dive into making confidence intervals to estimate the difference between two samples. And we'll learn that the real statistics were the friends we made along the way.