© 2016 Shmoop University, Inc. All rights reserved.

# Grade 8

### Statistics and Probability 8.SP.A.1

1. Construct and interpret scatter plots for bivariate measurement data to investigate patterns of association between two quantities. Describe patterns such as clustering, outliers, positive or negative association, linear association, and nonlinear association.

Many esteemed statisticians have discovered that there appears to be a strange link between Justin Bieber's singing and the population of capuchin monkeys in Argentina. Whenever Bieber hits a high note, there is a spurt in the number of baby capuchins. The higher the frequency of the note, the more baby monkeys are born.

We could list all the different frequencies of Bieber's voice and the number of baby monkeys that were born at those high notes. But that's just a list of numbers, and most people will probably give up on making sense of it long before they can be convinced.

A better way to see whether there's a relationship between this bivariate data (that simply means two different sets of numbers) would be to make a scatter plot.

Along the bottom, we'll list the frequencies of Bieber's high notes while in concert. On the side, we'll list the number of baby monkeys that were born at those frequencies. Then, we'll put a dot on the graph corresponding to the note and number of monkey births. Turns out, our graph looks like this:

What exactly could we deduce from this data? Well, there certainly seems to be a relationship between high notes and the birth of baby capuchins; the higher the notes, the more babies.
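For anyone who wants to see the plotting steps spelled out, here is a minimal sketch in Python. The numbers are made up purely for illustration (they are not real measurements), and `text_scatter` is a hypothetical helper of our own, not part of any library. It puts the first number of each pair along the bottom, the second up the side, and marks each pair with a `*`.

```python
# Made-up (note frequency in Hz, baby monkeys born) pairs, for illustration only.
notes_and_births = [(300, 2), (350, 3), (400, 5), (450, 6), (500, 8), (550, 9)]

def text_scatter(points, width=30, height=10):
    """Return a rough text scatter plot: x runs left to right, y bottom to top."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, y_min = min(xs), min(ys)
    # Fall back to a span of 1 so identical values don't cause division by zero.
    x_span = (max(xs) - x_min) or 1
    y_span = (max(ys) - y_min) or 1
    grid = [[" "] * width for _ in range(height)]
    for x, y in points:
        col = round((x - x_min) / x_span * (width - 1))
        row = round((y - y_min) / y_span * (height - 1))
        grid[height - 1 - row][col] = "*"  # grid row 0 is the top of the plot
    return "\n".join("".join(line) for line in grid)

plot = text_scatter(notes_and_births)
print(plot)
```

With these invented pairs, the stars climb from the lower left to the upper right, which is the picture of "the higher the notes, the more babies."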

Students should understand scatter plots as ways to communicate relationships between two variables. The more closely the points hug a straight line, the stronger the linear correlation. Students should also be able to identify and define outliers and clusters and give possible reasons for their existence. For instance, holding out a shaky high note might result in a few clusters around that particular frequency, not to mention a few fans clustering toward the exit.
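Spotting an outlier is usually done by eye, but a rough numerical check can back up the eyeball test. The sketch below is one possible approach (our assumption, not a method the standard prescribes): fit a least-squares line to the points and flag the point that sits farthest from it. The data and the helper names `fit_line` and `farthest_from_line` are hypothetical.

```python
def fit_line(points):
    """Least-squares slope and intercept for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx  # assumes the x values are not all identical
    return slope, mean_y - slope * mean_x

def farthest_from_line(points):
    """Return the point with the largest vertical distance from the fitted line."""
    slope, intercept = fit_line(points)
    return max(points, key=lambda p: abs(p[1] - (slope * p[0] + intercept)))

# Made-up data: five points follow y = 2x, one clearly does not.
points = [(1, 2), (2, 4), (3, 6), (4, 8), (5, 30), (6, 12)]
suspect = farthest_from_line(points)
print(suspect)
```

Keep in mind that the outlier itself tugs the fitted line toward it, so this is only a first pass. A point flagged this way still deserves a look on the actual plot and a guess at why it strayed.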

Students should also be able to interpret scatter plots as having linear or nonlinear associations, and discern whether these associations are positive or negative. They can think of positive and negative as describing the "slope" of the data. If there's a positive association, both variables increase together. If there's a negative association, one increases while the other decreases. We don't mean "positive" or "negative" for the capuchin monkey population. Let PETA take that one on.
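The "do they rise together or not" idea can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions: the function name `association_direction` and all the numbers are invented, and the check only looks at the overall direction, not at how strong or how linear the pattern is.

```python
def association_direction(xs, ys):
    """Return 'positive', 'negative', or 'none' for paired lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Each product is positive when a point is above average in both variables
    # (or below average in both), and negative when it is above in one and
    # below in the other.
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "none"

hours = [1, 2, 3, 4, 5]          # made-up values
rising = [52, 55, 61, 64, 70]    # goes up as hours go up
falling = [70, 66, 59, 57, 50]   # goes down as hours go up

print(association_direction(hours, rising))   # positive
print(association_direction(hours, falling))  # negative
```

Notice that the labels say nothing about whether the outcome is good or bad. They only describe which way the dots lean.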

#### Drills

1. Which type of scatter plot would suggest a positive association?

Correct Answer:

When one variable increases, the other increases

Answer Explanation:

A positive association is like having a line with a positive slope. As the x variable increases (the one on the horizontal axis), the y variable increases (the one on the vertical axis). So (A) is the right answer because both variables are increasing. While a positive association suggests some sort of clear relationship (better than a murky one, right?), a clear relationship doesn't have to mean a positive, or even linear, association. That's why (D) doesn't quite make the cut.

2. What does an outlier indicate?

Correct Answer:

None of the above

Answer Explanation:

An outlier is an odd piece of information that doesn't follow the trend. Like being a hipster, before being a hipster became the trend. Having an outlier means there must be some sort of trend that this particular point doesn't follow, so (A) can't be right. Of course, the outlier doesn't tell us anything about the correlation between the variables because, as an outlier, it doesn't follow it. So (B) is wrong. While we may want to go with (C), having an outlier doesn't tell us much about the correlation, but we can still determine the correlation from the rest of the data. So (D) is the only answer that makes sense.

3. What is meant by data that "clusters"?

Correct Answer:

Data that all lands in one particular part of the graph

Answer Explanation:

Your guess is as good as ours. Not because we don't know, but because it's pretty clear from the word "clusters." Just like cereal clusters that group together in a bunch, data that clusters is a collection of data points all in or around the same region of the scatter plot. The word "cluster" doesn't suggest a linear relationship, nor does it indicate data that's spread out. Although it might sound a bit vague, there still exists some sort of pattern in the word "cluster," so (B) isn't right, either.

4. Which would be an example of a positive association?

Correct Answer:

The more hours spent reading, the higher the verbal SAT scores of students

Answer Explanation:

A positive association is like a line with a positive slope or direct variation: when one variable increases, so does the other one. It doesn't necessarily have to have a positive effect. If we look at each choice, we see that (A), (B), and (D) are negative associations because one variable increases (gas prices, Calorie intake, and hours online) while the other decreases (miles driven, temperature, hours of sleep). The only one that doesn't is (C), in which both variables (hours reading and SAT score) increase.

5. Which would be an example of a negative association?

Correct Answer:

The more hours Aunt Tina spends exercising, the lower her weight

Answer Explanation:

Negative associations don't have to be negative in context. All it means is that when one variable increases, the other decreases. In this case, the only negative association is (A), the one between Aunt Tina's exercise regimen and her weight. As her hours of exercise increase, her weight decreases. While it's a healthy, positive outcome, the association itself is negative because one variable goes up while the other goes down. The rest are positive because both variables in (B), the number of texts and phone bill payment, and both variables in (C), the hours spent playing piano and GPA, increase.

6. What is the purpose of a scatter plot?

Correct Answer:

To better observe the correlation between the variables

Answer Explanation:

Scatter plots show us what's happening in a nice, neat little graph. As much as they might want to, they can't prove or verify anything by themselves. Not only that, but it takes way more than a correlation to prove causation, so (A) is definitely wrong. It's also true that scatter plots don't always show linear associations. Sometimes there can be clear-cut relationships between two variables that aren't linear. Since (A) and (C), and therefore (D), are all untrue, (B) is the only answer left over.

7. Which of the following is least likely to have a linear relationship?

Correct Answer:

Age and zip code

Answer Explanation:

Aside from some retirement communities (and the whole of Miami), your age is probably not at all related to where you live. Each of the others would probably exemplify a linear relationship much better than (B) because the two variables involved are linked by more than just denture cream. Gross.

8. Which of the following is untrue about scatter plots?

Correct Answer:

If two variables are correlated, their scatter plot will show a linear relationship

Answer Explanation:

While (A) and (B) are true, (C) isn't always the case. The key here is that there's more than one type of correlation because not all correlation between variables has to be linear. For instance, mobility and age aren't linearly correlated but they are still related in some way and would still show up on a scatter plot as something other than a line.
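To see why "related" and "linear" aren't the same thing, here is a small sketch using Pearson's r, a common measure of linear correlation. Choosing Pearson's r is our own assumption for illustration (the drill doesn't name any formula), and the data is invented: one set of points sits exactly on a U-shaped curve, the other exactly on a line.

```python
import math

def pearson_r(xs, ys):
    """Pearson's linear correlation coefficient for paired lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)  # assumes neither list is constant

xs = [-3, -2, -1, 0, 1, 2, 3]
curve = [x ** 2 for x in xs]     # perfectly related, but along a U shape
line = [2 * x + 1 for x in xs]   # perfectly related along a straight line

r_curve = pearson_r(xs, curve)
r_line = pearson_r(xs, line)
print(r_curve, r_line)
```

For the straight line, r comes out at 1. For the U shape it comes out at 0, even though every y value is completely determined by x. The relationship is there on the scatter plot. It just isn't a line.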

9. Why is it important to label the axes for a scatter plot?

Correct Answer:

To prevent misinterpretation of the data

Answer Explanation:

Without labeled axes, it's easy to confuse the variables or see a relationship that might not actually be there. To help make sure the scatter plot is crystal clear, it's best to always label the axes. It won't make your scatter plot any easier to draw, but it'll certainly prevent misinterpretation of your data. (Both you and your math teacher will be happy about that.)

10. How could data in a scatter plot possibly be wrong?

Correct Answer:

All of the above

Answer Explanation:

It's important to examine both the sources of the statistics and the way they're charted. Sure, we might take accurate measurements and graph them wrong—but it's also possible to graph inaccurate measurements right. That means both (A) and (B) are possible, and (C) is possible as well. Since they're all reasons for why a scatter plot might be incorrect (and there are plenty of other possible reasons, too), we'll go with (D) as the right answer.