# High School: Statistics and Probability

### Making Inferences and Justifying Conclusions HSS-IC.B.3

3. Recognize the purposes of and differences among sample surveys, experiments, and observational studies; explain how randomization relates to each.

Students should be aware of and understand the similarities and differences among the three most common types of data-gathering methods: sample surveys, experiments, and observational studies. Students should know that randomization means using chance to select a sample (or to assign subjects to groups) so that the sample represents the population as accurately as possible.

Sample Surveys

Sample surveys consist of randomly selecting a subset of people from our target population and measuring whatever characteristic we're interested in. If these people aren't selected randomly, the integrity of the study may be compromised. How can this happen?

Lazy researchers only sample the people who are easiest to sample. (They'll often make excuses like, "I think I'm stuck to the chair.") Sometimes these easily sampled individuals are predominantly English speakers, adults, or just people nearby. Sampling only from this limited group will yield results that aren't indicative of the larger population.

Biased researchers select certain people based on their preference either for the people or for how they want the study to turn out. (They'll often say things like, "I like you! You're in!") This, of course, isn't random at all. So the results of this non-random sample won't be useful in analyzing the population as a whole.

If the above non-random samples are used, then the only inferences that can be drawn are about populations similar to the ones tested and not the entire population. In order to be able to draw inferences about the entire population, we need to randomize how we obtain our sample survey.
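To see the difference between a lazy sample and a random one, here is a minimal Python sketch. Everything in it is made up for illustration: the population is just the numbers 0 through 999, imagined as people listed by distance from the researcher, and the function names `simple_random_sample` and `convenience_sample` are hypothetical helpers, not part of any standard.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Return n members of the population, each equally likely to be picked."""
    rng = random.Random(seed)  # seeded only so the example is repeatable
    return rng.sample(population, n)

def convenience_sample(population, n):
    """Return the first n members: the 'stuck to the chair' approach."""
    return population[:n]

def mean(values):
    return sum(values) / len(values)

# Hypothetical population: 1,000 people listed by distance from the researcher,
# where the measured value happens to grow with distance.
population = list(range(1000))

random_pick = simple_random_sample(population, 50, seed=1)
lazy_pick = convenience_sample(population, 50)

print("population mean:        ", mean(population))
print("random sample mean:     ", mean(random_pick))
print("convenience sample mean:", mean(lazy_pick))
```

The convenience sample only ever sees the 50 nearest people, so its mean sits far below the population mean. The random sample will usually land much closer, although any single random sample can still miss by chance.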

Experiment

An experiment is a process of trial and error that is used to test a hypothesis. We force a "cause" to observe an "effect." (This doesn't mean that our cause-and-effect relationship is valid; we're just measuring one.) After enough data is collected, we either reject the hypothesis or fail to reject it.

Most experiments we hear about are lab or controlled experiments. It's called "lab" for a reason. In these types of experiments, a control group is compared to a test group to see if there is a significant difference between the two.

Randomly allocating test subjects to control and test groups is how an experiment achieves randomization, and an experiment run this way is aptly called a randomized trial. That way, both the control and test groups receive a representative mix of test subjects and the results can apply to the entire population in question.
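Random allocation can be as simple as shuffling the list of subjects and cutting it in half. The sketch below shows one way to do that in Python. The subject labels and the helper name `randomly_assign` are assumptions made up for this example, and real trials often use more elaborate schemes.

```python
import random

def randomly_assign(subjects, seed=None):
    """Shuffle the subjects, then split them into control and test groups."""
    rng = random.Random(seed)   # seeded only so the example is repeatable
    shuffled = list(subjects)   # copy, so the original list stays untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical pool of 20 test subjects
subjects = [f"subject_{i}" for i in range(1, 21)]

control_group, test_group = randomly_assign(subjects, seed=42)

print("control:", control_group)
print("test:   ", test_group)
```

Because chance decides who lands in which group, the researcher can't (knowingly or not) stack one group with the healthiest, youngest, or most likable subjects.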

Double blinding is another safeguard that works alongside randomization to ensure that neither the researcher nor the test subject introduces any bias. For example, in a study measuring the effects of a drug, neither the doctor administering the experimental medication nor the patient receiving it knows whether the "medication" is the one being tested or the control medication.

Some experiments cannot occur in a lab (oftentimes because there are too many aspects that cannot be controlled in a laboratory environment). These are called field experiments or natural experiments.

An example can be a study of elephant behavior based on weather patterns. It'd be difficult to randomize a study like this based on many factors. Not only that, but it's nigh on impossible to control the weather, let alone isolate behavioral patterns solely due to the weather. Instead of studying the elephants, let's ride 'em!

Observational Study

An observational study is something like a randomized controlled trial (where subjects are randomly assigned to test and control groups), but not quite. In some scenarios, whether due to time, money, or legal or ethical concerns, it's not possible to have a specific control group and a treatment group. That means investigators must rely on existing control and trial groups in order to observe their study outcomes.

For instance, a scientist who wants to study radiation effects on humans 48 hours after a nuclear disaster probably wouldn't get many volunteers. That's not even mentioning the ethical, legal, and financial issues involved.

What can be done instead? An observational study, of course! That's the subject we're on, isn't it? This same scientist may have to wait for a terrible catastrophe to occur, like a nuclear meltdown, before he can begin his observations and compare them to a control group.

As such, observational studies are rarely, if ever, considered random (unlike controlled experiments and probability-based sample surveys).

Based on a description of the study, students should be able to identify the kind of study, the methods that promote or hinder randomization, and how the results of the study would be affected if certain elements of the study were altered.

#### Drills

1. A scientist selects 500 smokers to test how long they can hold their breath. Not surprisingly, the smokers can't hold their breath for long. The average result was a measly 23 seconds. What kind of study was this?

Observational study

No randomization is noted, nor any evidence of double blinding. That means (A), (B), and (D) are out of the question (since (A) and (B) are types of experiments). This is clearly just an observational study designed to measure something specific about smokers.

2. A pharmaceutical company is trying to figure out whether a drug called SmartiePants can make you smarter. (It also tastes like candy. The more you eat, the smarter you can get.) They prepare a double-blind study as follows:

Step 1: A randomly selected pool of individuals will be brought into a clinic and evaluated for any existing health conditions that would disqualify them from the experiment.

Step 2: After passing the health screening, the individuals will be split up into two groups: test and control.

Step 3: The control group will receive a placebo, but neither the clinician administering it nor the participants know this.

Step 4: The treatment group will receive SmartiePants, but neither the clinician administering it nor the participants know this.

Where is the mistake in this double-blind study?

Step 2

A health-screening test is a standard procedure for many medical and legal reasons. Steps 3 and 4 describe double blinding and are accurate. The mistake is in Step 2 because it does not specify whether the individuals are split up randomly into the control and test groups or not. If they are not, bias may be introduced into the experiment.

3. Which of the following is a random sample?

Neither (A) nor (B)

Selecting the best athletes is giving preferential treatment and will give the average of the best athletes, not the entire track team's average. Even though (B) is a "random" selection of the closest people to you, it's not random throughout the class. (For instance, if you're the class clown and constantly disturbing people sitting next to you, their GPAs are likely to be lower than those of people sitting farther away.)

4. Which of the following is rarely, if ever, random?

Observational study

Experiments and sample surveys should be random for statistical validity. Observational studies often cannot (and sometimes simply do not) concern themselves with randomization for temporal, ethical, legal, financial, or other reasons.

5. Which of the following sample types is valid for a study that measures the average time on the job for all workers of a company?

Randomly selected workers

Workers who are liked by their boss may work fewer hours than those who are disliked by their boss or vice versa, so (B) is invalid. Sampling from workers who work for over forty hours a week explicitly and directly affects the results of the study and is therefore biased. The only valid sample is (A).

6. You decide to play a prank on your friends (in the name of science, of course) and give them laxative brownies. You then time how long it takes for each to have to go to the bathroom. What is this an example of?

An observational study

We doubt any digestively regular scientist would excuse laxative brownies as a legitimate scientific study. Sample surveys consist of sampling a randomly selected portion of a population in order to draw inferences about the entire population. This clearly isn't the case. You observed what happened to your friends, so (C) is right. We don't blame you if you answered (D), though.

7. Why would it be a bad idea to claim the study in the question above was a sample survey?

All of the above

Is your population in question your class? All your friends? In order to conduct a sample survey, the study must be thought out. After defining your population, a group of people must be randomly selected from that pool, something you didn't do before. (Also, feeding laxative brownies to random strangers would be a bad idea in and of itself.)

8. Randomization, or better yet, the lack of proper randomization, may be most influenced by which of the following?

(A) and (C)

While (B) is something that may be necessary after a study is published, and may catch randomization errors as well, (A) and (C) are likely to introduce bias. Unfortunately, both (A) and (C) do happen somewhat often.

9. Which of the following is not randomly chosen?

None of the above

Both (A) and (B) are randomly chosen. That is, unless there are ways you can cheat, which the question and answers do not imply.

10. You decide to conduct an experiment to see which type of engine oil will make your car run better. Where does this experiment make a mistake with respect to bias?

Step 1: Randomly pick four different types of engine oil from the local auto shop.

Step 2: Change out the old oil as best you can to avoid residue.

Step 3: Pour in the new oil, always the same amount as the previous oil.

Step 4: Drive around for the same distance and amount of time for each different type of oil.