# High School: Statistics and Probability

### Using Probability to Make Decisions HSS-MD.A.4

4. Develop a probability distribution for a random variable defined for a sample space in which probabilities are assigned empirically; find the expected value. For example, find a current data distribution on the number of TV sets per household in the United States, and calculate the expected number of sets per household. How many TV sets would you expect to find in 100 randomly selected households?

Students should be able to work backwards to figure out how empirical data will help them make approximations about samples and populations. Now, we're working with empirical (as in real) data and not theoretical data. That means it's time for the real world.

In the actual real world, outside of the cameras of reality TV, we may have problems where theoretical probabilities don't exist. Students must use sample data to arrive at their own actual probabilities, find expected values, and so on. That means knowledge of the equations $P(X = a) = {}_nC_a\,p^a q^{n-a}$ and $E(x) = x_1p_1 + x_2p_2 + \dots + x_ip_i$.
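As a rough sketch, the two formulas above can be written as small Python helpers. The function names `binomial_pmf` and `expected_value` are illustrative choices, and the coin-flip numbers at the end are an assumed example used only to check the helpers.

```python
from math import comb


def binomial_pmf(n, a, p):
    """P(X = a): probability of exactly a successes in n independent tries."""
    q = 1 - p  # probability of failure on a single try
    return comb(n, a) * p**a * q**(n - a)


def expected_value(values, probabilities):
    """E(x) = x1*p1 + x2*p2 + ... + xi*pi."""
    return sum(x * p for x, p in zip(values, probabilities))


# Assumed example: a fair coin flipped twice.
p_one_head = binomial_pmf(2, 1, 0.5)                          # exactly 1 head
mean_heads = expected_value([0, 1, 2], [0.25, 0.5, 0.25])     # expected heads

print(p_one_head)  # 0.5
print(mean_heads)  # 1.0
```

With empirical data, the probabilities passed to `expected_value` would simply be the observed relative frequencies.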

The best way to demonstrate this to students is to give them data (and lots of it), and have them figure out and calculate probabilities based on observation. They should also find expected values and calculate potential payoffs if necessary. This data can be made by you or simply searched for online to find actual scientific empirical data.

Students usually find real scientific data somewhat boring, so anything to do with making money (casinos, gambling, or the lottery) will usually spark the students' interest more than genetic sequencing of fruit flies. If you're determined to use those fruit flies, consider making them tap dance or something.

#### Drills

1. A casino executive is suspicious that some of his roulette wheels are not truly random. He feels that some players have taken advantage of this scenario and are making a lot of money. He told his technical advisor to find out through observation of 300 test spins how many times the ball lands in a certain number slot. The findings are as follows:

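| # on Roulette Wheel | # of hits |
|---|---|
| 1 | 3 |
| 2 | 2 |
| 3 | 5 |
| 4 | 1 |
| 5 | 7 |
| 6 | 34 |
| 7 | 1 |
| 8 | 2 |
| 9 | 86 |
| 10 | 2 |
| 11 | 6 |
| 12 | 3 |
| 13 | 4 |
| 14 | 1 |
| 15 | 0 |
| 16 | 0 |
| 17 | 6 |
| 18 | 0 |
| 19 | 4 |
| 20 | 0 |
| 21 | 3 |
| 22 | 2 |
| 23 | 5 |
| 24 | 1 |
| 25 | 6 |
| 26 | 23 |
| 27 | 55 |
| 28 | 7 |
| 29 | 3 |
| 30 | 2 |
| 31 | 8 |
| 32 | 4 |
| 33 | 1 |
| 34 | 5 |
| 35 | 8 |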

Calculate the expected value. Based on the data and expected value, of what use is the expected value?

Very little, as it only provides a weighted average of what number will occur, not the actual numbers that may be skewed.

We get each P(X) by dividing that number's hits by the total number of spins, 300. For instance, X = 1 for number 1 on the roulette wheel, and its number of hits (3) translates to a p value of 0.01 (3/300). That means the product of the two is 0.01. If such calculations are performed for every roulette number and added, we get E(x) = 17.03, which doesn't correlate to anything, really. The answer, then, is (A).
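The calculation can be checked with a short Python sketch. The variable names are illustrative; the hit counts are copied from the table in the drill.

```python
# Hits per roulette number (1 through 35), copied from the drill's table.
hits = [3, 2, 5, 1, 7, 34, 1, 2, 86, 2, 6, 3, 4, 1, 0, 0, 6, 0, 4, 0,
        3, 2, 5, 1, 6, 23, 55, 7, 3, 2, 8, 4, 1, 5, 8]
numbers = range(1, 36)

total_spins = sum(hits)  # should equal the 300 test spins

# Empirical probability of each number = its hits / total spins.
probabilities = [h / total_spins for h in hits]

# E(x) = sum of (number * empirical probability of that number).
expected = sum(x * p for x, p in zip(numbers, probabilities))

# The number that actually came up most often.
most_hit = max(numbers, key=lambda x: hits[x - 1])

print(total_spins)         # 300
print(round(expected, 2))  # 17.03
print(most_hit)            # 9
```

The expected value lands near 17, a number that was hit only 6 times, while 9 came up 86 times. That gap is why the expected value says so little about which slots are skewed.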

2. If you are unable to find a theoretical probability of an event occurring, which of the following is the best way to find the probability of that event?

Observe your event for as long as possible and note when it does or doesn't occur

Answer (B) is incorrect because you would only be marking down successes at that point, which is not how you should go about getting empirical probabilities (the number of successes out of the total number of tries). It is still possible to solve the problem, and a guess is not mathematically viable. Unfortunately.

3. You really want to find out whether you can make a living buying scratch-and-win tickets. If you match at least 5 out of 10 numbers you win a prize; otherwise you win nothing. Being savvy, you buy 100 tickets to observe the empirical probabilities. They come out as follows:

| # Identical Numbers | Frequency |
|---|---|
| 0 | 14 |
| 1 | 17 |
| 2 | 14 |
| 3 | 21 |
| 4 | 13 |
| 5 | 17 |
| 6 | 0 |
| 7 | 0 |
| 8 | 1 |
| 9 | 0 |
| 10 | 3 |

Calculate the probability in decimal form of getting at least 5 matching numbers on your scratch-and-win ticket.

0.014

We need to sum up the probabilities for getting 5, 6, 7, 8, 9, and 10 identical numbers overall. To do so, we divide the frequency of the tickets by 100. Then, we can calculate $P(X = a) = {}_nC_a\,p^a q^{n-a}$ where n = 10 and a goes from 0 to 10. If we add P(5), P(6), P(7), P(8), P(9), and P(10), we'll find the probability of getting at least 5 identical numbers. Doing so gives us a probability of about 0.014.

4. If you place 3 different banners on a special website that advertises your home business, you will get different click-through rates. The more clicks you get, the more sales you get. Hopefully, anyway. What is the expected value of your sales per click for all 3 ads combined?

| Ad | Clicks | Dollars Earned | Dollars per click | P(dollars per click) |
|---|---|---|---|---|
| 1 | 28,472 | 38,472 | 1.351 | 0.5 |
| 2 | 18,927 | 190,038 | 10.041 | 0.1 |
| 3 | 36,837 | 73,829 | 2.004 | 0.333333333 |

\$2.35 per click

We're given the probability of the amount of money earned for each click depending on the ad. Those numbers are our p values. Our X values are the various dollars earned per click for each individual ad. If we add the products of X and p and find E(x), we get 2.35. Answer (C) is incorrect because we're finding the expected value of the average amount of dollars per click we receive, not the average amount of money made per sale. All the other answers would only occur by mathematical mistake.
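A minimal Python sketch of this calculation follows. The variable names are illustrative, and the probabilities are used exactly as they are listed in the table.

```python
# One tuple per ad: (clicks, dollars earned, P(dollars per click)),
# with every figure taken as given from the drill's table.
ads = [
    (28472, 38472, 0.5),
    (18927, 190038, 0.1),
    (36837, 73829, 0.333333333),
]

# X values: dollars earned per click for each individual ad.
dollars_per_click = [earned / clicks for clicks, earned, _ in ads]

# E(x): add up each X value times its listed probability.
expected = sum(x * p for x, (_, _, p) in zip(dollars_per_click, ads))

print([round(x, 3) for x in dollars_per_click])  # [1.351, 10.041, 2.004]
print(round(expected, 2))                        # 2.35
```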

5. The CEO of a large hotel corporation has found out through numerous tests that the probability that her guests will buy a meal in her hotel's restaurant if a flyer is left in their room is 0.34. If a flyer is not left in the room, the probability is only 0.12. What is this an example of?

Empirical probabilities

The data was gathered, not calculated based on some theoretical standard. That means it was empirical rather than theoretical. The restaurant might serve portabella mushrooms, but it has little to do with the statistics behind the association of flyers to restaurant attendance.

6. A statistician working for Chocolicious has the task of figuring out what the probability is that the chocolate machines will malfunction and mix the wrong batches of chocolate powder. Each mix-up costs the company \$5,000. The statistician has calculated that this probability is 1 in 10 daily starts for production. What is the expected value of this loss for an entire month (30 days)?

\$15,000

Using the statistician's probability of 0.1, we use the formula $E(x) = x_1p_1 + x_2p_2 + \dots + x_ip_i$. Each day has an expected loss of \$5,000 • 0.1 = \$500, so over 30 days we get our answer of 30 • \$500 = \$15,000.
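The arithmetic can be sketched in Python as follows. The variable names are illustrative, and the sketch assumes each day is a single production start with at most one mix-up.

```python
cost_per_mixup = 5000  # dollars lost per mix-up
p_mixup = 0.1          # probability of a mix-up on one daily start
days = 30

# Two outcomes per day: lose $5,000 with p = 0.1, or lose $0 with p = 0.9.
daily_outcomes = [(cost_per_mixup, p_mixup), (0, 1 - p_mixup)]

# E(x) for one day, then scaled to the whole month.
expected_daily_loss = sum(x * p for x, p in daily_outcomes)
expected_monthly_loss = days * expected_daily_loss

print(round(expected_daily_loss, 2))    # 500.0
print(round(expected_monthly_loss, 2))  # 15000.0
```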

7. What are the chances that in 30 days, the machine does not malfunction at all?

0.04

Using the formula $P(X = a) = {}_nC_a\,p^a q^{n-a}$ where n = 30, p = 0.1, q = 0.9, and a = 0, we can find that $P(0) = 0.9^{30} \approx 0.042$. The closest answer is (B).
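A quick Python check of this value, using the standard library's `math.comb` (the variable names are illustrative):

```python
from math import comb

n = 30   # days of production
p = 0.1  # probability of a malfunction on any one day
q = 1 - p
a = 0    # number of malfunctions we are asking about

# P(X = a) = nCa * p^a * q^(n - a); with a = 0 this reduces to q^30.
prob_no_malfunction = comb(n, a) * p**a * q**(n - a)

print(round(prob_no_malfunction, 3))  # 0.042
```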

8. A volcanologist is interested in finding data that helps calculate the probability of a dormant volcano erupting. Assume that there are 100 dormant volcanoes on Earth. The volcanologist has calculated that there is a 0.001 probability of a dormant volcano going off in the next year based on prior empirical data. What is the probability that exactly 2 of them will go off in the next year?

0.004

We could find the probability distribution with n = 100, a = 2, p = 0.001, and q = 0.999. If we use the formula $P(X = a) = {}_nC_a\,p^a q^{n-a}$, we get P(2) = 0.00449, or about 0.004. This means the right answer is (D).
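The same number can be reproduced with a few lines of Python. This is a sketch with illustrative variable names, and it assumes the volcanoes erupt independently of one another.

```python
from math import comb

n = 100    # dormant volcanoes
p = 0.001  # empirical probability that one volcano erupts next year
q = 1 - p
a = 2      # number of eruptions we are asking about

# P(X = 2) = 100C2 * p^2 * q^98
prob_two_eruptions = comb(n, a) * p**a * q**(n - a)

print(comb(n, a))                    # 4950
print(round(prob_two_eruptions, 5))  # 0.00449
```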

9. The probability of a slot machine hitting a \$1,000 jackpot is 0.05 every time you play the machine. Each play costs \$50. What is the expected value after five tries?

\$0

Using the formula for the expected value, $E(x) = x_1p_1 + x_2p_2 + \dots + x_ip_i$, we can reduce the calculation to 5 • \$1,000 • 0.05 = \$250. Since we'd have to pay \$50 each time though, we need to subtract 5 • \$50 = \$250, which means we end up with \$0.
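As a small Python sketch of the same bookkeeping (the variable names are illustrative):

```python
jackpot = 1000      # dollars won when the jackpot hits
p_jackpot = 0.05    # probability of hitting the jackpot on one play
cost_per_play = 50  # dollars paid for each play
plays = 5

# Expected winnings over five plays: 5 * $1,000 * 0.05.
expected_winnings = plays * jackpot * p_jackpot

# What the five plays cost up front: 5 * $50.
total_cost = plays * cost_per_play

expected_net = expected_winnings - total_cost

print(round(expected_winnings, 2))  # 250.0
print(total_cost)                   # 250
print(round(expected_net, 2))       # 0.0
```

An expected value of \$0 means the game is fair on average, not that any particular five plays will break even.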

10. If you are unable to come up with a theoretical mathematical probability for an outcome, what is the best solution?