What is a probability distribution? Statistical analysis compiles and breaks down data gathered across a wide range of categories and criteria. When we assess how a variable affects the possible outcomes, given a host of other factors and scenarios, some outcomes turn out to be more likely than others. A probability distribution organizes those outcomes in a graph or other display that shows how likely each one is.


sell your screenplay in the next five years for a hundred grand. Over here, just

right of the middle, is the probability you sell your screenplay for a dollar,

but to the shady guy at Coffee Bean who's promising you a picture deal

at Paramount. And over here, just left of the middle, is the probability that

you're still a barista forever. The most likely stuff lives in the middle. As we

slide toward either end, things get less and less likely. So why is it called a

distribution? Well, because the potential outcomes, i.e., things like winning a

lottery, selling a screenplay (yeah, they're kind of the same thing), or [happy people with money]

meeting someone on Tinder whose picture was taken less than ten years and twenty [old man walks into park]

pounds ago, carry a range. That is, the potential outcomes are distributed on a

long line that then gets visually mapped to explain the character or feelings

that describe this set of potentialities. Well, the most common

continuous probability distribution is the normal curve, or normal distribution.

You may know it better by its somewhat common nickname: this hilly-looking thing

called the bell curve. Well, the mean, located in the middle where the peak is, [bell curve analysis]

right there, is usually labeled mu (μ), which represents a population mean. The units

on each side are plus and minus one, two, and three standard deviations. Sigma (σ) is

the symbol for a population standard deviation. Well, the normal curve was

developed when researchers started comparing tons of measurements of things

like heights of giraffes, or diameters of plastic lids for drink cups, or lengths

of... well, let's just say it got a little competitive there in the lab. Turns out

that tons of things, both man-made and nature-made, tend to end up having a normal

curve shape to their measurements.
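
To see that bell shape show up, here is a minimal Python sketch that simulates a pile of measurements and prints a rough sideways histogram. The mean, standard deviation, and sample size (MU, SIGMA, N) are made-up values for illustration, not numbers from the video, and the data is simulated rather than measured.

```python
import random
from collections import Counter

random.seed(0)  # fixed seed so the sketch is repeatable

MU, SIGMA = 100.0, 15.0  # hypothetical mean and standard deviation
N = 10_000               # hypothetical number of measurements

# Simulated measurements drawn from a normal distribution.
measurements = [random.gauss(MU, SIGMA) for _ in range(N)]

# Bucket each measurement by the nearest whole number of sigmas from the mean.
buckets = Counter(round((x - MU) / SIGMA) for x in measurements)

# One '#' per 100 measurements: the bars should be longest in the middle.
for k in range(-3, 4):
    bar = "#" * (buckets[k] // 100)
    print(f"{k:+d} sigma | {bar}")
```

The bars should come out longest near 0 sigma and shrink toward either end, which is the hilly shape described above.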

With heights of women, they found that a certain height, 5 feet 4 inches, showed up [woman on a graph]

more than any other. That one height showed up with the greatest probability.

Heights taller than 5'4" and heights shorter than 5'4" showed up less often.

Well, the farther the height was from 5'4", the less likely it was to occur, because,

well, really tall women and really short women aren't that common, and average [short, medium and tall women in a row]

height women are very common. When they plotted the heights and their associated

probabilities with thousands of results, they got a shape that became known as the

normal curve. Because of the shape of the normal curve, about 68% of all the possible

data lands between the first tick marks on each side of the mean, plus and minus

1 sigma; about 95% of all the possible data lands between the second tick marks on

each side of the mean, plus and minus 2 sigma; and about 99.7

percent of all the possible data lands between the third tick marks on each

side of the mean, plus and minus 3 sigma. So think about the

heights of women and where they would map here. We're going to show you, for no

extra charge, where they go on one, two, and three sigma there. Yeah, those are the [sleeping man falls out of chair]

heights. Well, graphically, the empirical rule

shakes out like this. In the words of Master Yoda: "Worth memorizing, this curve

is." Well, we can use these percentages to determine how much of the possible data

lands between different values on the normal curve. So let's say we get curious

and decide to measure the length of every tail of every ring-tailed lemur we [lemurs playing in grass]

come across, which on the streets around here in Silicon Valley is actually more [lab technician measuring tail]

than you would think. All right, well, then we plot those tail lengths along with

how often they showed up. We'd get a normal curve of tail lengths. The mean, or

average, tail length would be at the peak in the middle, meaning that it was the

measurement we got most often. Well, the tick marks on the x-axis would be found

by adding the standard deviation of the tail lengths to the mean once, twice, and

thrice, and then subtracting the standard deviation from the mean once, twice, and

thrice. About 68% of the lemurs we measured would have tail lengths between

one sigma and negative one sigma. There you go.

About 95% of the lemurs we measured would have tail lengths between two sigma and

negative two sigma. There we go. About 99.7 percent of the

lemurs we measured would have tail lengths between three sigma there and

negative three sigma, right, all in that area. As another example, the machine

that makes the lids for drink cups doesn't make them the same size every [drinking lid production line]

time. Because of variations in the temperature of the plastic and of the

mold and of the quality of the plastic, and because a butterfly flapped its [butterfly on flower]

wings in Jamaica, the machine will produce lids that are usually around a

targeted diameter but also slightly larger or slightly smaller. In fact, the

diameters of plastic lids for a certain size of drink cup are known to be

normally distributed. Those lids have a mean diameter of mu = 3.8125

inches and a standard deviation of sigma = 0.051 inches.

Well, this means that we can create a normal curve with actual numbers on the

x-axis. The mean value in the middle will be 3.8125

inches. We'll then add 0.051 inches once, twice, and three times (a lady)

to the mean to get the values on the right, and subtract 0.051 from

3.8125 three times to get the values on the left. Only

lids in a range of sweet-spot diameters will fit tightly on the cup.
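
The tick-mark arithmetic above can be sketched in a few lines of Python. The mean and standard deviation come from the lid example; the function names tick_marks and share_within are hypothetical helpers made up for this sketch. The second helper relies on a standard fact about normal distributions: the share of data within k standard deviations of the mean equals erf(k / sqrt(2)), which is where the rounded 68, 95, and 99.7 percent figures come from.

```python
import math

MU = 3.8125    # mean lid diameter in inches (from the lid example)
SIGMA = 0.051  # standard deviation in inches (from the lid example)

def tick_marks(mu, sigma, max_k=3):
    """Return (k, mu + k * sigma) pairs for k = -max_k .. max_k."""
    return [(k, round(mu + k * sigma, 4)) for k in range(-max_k, max_k + 1)]

def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

ticks = tick_marks(MU, SIGMA)
for k, x in ticks:
    print(f"{k:+d} sigma: {x:.4f} in")

for k in (1, 2, 3):
    print(f"within {k} sigma: {share_within(k):.2%}")
```

The exact shares are slightly different from the round numbers quoted by the empirical rule, which is why the rule is best read as "about" 68, 95, and 99.7 percent.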

This sweet spot ranges between 3.7105 inches and

3.9145 inches. Well, what percentage of lids will be between

3.7105 inches and 3.9145 inches in

diameter, and therefore usable? Well, we're trying to find the percentage of

lids that will be produced that are between those values at negative two

sigma, 3.7105, and two sigma, 3.9145. Yeah, well,

according to the empirical rule, about 95 percent of the data lies

between these two values. And the empirical rule? That's [hand places lid on cup successfully]

the rule of trying things out and seeing what happens. So that's what the

data is telling us. Well, about 95 percent of the lids produced on

this machine will be in the sweet-spot range and fit tightly on the cups. The [surfer holding up]

empirical rule isn't the only game in town when it comes to normal curving, but

we'll save the other ways to play on the normal curve for a separate video where [monopoly game]

the normal curve gets the spotlight all to itself. Well, there are other kinds of

probability distributions that don't cover every possible number, decimal,

and fraction. They're called discrete probability distributions. They usually

hang out in tables and sometimes in formulas. Turning 18 is great. You can

vote, you can be drafted, you can buy lottery tickets. One quick scratch-off

and you could be on easy street, right? And maybe not so easy. Grab a magnifying

glass and peep at the back side of the lottery ticket. Yep, there's a probability

distribution on the back. It shows all the prizes you could win. It also shows the

probabilities, or likelihoods, that you win those prizes, and it's a total downer. So [woman disappointed with numbers]

maybe you should ignore it and just scratch and pray. You want that new

jacuzzi with shiatsu massaging jets, and it ain't cheap, right? Well, specifically, [fancy jacuzzi]

this is a discrete probability distribution, which just means we have a

fixed number of outcomes. In this case, there are six possible outcomes: we can

win five different dollar amounts, and we can also win zilch. Well, check out the

probability of winning $0: 78% of the time, you get nothing. And then

there's a one in two thousand, or 0.05 percent, chance of winning $100. You know,

it would have been better if Grandma had just given us the money she used to buy

the tickets instead of the tickets themselves. Yeah, there are a few other

kinds of discrete probability distributions. All of them can be

placed in tables if we want to. One specifically has a swaggy formula that

helps us generate the probabilities for each possible outcome, and it's known as

the binomial probability distribution, or BPD for short. Well, to see this thing in

action, we need to have a situation where there are only two things that can

happen. We'll call winning any kind of moolah on [woman happy with being given money]

that lottery ticket a success. We'll call ending up with squat a failure. Well, the

BPD requires exactly two possible outcomes. If there are more than two, we

can't use the BPD. The BPD also requires that the chance of a success always

stays the same. If there's a 22 percent chance of winning on the first

ticket of that kind, well, there needs to be a 22 percent chance of [stack of scratch and win tickets]

winning for all the same kinds of tickets, right? So it can't be like you're

picking cards off a deck and all of a sudden there's one less jack, so the odds [card dealer fanning cards]

change every card. Well, the BPD also requires that we don't just spend the

rest of our lives scratching off those tickets. We have to pick a set number of

tickets we're gonna scratch and, you know, stick to it. Meeting those conditions is

vital. If we meet them, we can answer questions like: if you splurge on ten

tickets, how likely is it that you win on five of them if there's a 22 percent

chance of winning on each ticket? All right, well, we'd pop in 10 for n, 5 for k,

and 0.22 for p there. All right, with all the numbers plugged in, we have

this ugly-looking equation. Yeah, grab a calculator there. But we need to

deal with that combinations thing, you know, the 10C5 thing. Yeah, it also has [formulas on screen]

its own formula involving factorials, which actually finds all the different

orders of what could happen on those 10 cards. Like, we could go win, win, loss, loss,

loss, win, loss, win, loss, win, or maybe we get loss, loss, loss, loss, win, win, win, win,

loss, win. Yeah, well, the 10C5 finds the total

number of possible ways the card combos could shake out. For those of you with a

TI graphing calculator like a TI-84 or similar, we've got you. Type in the n, 10 [hand using graphing calculator]

in this case, press MATH, go over to the PRB menu, choose nCr, then type in the k; it's

5 in this instance, and it should look like "10 nCr 5" on your screen. And then

hit ENTER. Well, with our combinations number, 252, there safely in hand, we can

knock out the rest. Our answer will tell us how likely it is to win on half of [formulas on screen]

the 10 tickets you buy when there's a 22 percent chance of winning on each one.

Yikes, only a 3.75 percent chance of winning on half those tickets. You'd have

been better off spending the money on gas station nachos. At least then you'd [woman with nachos]

have a stomachache to remember your money by. Maybe just toss your money in

the trash and skip the middleman. [woman throws nachos away]
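
For anyone without a graphing calculator handy, the ticket calculation above can be reproduced with a short Python sketch. It uses the standard binomial formula, C(n, k) * p^k * (1 - p)^(n - k), with n = 10, k = 5, and p = 0.22 from the example. The function name binomial_pmf is a made-up helper for this sketch, and math.comb needs Python 3.8 or later.

```python
from math import comb

def binomial_pmf(n, k, p):
    """Probability of exactly k successes in n independent tries, each with success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, k, p = 10, 5, 0.22  # ten tickets, five wins, 22 percent chance per ticket

ways = comb(n, k)              # the "10 nCr 5" step
prob = binomial_pmf(n, k, p)   # chance of winning on exactly five of the ten tickets

print(f"10 nCr 5 = {ways}")
print(f"P(exactly 5 wins) = {prob:.4f}")
```

The same helper works for any other number of wins, so calling binomial_pmf(10, 0, 0.22) would give the chance of winning on none of the ten tickets.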