ShmoopTube
Where Monty Python meets your 10th grade teacher.
Finance: What is the standard normal distribution?
Description:
What is the standard normal distribution? The standard normal distribution is the distribution of the z-scores of normally distributed data, so it measures how many standard deviations each data point lies from the mean. In a normal distribution, about 68% of the data falls within +/- 1 standard deviation of the mean and about 95% within +/- 2 standard deviations.
Transcript
- 00:00
And finance a la Shmoop What is the standard normal distribution
- 00:08
The standard normal distribution is the distribution of the Z scores
- 00:11
of the data points from a normal distribution Okay but
- 00:15
why do we need to create a new normal distribution
- 00:19
like the new normal Isn't that a thing Wasn't the
- 00:21
normal distribution we already had good enough before We explain
- 00:24
why the standard normal distribution is such a huge improvement
- 00:27
on the plain old normal distribution but we need a
- 00:30
quick recap of the original A normal distribution or normal
- 00:34
curve is a continuous bell shaped distribution that follows the
- 00:37
empirical rule which says that sixty eight percent of the
- 00:40
data is between negative one and one Standard deviations on
- 00:43
either side of the mean ninety five percent of the
- 00:45
data is between negative two and two Standard deviations on
- 00:48
either side of the mean and ninety nine point seven
- 00:51
percent of the data is between negative three and three
- 00:53
Standard deviations on either side of the mean well the
- 00:56
regular normal curve has its peak located at the mean
- 00:59
x bar and is marked off in units of the
- 01:01
standard deviation s right there That's what it looks like
- 01:04
Adding the standard deviation over and over to the right
- 01:06
and subtracting the standard deviation over and over to the
- 01:09
left But what makes it normal The fact that sixty
- 01:12
eight percent of all the data is between one standard
- 01:14
deviation on each side of the mean That makes it
- 01:17
normal It's that sixty eight percent truism that makes it
- 01:20
a normal distribution Then ninety five percent of the data
- 01:23
is between two standard deviations on either side of the
- 01:25
mean That's another test for normalcy And ninety nine point
- 01:28
seven percent of the data is between three standard
- 01:30
deviations on either side Another test That's a third test
- 01:33
You passed all three you're normal Well tons of things
- 01:35
in nature and from manufacturing and lots of other scenarios
- 01:38
are normally distributed like heights of adult males or weights
- 01:42
of Snickers bars or the diameter of drink cup lids
- 01:46
or eleventy million other things Okay fun size Snickers have
- 01:50
a mean weight of twenty point oh five grams and
- 01:52
a standard deviation of point seven two grams and the
- 01:55
weights are normally distributed Well that gives us this distribution
- 01:58
of fun size Snickers weights The height of the
- 02:00
graph at any point is the likelihood of us getting
- 02:02
a candy bar of that specific weight The higher the curve
- 02:04
at a point the greater the chance we get that
- 02:06
exact weight This means that the fun size Snickers weight
- 02:09
we'll get the most often is that twenty point oh
- 02:12
five grams size that is smack dab in the middle
- 02:14
Right there Weights larger and smaller than that will be
- 02:17
less common in our Halloween candy haul Weights like seventeen
- 02:21
point eight nine grams or twenty two point two one
- 02:24
grams will be extremely rare because they're so far from the
- 02:27
middle and are at a part of the curve where
- 02:29
we have a very small likelihood of getting those weights
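As a rough illustration of the curve-height idea above, here is a minimal Python sketch. It assumes the Snickers figures quoted in the video (mean 20.05 g, standard deviation 0.72 g) and uses the standard library's statistics.NormalDist; the variable names are illustrative, not Shmoop's. Strictly speaking, the height of the curve is a probability density rather than a probability, so it is best read as a way to compare how likely different weights are relative to each other.

```python
from statistics import NormalDist

# Figures quoted in the video; the variable names are illustrative.
snickers = NormalDist(mu=20.05, sigma=0.72)

# Height of the curve (probability density) at the mean and at
# three standard deviations on either side of it.
weights = [17.89, 20.05, 22.21]
heights = {w: snickers.pdf(w) for w in weights}

for w, h in heights.items():
    print(f"{w:.2f} g -> curve height {h:.4f}")
```

The curve is tallest at the mean and symmetric around it, so the two extreme weights come out with matching, much smaller heights.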
- 02:32
So why should we even mess with the normal distribution
- 02:34
we already have by calculating Z scores to create a
- 02:37
standard normal distribution And well what the heck is a
- 02:39
Z score Anyway We'll answer the first question in just
- 02:42
a sec but a Z score is a value we calculate
- 02:45
that tells us exactly how far a specific data point
- 02:48
is from the mean measured in units of standard deviation
- 02:51
Z scores are a way to get an idea for
- 02:53
how large or small a data point is compared to all
- 02:56
the other data points in the distribution It's like getting
- 02:59
a measure of how fast a Formula One racecar is
- 03:02
compared not to regular beaters on the road but to
- 03:05
other Formula One race cars The Formula One car is obviously
- 03:08
faster than the Shmoop mobile here But is it faster
- 03:12
than other Formula One cars That's what really matters A
- 03:15
Z score will tell us effectively where that one Formula
- 03:18
One car ranks compared to all the other ones we
- 03:20
can speed test If it's got a large positive Z
- 03:23
score it's faster than many if not most of the
- 03:26
cars If it has a Z score close to zero well
- 03:28
then it's right in the middle of the pack speed wise
- 03:30
If it's got a large negative Z score well it's
- 03:32
the turtle to the other cars' hares Why would we
- 03:35
plot the Z scores instead of the scores themselves Well
- 03:38
because the process of standardizing or calculating and plotting
- 03:41
the Z scores of the data points makes any work
- 03:44
we need to do with the distribution about ten thousand
- 03:46
times easier When we calculate and plot the Z scores we
- 03:50
create a distribution that doesn't care anything about the context
- 03:53
of the problem or about the individual means or standard
- 03:56
deviations or whatever Effectively we create one single distribution that
- 04:01
works equally well for heights of people or weights of
- 04:04
candy bars or diameters of drink lids or lengths of
- 04:08
ring tailed lemur tails If we don't standardize by working
- 04:12
with Z scores we must create a normal curve that
- 04:14
has different numbers for each different scenario And we have
- 04:17
to do new calculations for each scenario for each different
- 04:21
set of values So let's explore the important features of
- 04:24
the standard normal distribution and how it differs from all
- 04:27
the other regular normal distributions The standard normal curve and
- 04:31
the regular normal curve look identical in shape They just
- 04:36
differ in how the X axis this thing right here
- 04:38
is divided Let's walk through an example where we compare
- 04:41
how the normal distribution of the actual data and the
- 04:43
standard normal distribution for the Z scores of the data
- 04:46
are created at the same time Okay What are we
- 04:48
gonna pick here Well let's pick narwhal tusks They're very
- 04:52
close to normal in their distribution with a mean length
- 04:55
of two point seven five meters and standard deviation of
- 04:57
point two three meters The regular normal distribution of narwhal
- 05:01
tusk lengths or narwhal distribution as I think of it will
- 05:05
have the peak located above the mean of two point
- 05:07
seven five meters We'll need the Z score of a
- 05:09
data point representing the length of two point seven five
- 05:12
to start labeling the standard normal distribution the same way
- 05:15
Well Z scores are found by subtracting the mean from
- 05:18
a data point and dividing that value by the standard
- 05:20
deviation of the data To find a Z score we
- 05:23
subtract the mean two point seven five from our data
- 05:25
point also two point seven five to get zero And
- 05:28
then we divide that by the standard deviation of point
- 05:30
two three Well we get a Z score for that
- 05:32
middle value of zero Here's the same normal curve of
- 05:35
the tusk lengths paired with the standard normal curve of
- 05:38
the Z scores Now for the tick marks on the
- 05:40
straight up tusk length distribution right there we add the
- 05:43
standard deviation of point two three three times to the
- 05:46
mean of two point seven five to get the tick
- 05:49
marks to the right of the mean Well we just get
- 05:51
what's that two point nine eight and then three point
- 05:53
two one since we're adding point two three to it And
- 05:55
then another point two three gets us three point four four
- 05:57
There we go and we repeat that procedure on the
- 06:00
left but subtract it three times So we get two point
- 06:02
five two two point two nine And then what is
- 06:05
that two point oh six on the left Well to
- 06:07
get these same values on our standard normal curve we
- 06:10
need to find some more Z scores The first tick mark
- 06:13
to the right of the mean is at a value of
- 06:14
two point nine eight meters Its Z score will be
- 06:16
found by taking two point nine eight and subtracting the
- 06:19
mean of two point seven five to get that point
- 06:20
two three and then dividing that by the standard deviation
- 06:23
of point two three Well we get one See that's
- 06:25
kind of a little mini proof there The second tick
- 06:28
mark to the right will be for data points at
- 06:30
three point two one meters Well when we subtract the
- 06:32
mean we get point four six which we divide by
- 06:35
point two three and get Z equals two and the
- 06:37
third tick mark there works out similarly and gets us Z
- 06:40
equals three See there it is Things will work out
- 06:42
similarly but negatively on the other side on the left
- 06:44
when we do the same thing for tick marks Negative
- 06:47
one negative two And then there we go Negative three
- 06:50
Well let's look at the two curves together One is
- 06:52
specific to the data of narwhal tusk lengths while the
- 06:55
other is standardized to represent the perfect normal curve usable
- 06:59
for all normal data regardless of context or the values
- 07:02
of the means or standard deviations So after standardizing does
- 07:07
the standard normal curve follow the empirical rule Yeah it's
- 07:11
a normal curve After all it's even in the name
- 07:14
standard normal curve See they kind of tipped me off
- 07:17
to those things There's still sixty eight percent of data
- 07:19
points between negative one and one on the standard normal
- 07:21
curve There's still ninety five percent of the data between
- 07:23
negative two and two on the standard normal curve And
- 07:26
there's still ninety nine point seven percent of the data
- 07:27
between negative three and three on the standard normal curve
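The tick-mark arithmetic and the empirical-rule check above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the narwhal figures quoted in the video (mean 2.75 m, standard deviation 0.23 m), the helper name z_score is ours, and the areas come from the standard library's statistics.NormalDist rather than from anything shown on screen.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Distance of x from the mean, in units of standard deviation."""
    return (x - mean) / sd


# Narwhal tusk figures quoted in the video.
mean, sd = 2.75, 0.23
ticks = [2.06, 2.29, 2.52, 2.75, 2.98, 3.21, 3.44]

# Each tick mark on the tusk-length curve maps to a whole-number Z score.
z_ticks = [round(z_score(x, mean, sd), 2) for x in ticks]
print(z_ticks)

# Share of data within k standard deviations on the standard normal curve.
standard = NormalDist()  # mean 0, standard deviation 1
within = {k: standard.cdf(k) - standard.cdf(-k) for k in (1, 2, 3)}
for k, share in within.items():
    print(f"within +/-{k}: {share:.1%}")
```

The 68, 95 and 99.7 percent figures in the empirical rule are rounded versions of the shares this prints.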
- 07:30
so getting back to the ten thousand times easier thing
- 07:33
Well it comes in when we try to answer questions
- 07:36
like how many of the gummy coated pretzel logs weigh
- 07:40
between twelve and fifteen grams So here's the set up
- 07:43
Gummy coated pretzel log weights are normally distributed with a
- 07:47
mean of thirteen point two grams and a standard deviation
- 07:50
of point seven eight grams We want to know what
- 07:52
percentage of pretzel logs that come out of the gummy
- 07:55
bear coating machine weigh between twelve and fifteen grams which
- 07:58
the company considers their ideal weight range and where it's likely that
- 08:01
customers wouldn't complain and send them back for being too
- 08:04
little or too big If we don't standardize things by
- 08:06
finding the Z scores of our boundary values of twelve
- 08:09
and fifteen grams we'll need some kind of technology to
- 08:11
interpret our mean standard deviation and boundary values in terms
- 08:15
of the normal curve specific to this situation If we
- 08:17
change anything about the problem like the boundary values or
- 08:21
mean or standard deviation well then we'll have to re
- 08:24
input all the new data and start completely over And
- 08:27
that would suck On the other hand since we know
- 08:29
that the data are already normally distributed well we can simply
- 08:33
standardize the two boundary values by calculating their Z scores
- 08:36
and use the majesty of the Z table this thing
- 08:39
to answer our questions which is a table telling us
- 08:42
what percentage of data lies to the left or right
- 08:45
of any Z score across the whole standard normal distribution
- 08:49
Many lives were lost and billions of dollars were spent
- 08:52
to build this thing so you know you gotta respect
- 08:54
it Not to put too fine a point on it
- 08:56
but if we don't standardize to Z scores we need to
- 08:58
use a unique normal curve and unique calculations every single
- 09:02
time we work with those situations But if we do
- 09:05
standardize to Z scores we just need to check the
- 09:07
one table for every situation It's like choosing to go
- 09:10
to a different store every time we need a different
- 09:13
product or going to one store that has all of
- 09:15
them in one place like you'd rather go to Safeway
- 09:18
than just the broccoli store and then the egg store
- 09:21
and then the milk store right So let's calculate our
- 09:23
two Z scores for our boundary values and then check
- 09:26
the Z Table to get our percentage of pretzel logs
- 09:28
in the sweet spot that twelve to fifteen range thing
- 09:31
Well we'll take the first data point twelve and subtract the
- 09:33
mean weight of thirteen point two giving us negative one
- 09:36
point two grams and then divide that by the standard
- 09:38
deviation of point seven eight which gives us a Z
- 09:40
score there of negative one point five three eight Then
- 09:42
we'll take the second data point fifteen subtract that mean
- 09:45
of thirteen point two to get one point eight then
- 09:47
divide that value by our standard deviation of point seven
- 09:50
eight to get a Z score of two point three oh
- 09:51
eight Well there are two different kinds of Z tables
- 09:54
One shows the area to the left of a specific
- 09:57
Z score The other shows the area to the right
- 10:00
They both give the same info so we'll just use
- 10:03
a left Z table A series of Z scores accurate
- 10:07
to the tenths place runs down the left hand side
- 10:09
and the hundredths place for each of those Z scores
- 10:11
runs across the top Well the percentage of data to
- 10:14
the left of a specific Z score can be found
- 10:16
at the intersection of a row and a column We'll
- 10:18
round both our Z scores to the hundredths place negative
- 10:21
one point five four and then two point three one
- 10:24
respectively in order to locate a percentage of data to
- 10:27
the left of each one Well we'll go down to
- 10:29
the negative one point five row then across to the
- 10:32
column here headed by the negative zero point zero four
- 10:35
where negative one point five Avenue intersects with negative zero
- 10:38
point zero four street and we find a percentage of
- 10:41
data to the left of Z equals negative one point
- 10:44
five four of zero point zero six one seven eight
- 10:48
This thing Well we'll then head way down to the
- 10:51
two point three boulevard then across to the point zero
- 10:53
one road They cross at point nine eight nine five
- 10:57
six So now what What do we do with these
- 10:59
two percentages Well glad you asked We know the
- 11:01
percentage of data to the left of our fifteen gram
- 11:03
upper boundary which is at a Z score of two
- 11:06
point three one We also know the area to the
- 11:08
left of our twelve gram lower boundary at a Z
- 11:10
score of negative one point five four And now it's time to
- 11:13
merge those two areas Check the area to the left
- 11:16
of the Z score of two point three one on
- 11:18
the standard normal curve This is the percentage of data
- 11:20
to the left of that value Now check the area
- 11:23
to the left of a Z score of negative one
- 11:25
point five four on the same standard normal curve Well
- 11:28
this is the percentage of data to the left of
- 11:30
that value If we cut away the area to the
- 11:32
left of Z equals negative one point five four we're
- 11:35
left with the area here between Z equals negative one
- 11:38
point five four and Z equals two point three one
- 11:40
This is the percentage of data between these two values
- 11:44
and you're looking at this really heavily to be sure
- 11:46
that you got enough in that general sweet spot range
- 11:49
They don't get a whole lot of returns from angry
- 11:50
customers Well we just need to subtract the point Oh
- 11:53
six one seven eight from the point nine eight nine
- 11:55
five six to get the percentage of data between those
- 11:57
two values which is yes about ninety three percent so
- 12:01
What does that mean Well that means ninety three percent
- 12:03
of the gummy coated pretzel logs produced will be between
- 12:06
twelve and fifteen grams in weight And that's either good
- 12:08
news or not Well a couple of important safety tips
- 12:12
though Before you all head out to the store for
- 12:14
some more gummy coated pretzel logs We should only
- 12:16
try to standardize i.e. do things with Z scores if
- 12:19
the data are normal in shape to begin with If
- 12:22
they're not the data machinations here will be useless
- 12:24
to you Make sure you're paying attention to what kind
- 12:26
of Z table you have Again some show areas to
- 12:29
the left while others give areas to the right of
- 12:32
specific Z scores Every time you've got a set of
- 12:35
normally distributed data you should standardize the situation by finding
- 12:39
Z scores And well you'll save yourself a ton of
- 12:42
work in the long run Well at least tons of stats
- 12:44
work if we can't help you Sorry I do
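For readers who want to check the pretzel log arithmetic from the walkthrough, here is a minimal Python sketch. It assumes the figures quoted in the video (mean 13.2 g, standard deviation 0.78 g, boundaries of 12 g and 15 g) and swaps the printed left-tail Z table for the standard library's statistics.NormalDist, which gives the same left-of-Z areas; the variable names are illustrative.

```python
from statistics import NormalDist

mean, sd = 13.2, 0.78    # pretzel log weights, in grams
low, high = 12.0, 15.0   # the company's ideal weight range

# Standardize both boundaries, rounding to two decimal places
# the way a printed Z table requires.
z_low = round((low - mean) / sd, 2)
z_high = round((high - mean) / sd, 2)

# Area to the left of each Z score on the standard normal curve
# (this stands in for the left-tail Z table lookup).
standard = NormalDist()
area_low = standard.cdf(z_low)
area_high = standard.cdf(z_high)

# Cut away the lower area to keep what lies between the boundaries.
share = area_high - area_low

print(z_low, z_high)
print(f"{area_low:.5f} {area_high:.5f} {share:.1%}")
```

Skipping the rounding step and using the unrounded Z scores shifts the answer only slightly, and it still comes out to about ninety three percent.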