Z-Score

Categories: Metrics, Trading, Education

A z-score measures how far an individual data point lies from the mean of its data set, expressed in units of that data set's standard deviation.

We calculate a z-score by subtracting the mean from the data point and then dividing the result by the standard deviation. Z-scores are a convenient way to compare two numbers that come from different data sets.
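Written as a formula, the calculation looks like the block below. The symbols are an assumption borrowed from the video transcript further down the page: x for the data point, x-bar for the mean, and s for the standard deviation.

```latex
z = \frac{x - \bar{x}}{s}
```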

Let’s say we have a Red Delicious apple that weighs 5.9 ounces and a navel orange that weighs 8.9 ounces. We can determine which is relatively larger by finding the z-score for each fruit. Red Delicious apples have a mean weight of 5.3 ounces with a standard deviation of 0.6 ounces. To get the z-score for our apple, we subtract the mean (5.3) from the apple’s weight (5.9), giving us 0.6. We then divide that value by the standard deviation (0.6 / 0.6) to get a z-score of 1. Navel oranges have a mean weight of 8 ounces with a standard deviation of 1.5 ounces. That gives us (8.9 - 8) / 1.5, or 0.9 / 1.5, which is a z-score of 0.6 for the orange. Our apple has the larger z-score, so it is larger relative to other apples than our orange is relative to other oranges.
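The apple-and-orange arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `z_score` and its parameter names are made up for this example, and the weights, means, and standard deviations are the ones quoted in the paragraph above.

```python
def z_score(value: float, mean: float, std_dev: float) -> float:
    """Return how many standard deviations `value` lies from `mean`."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / std_dev


# Weights in ounces, taken from the fruit example in the text.
apple_z = z_score(5.9, mean=5.3, std_dev=0.6)
orange_z = z_score(8.9, mean=8.0, std_dev=1.5)

print(f"apple z-score:  {apple_z:.2f}")
print(f"orange z-score: {orange_z:.2f}")
print("relatively larger:", "apple" if apple_z > orange_z else "orange")
```

Because each weight is divided by its own fruit's standard deviation, the two results are unitless and can be compared directly, which is the whole point of the z-score.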

Who says you can’t compare apples and oranges?

Related or Semi-related Video

Finance: What are z-scores?

00:00

Finance a la Shmoop What are Z scores A Z score

00:08

tells us the distance that a data point is away

00:10

from the mean using units of the standard deviation always

00:15

wanted a career in the high profile field of comparing

00:18

apples and oranges Well all you need is a couple

00:20

of Z scores and all of a sudden those apples

00:23

and oranges don't seem quite so different After all what

00:26

Z scores do best is allow you to take data

00:29

points from two entirely different sets of data like your

00:31

grade in your section of applied psychology of Tinder and

00:34

your best friend's grade in his section of the same

00:37

course taught by a different instructor and compare them as

00:40

if they came from the same data set to see

00:43

which is objectively larger or smaller Or well maybe you

00:46

want to do that to see who did better in

00:49

the course Nothing wrong with a little healthy competition particularly

00:51

when you're trying to figure out your Tinder score Well

00:53

Z scores have one job and that's to tell us

00:56

how far a single data point is from the mean

00:58

But the measuring stick we use isn't in inches or

01:01

meters or even Egyptian cubits Well we use the standard

01:04

deviation of the data set as the ruler That is

01:07

the standard deviation tells us the standard distance or change

01:11

from the mean or middle of the data set or

01:14

said another way standard deviations tell us how far on

01:16

average a data point is from the mean of the

01:19

data set Well using that standard deviation as a standard

01:21

of measure means that we'll be checking to see how

01:24

far our point of interest is away from the mean

01:27

in comparison to other points in the same data set

01:29

This whole process allows us to compare apples and apples

01:32

inside the same data set so we can compare apples

01:35

and oranges later on or said another way It's about

01:37

how deviant the set of data points is inside of

01:40

whatever collection of data we're doing like a very closely

01:44

aligned Tinder set with low Z scores might be this

01:47

guy and this guy and this guy They're all probably

01:49

models and a disparate one with high standard deviation might

01:53

be this guy And uh this guy and well this

01:56

guy assuming that's a human So that'd be a high

01:58

standard deviation on the scores right Okay So let's say

02:01

we took the average daily high temperatures last six days

02:04

and got now seventy three seventy four seventy five seventy

02:06

five seventy six and seventy seven degrees The mean or

02:09

average of this data set turns out to be seventy

02:12

five degrees Strangely at least one of the days actually

02:15

had a temp of seventy five degrees So seventy five

02:17

is both an individual data point and the mean of

02:20

all the data Data points that are both in the

02:22

data set and equal to the mean of the data

02:25

set have a Z score of zero Z Scores can

02:28

also take on positive values when the data point of

02:31

interest is larger than the mean Well take the day when

02:34

the temp was seventy seven degrees right Well that's definitely

02:36

larger than the mean temp of seventy five degrees The

02:39

data point of seventy seven will have a positive Z

02:41

score A negative Z score happens when the point of

02:44

interest is smaller than the mean Like if we picked

02:46

the day when the temp was seventy three degrees we're

02:48

picking a data point smaller than that mean of seventy

02:50

five Any time a data point has a value smaller

02:53

than the mean It will also have a negative Z

02:56

score So let's look at a different scenario We all

02:58

know what a perfectly sized pineapple is Yes if asked every

03:01

single person in the world envisions exactly the same size

03:04

Pineapple must be some kind of weird collective consciousness thing

03:07

Let's pretend for the sake of the example that the

03:09

perfect size is the mean size of every pineapple that

03:13

ever existed Well imagine a pineapple that is quite a

03:16

bit smaller than our perfect pineapple The Z score for

03:19

this pineapple size will be negative because its size is

03:22

smaller than the mean size But as we imagine smaller

03:25

and smaller pineapples we get Z scores that get smaller

03:27

and smaller negative values like from negative one to negative

03:30

two and negative three and so on The farther we

03:32

get from the mean going left the smaller and smaller

03:35

negative Z scores we get right They're more negative Now

03:38

imagine a pineapple larger than the perfectly sized pineapple This

03:41

larger pineapple will have a positive Z score because its

03:45

size is larger than the mean size and as that

03:47

larger pineapple gets larger and larger we'll see it have

03:50

larger and larger positive Z scores like one two three

03:53

and so on Right So how do we actually calculate

03:55

this mythical Z score Well the formula's pretty simple We

03:59

take the data point x subtract the mean x bar

04:01

and divide the result by the standard deviation s looks

04:04

like that You and your friend are both taking the

04:05

class physics of quantum neutrino fields but with different instructors

04:09

who use different methods and give different assignments but cover

04:12

exactly the same material You got eighty seven percent Your

04:15

friend got eighty nine percent on the exam Well things

04:18

are looking grim for you in the eternal battle of

04:20

who's better But how can we really compare scores if

04:23

the teachers used different methods of instruction and assessment Well

04:27

if we calculate a Z score for each of you

04:29

we'll be able to see how each of you did

04:31

relative to or compared to others in your own class

04:35

Your class had an average of seventy eight point one

04:38

percent with the standard deviation of five point four percent

04:40

Well what would your Z score be then Well we'll

04:43

take your eighty seven and subtract the mean of seventy

04:45

eight point one to get eight point nine Then we'll

04:46

divide that by the standard deviation of five point four

04:49

giving us a Z score of one point six four

04:51

eight You scored one point six four eight standard deviations

04:54

above the class average Good for you Now we'll find

04:58

your friend's Z score Her class had an average of

05:00

seventy five point four percent with a standard deviation of

05:03

eight point eight percent Well that's interesting so her average

05:06

was lower but the deviation higher So what's her Z

05:09

score Well her score eighty nine minus the class average

05:12

of seventy five point four gives us thirteen point six

05:14

We divide that by the standard deviation of eight point

05:16

eight to get a Z score of one point five

05:18

four five Yeah you actually did better relatively in your

05:23

course compared to your friend because your score in your

05:25

course is farther from the mean in a positive direction where

05:28

larger and larger positive Z scores live You over there

05:31

than her score is from the mean in her course

05:33

While Z scores are literally everything we should use to

05:36

compare apples and oranges We need a little context for

05:39

what very large or very small Z scores mean well

05:42

Z scores above four and below negative four are pretty

05:45

uncommon These kinds of Z scores generally mean that the

05:47

data is genuinely very large or very small and it

05:51

talks like that compared to the rest of the data

05:54

Z scores also have another use They're used to create

05:57

the standard normal distribution which is like any other normal

06:00

distribution but you know more standard The process of

06:03

creating a Z score is often called standardizing a score or

06:07

indexing it if we take a previously existing normal distribution

06:10

of heights of adults or lengths of rainbow bass or

06:13

weights of gummy bear covered pretzel rods and calculate the

06:17

Z scores for every data point and plot them well

06:20

we create Then a data set of standardized scores which

06:23

is still normal in shape And it's called the standard

06:25

normal distribution So yeah just remember that Z scores are

06:28

the best way to compare values in two different data

06:31

sets We take the data point subtract the mean from

06:33

it and then divide that difference by the standard deviation

06:35

of the data set Positive Z scores indicate a data point

06:38

larger than the mean The farther a point is above

06:41

the mean the larger the Z score Negative Z scores

06:44

indicate a data point smaller than the mean The farther a

06:47

data point is below the mean the smaller the Z score

06:50

A Z score of zero means the data point is

06:52

equal to the mean right is the mean So when

06:55

Mom drops an apple and an orange on the table

06:57

and demands you compare them well you just grab a

06:59

couple of means and standard deviations and calculate yourself some

07:02

good old fashioned low calorie Z scores She'll be so

07:05

impressed Mmm Tasty
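The temperature and exam examples from the transcript can be reproduced with a short Python sketch. The numbers come straight from the video. The helper name `z_scores` is hypothetical, and using the sample standard deviation (`statistics.stdev`) for the temperatures is an assumption, since the video never says which version of the standard deviation it has in mind.

```python
import statistics


def z_scores(data):
    """Standardize every point in `data` (hypothetical helper for this example)."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # assumption: sample (n - 1) standard deviation
    return [(x - mean) / s for x in data]


# Six daily high temperatures from the video; their mean is 75.
temps = [73, 74, 75, 75, 76, 77]
temp_z = z_scores(temps)
for t, z in zip(temps, temp_z):
    # Below the mean -> negative, equal to the mean -> zero, above -> positive.
    print(f"{t} degrees -> z = {z:+.3f}")

# Exam comparison from the video: (score - class mean) / class standard deviation.
your_z = (87 - 78.1) / 5.4
friend_z = (89 - 75.4) / 8.8
print(f"your z = {your_z:.3f}, friend's z = {friend_z:.3f}")
```

The two 75-degree days come out with a z-score of zero, 73 is negative, and 77 is positive, as the video describes. The standardized temperatures also have a mean of 0 and a standard deviation of 1, which is what "standardizing" a data set produces.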
