Whenever we collect data, there's a collection of possible values from which we record our observations. If we're flipping a coin, the possible values we can observe are H (heads) or T (tails). Or, occasionally, the very rare E (edge). If we're measuring someone's height in centimeters, the possible values are any positive number of centimeters and fractions thereof. There are two different ways to classify data based on the possible values we can observe.

Data is **discrete** if there is clear separation between the different possible values. Either there will be a finite number of possible values, or we're counting something.

If we flip a coin and record the result, there are only two possible values (ignoring that pesky "edge" thing), H and T, so our observations are discrete.

Recording the numbers of coins in different piggy banks would also give us discrete data, since there's a separation of one whole coin between any two numbers we might get. Even a half-dollar is still a whole coin.

Sets of data that record counts of things are discrete.

However, data is **continuous** if there's no clear separation between possible values. Like if two values are still kinda-sorta seeing each other, but haven't really discussed if they're an "item."

If we measure someone's height in centimeters we could get 160 cm, or 160.01 cm, or 160.001 cm (assuming we had a very accurate method of measurement). For any two possible values (say, 160 cm and 161 cm), there's another possible value between them (160.5 cm). Those infuriating numbers can always be broken down into smaller and smaller numbers. It's part of the reason we love them so much. Can't count with them, can't count without them.
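That "always another value in between" idea can be seen with a quick midpoint sketch (a toy illustration, not part of the original lesson; the specific heights are just example numbers):

```python
# Between any two possible heights there's always another possible height:
# repeatedly take the midpoint and watch the gap shrink forever.
low, high = 160.0, 161.0  # example heights in centimeters

for _ in range(5):
    mid = (low + high) / 2  # a value strictly between low and high
    print(f"{mid} cm lies between {low} cm and {high} cm")
    high = mid  # zoom in and repeat

# First midpoint printed is 160.5 cm, then 160.25 cm, 160.125 cm, ...
# A coin flip, by contrast, has no value "between" H and T.
```

No matter how many times we zoom in, a new in-between value appears, which is exactly what makes height continuous rather than discrete.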

Sets of data involving measurements that can have fractions or decimals are generally continuous.
