Measure of Central Tendency

© 2019 Laureate Education, Inc.

Measures of Central Tendency Program Transcript

STACY BJORKMAN: With the basic vocabulary of the statistical language and the ability to organize sets of scores into tables and graphs under your belt, you are ready to begin computing and interpreting some basic descriptive statistics. Remember that descriptive statistics are those that summarize a data set. We will first focus on measures of central tendency, the statistics used to describe where the center of a data set is located.

Essentially, measures of central tendency tell us how the scores hang together in the sample. What scores are typical for the data set? The measures of central tendency we will learn about here all summarize larger data sets and give us a snapshot of where scores tend to fall. That is, central tendency describes where the center of the distribution tends to be located.

These measures allow us to get a sense of a lot of scores without looking at each individual score. They summarize the data for us. There are three common measures of central tendency: mode, median, and mean. We'll start with the mode.

The mode is the score that occurs most frequently in the data set. So if we have this data set of the number of times 10 people dine out in a week, 1 would be the mode because it occurs five times, more than any other number. More people dine out one time per week than any other frequency.
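As a minimal sketch, the mode of the dining-out example can be found by tallying how often each score occurs. The ten values are the ones read out in the mean calculation later in this transcript; the variable names are illustrative only.

```python
from collections import Counter

# Number of times each of the 10 people dines out per week
# (values as listed in the transcript's mean calculation).
dining_out = [1, 1, 1, 1, 1, 2, 3, 4, 5, 19]

# Tally how often each score occurs.
counts = Counter(dining_out)

# The mode is the score with the highest frequency.
mode_value, mode_count = counts.most_common(1)[0]

print(f"mode = {mode_value} (occurs {mode_count} times)")
```

Note that `most_common(1)` reports only one score, so it is suitable here only because no other score ties with 1.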

If the data set has one mode, we call it unimodal. Its distribution would have one hump. If a data set has two modes-- that is, if two scores are tied for the highest frequency-- we call it bimodal. Its distribution would have two humps. Press pause on this tutorial, and calculate the mode of this data set.

This data set has two modes-- 4 and 5-- because each occurs three times, more than any other value. The mode is a crude measure of central tendency because it only tells you about the most frequent score or scores. All the other scores are ignored. The mode does have a strength, though. It is the only measure of central tendency that can be used to summarize nominal data.
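The sketch below illustrates both points: a tie for the highest frequency gives two modes, and the mode still works on nominal data. The practice data set shown on screen is not reproduced in this transcript, so both lists here are hypothetical stand-ins, chosen only so that 4 and 5 each occur three times.

```python
from statistics import multimode  # available in Python 3.8+

# Hypothetical scores (NOT the tutorial's on-screen data set):
# 4 and 5 each occur three times, so the data set is bimodal.
scores = [2, 3, 4, 4, 4, 5, 5, 5, 6]
modes = multimode(scores)
print(modes)

# Hypothetical nominal (qualitative) data: a median or mean is
# undefined for category labels, but the mode can still be found.
cuisines = ["pizza", "sushi", "pizza", "tacos", "pizza"]
favorite = multimode(cuisines)
print(favorite)
```

`multimode` returns every score tied for the highest frequency, which makes it a safer choice than a single-value mode when ties are possible.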

The median and mean require quantitative data, but the mode can be used on qualitative variables. The median is the score in the very middle of the data set when the numbers are put in numerical order. Half of the scores in the data set fall above the median, and half fall below it. That is, the median falls at the 50th percentile.

If we put the dining behavior data set with the mode of 1 in numerical order, we have this. There is an even number of cases in this data set, so we have two middle numbers. To find the median, we add those two values together and divide by 2. Essentially, we find the midpoint between those two values.

Here, the two middle values are 1 and 2. 1 plus 2 equals 3, and 3 divided by 2 is 1.5, so the median is 1.5. This tells us that half of the people in the sample dine out fewer than 1 and 1/2 times per week, and half of the people dine out more than that. Press pause on this tutorial, and calculate the median of this data set.
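The median procedure for the dining-out example can be sketched as follows: sort the scores, then take the middle value, or the midpoint of the two middle values when the count is even. The variable names are illustrative; `statistics.median` from the Python standard library applies the same rule.

```python
import statistics

# Dining-out scores from the transcript's example.
dining_out = [1, 1, 1, 1, 1, 2, 3, 4, 5, 19]

# Step 1: put the scores in numerical order.
ordered = sorted(dining_out)
n = len(ordered)

# Step 2: pick the middle score, or average the two middle scores.
if n % 2 == 0:
    lower_middle = ordered[n // 2 - 1]  # 5th score
    upper_middle = ordered[n // 2]      # 6th score
    median_value = (lower_middle + upper_middle) / 2
else:
    median_value = ordered[n // 2]

print(median_value)                   # 1.5
print(statistics.median(dining_out))  # same result from the library
```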

The median is 4 because it falls in the center of the data set. A data set can have only one median, unlike the mode. But it still ignores a lot of values, which is one disadvantage. However, this isn't always a bad thing. We'll take a look at why in a minute. The mean is the most common measure of central tendency. It is what we typically think of as the average. To find the mean, you add up all the scores in the data set and divide by the number of scores in the data set.

For our dining out sample, the mean is 1 plus 1 plus 1 plus 1 plus 1 plus 2 plus 3 plus 4 plus 5 plus 19, which equals 38. 38 divided by 10 is 3.8. This tells us that on average, people in the sample dine out 3.8 times per week. The major advantage of the mean is that it takes all values into account, unlike the mode and median. However, it is important to know that taking all scores into consideration is not always the best snapshot of a distribution.
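The same arithmetic, as a short sketch using the ten scores read out above (variable names are illustrative):

```python
# Dining-out scores from the transcript's example.
dining_out = [1, 1, 1, 1, 1, 2, 3, 4, 5, 19]

# Add up all the scores, then divide by the number of scores.
total = sum(dining_out)
count = len(dining_out)
mean_value = total / count

print(total, count, mean_value)  # 38 10 3.8
```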

The example here is the perfect example of when the mean may not be the best measure of central tendency to summarize a data set. The mean is highly influenced by outliers, or scores that are far higher or lower than most of the other scores. The data point of 19 is an outlier in this data set. It is much higher than the other numbers. Including 19 in the calculation of the mean makes it appear that people dine out more frequently on average than they really do.

Let's calculate the mean again without the value of 19. The sum of this data set is now 19, and there are nine numbers in the data set. 19 divided by 9 equals about 2.1. The mean without 19 included is 2.1, which is lower than the mean of 3.8 with 19 included. Press pause on this tutorial, and calculate the mean of this data set.
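A small sketch makes the effect of the outlier concrete by computing the mean with and without the score of 19. The median is included as an added comparison, not something computed in the transcript, to show how much less it moves.

```python
import statistics

# Dining-out scores from the transcript's example.
dining_out = [1, 1, 1, 1, 1, 2, 3, 4, 5, 19]

# Drop the outlier (the single score of 19).
without_outlier = [score for score in dining_out if score != 19]

mean_with = statistics.mean(dining_out)          # 38 / 10
mean_without = statistics.mean(without_outlier)  # 19 / 9

median_with = statistics.median(dining_out)
median_without = statistics.median(without_outlier)

print(f"mean:   {mean_with:.1f} -> {mean_without:.1f}")
print(f"median: {median_with} -> {median_without}")
```

Removing one score lowers the mean by about 1.7 but lowers the median by only 0.5, which is why the median is less sensitive to outliers.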

The mean is 4.09. We try to use the mean as often as possible. But the median is the preferred measure of central tendency when distributions have outliers or are skewed. To illustrate this, let's look at some distributions. We see that in a normal distribution, the mean, median, and mode are all the same. When a distribution is positively skewed, the mode and median are less than the mean. When a distribution is negatively skewed, the mode and median are greater than the mean.
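The ordering of the three measures under skew can be checked with a short sketch. The dining-out data set from earlier in this transcript is positively skewed, because its outlier of 19 stretches the right tail. The negatively skewed list is a hypothetical mirror image constructed here for illustration (each score subtracted from 20) and does not appear in the tutorial.

```python
import statistics

def summarize(scores):
    """Return (mode, median, mean) for a list of numeric scores."""
    return (
        statistics.mode(scores),
        statistics.median(scores),
        statistics.mean(scores),
    )

# Positively skewed: the transcript's dining-out data (long right tail).
positive_skew = [1, 1, 1, 1, 1, 2, 3, 4, 5, 19]

# Negatively skewed: hypothetical mirror image (long left tail).
negative_skew = [20 - score for score in positive_skew]

pos_mode, pos_median, pos_mean = summarize(positive_skew)
neg_mode, neg_median, neg_mean = summarize(negative_skew)

print(pos_mode, pos_median, pos_mean)  # mode < median < mean
print(neg_mode, neg_median, neg_mean)  # mode > median > mean
```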

The mean is a very useful descriptive statistic. But it is also the cornerstone of just about every inferential statistic out there. Let's take the example of comparing two drugs' abilities to lower blood pressure. To conduct such a test, we compare the mean blood pressure scores of people taking Drug A and people taking Drug B.

If our inferential statistics tell us that there is a significant difference between the mean blood pressure readings of the people taking the two drugs, then we know which drug is significantly better at treating high blood pressure-- the drug whose participants have the lower mean blood pressure score. There is a relationship between the type of drug taken and blood pressure. That's the story that the statistics told us.
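To show how the group means feed into such a comparison, here is a rough sketch under stated assumptions. The blood pressure readings are hypothetical, and Welch's t statistic is used as one common way to compare two means; the transcript does not name a specific test. Deciding whether the difference is significant would also require comparing the statistic against a t distribution, which this sketch does not do.

```python
import math
import statistics

# Hypothetical systolic readings (mmHg); NOT data from the transcript.
drug_a = [128, 131, 125, 130, 127, 129]
drug_b = [135, 138, 133, 137, 136, 134]

# Descriptive step: the mean of each group.
mean_a = statistics.mean(drug_a)
mean_b = statistics.mean(drug_b)

# Inferential step (assumed test: Welch's t statistic), which scales
# the difference in means by its estimated standard error.
standard_error = math.sqrt(
    statistics.variance(drug_a) / len(drug_a)
    + statistics.variance(drug_b) / len(drug_b)
)
t_stat = (mean_a - mean_b) / standard_error

# The group with the lower mean reading, before any significance check.
lower_mean_drug = "A" if mean_a < mean_b else "B"

print(f"mean A = {mean_a:.1f}, mean B = {mean_b:.1f}, t = {t_stat:.2f}")
print(f"lower mean blood pressure: Drug {lower_mean_drug}")
```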

Remember to use caution when interpreting the results of descriptive statistics. They only summarize data, so they cannot tell us why anything is the way it is. They can only describe what we see. It is critical to explain what information we learn from each measure in addition to stating the value itself. One way to do this is to always state "which means that" after you present each statistical value.

For example, if we're discussing the test for understanding scores for Week 2, we might say the mode of the sample is 87%, which means that more students earned a score of 87% on the Week 2 test for understanding than any other score. We cannot say whether the measures of central tendency are good or bad. Nor can we make any assumptions about why the data are the way they are.