Data Analysis
Business Statistics: A Decision-Making Approach
Tenth Edition
Chapter 3
Describing Data Using Numerical Measures
Copyright © 2018, 2014, 2011 Pearson Education, Inc. All Rights Reserved
Objectives
Compute the mean, median, mode, and weighted mean for a set of data and understand what these values represent.
Construct a box and whisker graph and interpret it.
Compute the range, interquartile range, variance, and standard deviation and know what these values mean.
Compute a z score and the coefficient of variation and understand how they are applied in decision-making situations.
Understand the Empirical Rule and Tchebysheff’s Theorem.
Section 3.1 Measures of Center and Location
Parameter and Statistic
Parameter
A measure computed from the entire population
As long as the population does not change, the value of the parameter will not change.
Statistic
A measure computed from a sample that has been selected from a population
The value of the statistic will depend on which sample is selected.
Describing Data – What to Look For
Measuring the Center
Data for a variable of interest form a distribution.
We need to be able to describe the center using a numerical measure.
Measuring the Spread
A variable of interest takes on different values; the data are spread out around the center.
We need to be able to measure the spread.
Data Distributions with Different Centers
Data Distributions with Different Spreads
Descriptive Measures of the Center
Mean
The arithmetic average of the data
Median
The midpoint of the data
Mode
The data value occurring most frequently
Population Mean (Parameter)
\( \mu = \frac{\sum_{i=1}^{N} x_i}{N} \)
µ - Population mean (pronounced mu)
N - Population size
x_i - The ith individual value of variable x
The average for all values in the population, computed by dividing the sum of all values by the population size
Population Mean - Example
Each day, a local hospital counts the number of patients that are in the hospital. This is called the hospital census. The hospital is interested in only the census data for the past 10 days. This is the population of interest. The data are shown below:
| 216 | 255 | 330 | 254 | 348 |
| 317 | 292 | 267 | 310 | 295 |
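As a quick check of the population mean for the census data above, the following Python sketch sums the ten values and divides by the population size. The variable names are illustrative and are not part of the original slides.

```python
# Hospital census for the past 10 days (the full population of interest).
census = [216, 255, 330, 254, 348, 317, 292, 267, 310, 295]

N = len(census)          # population size
total = sum(census)      # sum of all values in the population
mu = total / N           # population mean: sum of all values divided by N

print(f"N = {N}, sum = {total}, population mean = {mu:.1f}")
```

Because all ten days make up the population, the result (2,884 / 10 = 288.4 patients) is a parameter, not a statistic.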
The Population Mean - Graphic Representation
Hospital Census – Population Data
Population Mean – Example (1 of 2)
Table 3.1 San Carlo Hotel Data
N = 8 Weeks
| Week | Rooms Rented | Revenue | Complaints |
| 1 | 22 | $1,870 | 0 |
| 2 | 13 | $1,590 | 2 |
| 3 | 10 | $1,760 | 1 |
| 4 | 16 | $2,345 | 0 |
| 5 | 23 | $4,563 | 2 |
| 6 | 13 | $1,630 | 1 |
| 7 | 11 | $2,156 | 0 |
| 8 | 13 | $1,756 | 0 |
Rooms Rented = Total number of rooms rented
Revenue = Total dollar revenue from the room rentals
Complaints = Number of customer complaints that came from guests each Sunday
Population Mean – Example (2 of 2)
| Week | Rooms Rented |
| 1 | 22 |
| 2 | 13 |
| 3 | 10 |
| 4 | 16 |
| 5 | 23 |
| 6 | 13 |
| 7 | 11 |
| 8 | 13 |
Calculate the Population Mean for the Variable, Rooms Rented
\( \mu = \frac{22 + 13 + 10 + 16 + 23 + 13 + 11 + 13}{8} = \frac{121}{8} = 15.125 \)
Mean – The Balance Point
Property of the Mean
Table 3.2 Deviations Around the Mean Using Hotel Data
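The table itself did not survive in this text, so the following Python sketch recomputes the deviations from the Rooms Rented column of Table 3.1. It is an illustration under that assumption, not a copy of the original Table 3.2.

```python
# Rooms rented over the N = 8 weeks in Table 3.1 (treated as the population).
rooms_rented = [22, 13, 10, 16, 23, 13, 11, 13]

N = len(rooms_rented)
mu = sum(rooms_rented) / N                      # population mean

# Deviation of each value from the mean: x - mu
deviations = [x - mu for x in rooms_rented]

for week, (x, d) in enumerate(zip(rooms_rented, deviations), start=1):
    print(f"Week {week}: x = {x:2d}   x - mu = {d:+.3f}")

# The positive and negative deviations cancel out.
print("Sum of deviations:", sum(deviations))
```

The deviations sum to zero, which is why the mean acts as the balance point of the data.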
Sample Mean (Statistic)
\( \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \)
x̄ - Sample mean (pronounced x-bar)
n - Sample size
The average for all values in the sample, computed by dividing the sum of all sample values by the sample size
Sample Mean - Example
Number of Sales Made by a Sales Representative
n = 10 (a sample of ten days during the past three years)
Raw Data
| 20 | 5 | 10 | 40 | 50 |
| 15 | 30 | 20 | 20 | 90 |
\( \bar{x} = \frac{\sum x_i}{n} = \frac{300}{10} = 30 \)
Sample Mean – The Balance Point
Sales Example
Central Property of the Mean
Sales Example
Impact of Extreme Values on the Mean - Sales Example
Original Data
New Data (one value changed)
Conclusion: The Mean is Sensitive to Extreme Values.
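The slide's original and changed data sets are not reproduced in this text. As a hedged illustration, the Python sketch below uses the ten-day sales sample and a hypothetical change (the largest value, 90, replaced by 900) to show how a single extreme value moves the mean.

```python
from statistics import mean

# Sales sample from the Sample Mean example (n = 10).
original = [20, 5, 10, 40, 50, 15, 30, 20, 20, 90]

# Hypothetical change, for illustration only: 90 becomes 900.
changed = [900 if x == 90 else x for x in original]

mean_original = mean(original)
mean_changed = mean(changed)

print("Mean of original data:", mean_original)
print("Mean after one value is changed:", mean_changed)
```

Changing one value out of ten raises the mean from 30 to 111, even though nine of the ten observations are untouched.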
Impact of Extreme Values on the Mean – Accounting Graduates Starting Salaries Example
Impact of Extreme Values on the Mean – Accounting Graduates Starting Salaries Example Using Excel 2016
Characteristics of the Mean
Uses all the Data
Sensitive to Extreme Values
The Balance Point
Sum of Deviations from the Mean Equals Zero
Median (1 of 2)
The median (Md) is a center value that divides a data array into two halves.
Data Array
Data that have been arranged in numerical order
Median Index
\( i = \frac{1}{2} n \)
i = The index of the point in the data set corresponding to the median value
n = Sample size
Median (2 of 2)
In an ordered array (lowest to highest), the median is the “middle” number, i.e., the number that splits the distribution in half numerically.
50% of the data is above the median, 50% is below
Represented as Md
The median is not affected by extreme values.
Computing the Median (1 of 4)
Step 1: Collect the sample data.
| Starting Salaries |
| $44,000 |
| $52,000 |
| $39,000 |
| $56,000 |
| $61,000 |
| $46,000 |
| $60,000 |
n = 7
Computing the Median (2 of 4)
Step 2: Sort data from smallest to largest.
| Starting Salaries |
| $39,000 |
| $44,000 |
| $46,000 |
| $52,000 |
| $56,000 |
| $60,000 |
| $61,000 |
n = 7
Computing the Median (3 of 4)
Step 3: Calculate the median index.
\( i = \frac{1}{2} n = \frac{1}{2}(7) = 3.5 \)
If i is not an integer, round up to the next highest integer.
If i is an integer, the median is the average of the values in positions i and i + 1.
Here i = 3.5 rounds up to 4, so the median is the 4th value from the top or bottom of the sorted data.
Computing the Median (4 of 4)
Step 4: Find the median.
The 4th value in the sorted data is $52,000, so Md = $52,000.
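The four steps can be sketched in Python as follows. The function name `median_by_index` is illustrative, and the sketch assumes a non-empty list of numbers.

```python
def median_by_index(values):
    """Median via the median index i = n / 2 applied to the sorted data."""
    data = sorted(values)            # Step 2: sort from smallest to largest
    n = len(data)
    half, remainder = divmod(n, 2)   # Step 3: i = n / 2, kept in integer arithmetic
    if remainder:
        # i is not an integer: round up to position half + 1 (1-based),
        # which is index `half` in Python's 0-based list.
        return data[half]
    # i is an integer: average the values in positions i and i + 1 (1-based).
    return (data[half - 1] + data[half]) / 2


# Step 1: the sample of starting salaries (n = 7).
salaries = [44000, 52000, 39000, 56000, 61000, 46000, 60000]

md = median_by_index(salaries)       # Step 4
print("Median starting salary:", md)
```

With n = 7 the index is 3.5, which rounds up to the 4th sorted value, $52,000.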
Median Example - Even Number of Data
| 20 | 5 | 10 | 40 | 50 |
| 15 | 30 | 20 | 20 | 90 |
Sales Data: Number of Sales Made, n = 10
Sorted Data: 5, 10, 15, 20, 20, 20, 30, 40, 50, 90
When there is an even number of data points, the median is the average of the middle two values: Md = (20 + 20) / 2 = 20.
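A short Python sketch of the even-n case for the sales data, cross-checked against the standard library's `statistics.median`. Variable names are illustrative.

```python
from statistics import median

sales = [20, 5, 10, 40, 50, 15, 30, 20, 20, 90]

ordered = sorted(sales)
n = len(ordered)

# For even n, the middle two values sit at 1-based positions n/2 and n/2 + 1
# (the 5th and 6th values when n = 10).
middle_two = (ordered[n // 2 - 1], ordered[n // 2])
md = sum(middle_two) / 2

print("Sorted data:", ordered)
print("Middle two values:", middle_two)
print("Median:", md, "| statistics.median:", median(sales))
```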
Median - Impact of Extreme Values Sales Data Example
Extreme Value
The Median is Unaffected by Extreme Values.
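The slide's figure is not available in this text. As an illustration, the Python sketch below applies a hypothetical change to the sales data (90 replaced by 900) and compares the median before and after.

```python
from statistics import median

sales = [20, 5, 10, 40, 50, 15, 30, 20, 20, 90]

# Hypothetical extreme value, for illustration only: 90 becomes 900.
sales_extreme = [900 if x == 90 else x for x in sales]

md_before = median(sales)
md_after = median(sales_extreme)

print("Median before:", md_before)
print("Median after: ", md_after)
```

Both medians equal 20 because the extreme value changes only the largest observation, not the middle two.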
Median - Impact of Extreme Values Accounting Graduates Starting Salaries
The only reason that the two medians are not equal is that the number of data values is different. The extreme salary did not affect the median.
Comparing the Mean and the Median
When extremely high values are present, the mean gets pulled above the median; extremely low values pull it below the median.
Characteristics of the Median
Not Sensitive to Extreme Values
Uses only the Middle Value(s)
Mode (1 of 2)
Mode: The value in the data which appears most frequently
Sales Data
| 20 | 5 | 10 | 40 | 50 |
| 15 | 30 | 20 | 20 | 90 |
Mode = 20 (occurs 3 times)
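A minimal Python sketch of finding the mode by counting occurrences. It returns every value tied for the highest frequency, so it also covers data with more than one mode; the names used are illustrative.

```python
from collections import Counter

sales = [20, 5, 10, 40, 50, 15, 30, 20, 20, 90]

counts = Counter(sales)                 # value -> number of occurrences
highest = max(counts.values())          # the largest frequency in the data

# All values that occur with the highest frequency (there may be several).
modes = sorted(value for value, count in counts.items() if count == highest)

print("Mode(s):", modes, "occurring", highest, "times")
```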
The Mode
Number of Students Absent from Class Per Day
Sample Data
The Mode can be a weak measure of the center.
Mode: Example NBA Player Weights
Mode: Example NBA Player Weights Using Excel 2016
Mode (2 of 2)
The value in a data set that occurs most frequently
Is not affected by extreme values.
Can be used for both quantitative and qualitative data (ratio, interval, ordinal, nominal).
Can have more than one mode, or no mode.
Distribution with two modes - bimodal
Distribution Shapes Relationship to Mean, Median and Mode (1 of 2)
Symmetrical Distribution
Distribution Shapes Relationship to Mean, Median and Mode (2 of 2)
Weighted Mean
The mean value of data values that have been weighted according to their relative importance
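Weighted Mean for a Population
\( \mu_w = \frac{\sum w_i x_i}{\sum w_i} \)
Weighted Mean for a Sample
\( \bar{x}_w = \frac{\sum w_i x_i}{\sum w_i} \)
w_i - The weight of the ith data value
x_i - The ith data value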
Weighted Mean Example: Financial Portfolio
| Mutual Fund | Percentage Return (x) | Shares (w) |
| Fidelity | 7.2 | 2,000 |
| Vanguard | 8.3 | 5,000 |
| Dimensional | 5.4 | 12,000 |
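The weighted mean return for this portfolio can be sketched in Python as below, using the number of shares as the weights. The tuple layout and variable names are illustrative.

```python
# (fund, percentage return x, shares w) from the portfolio table.
funds = [
    ("Fidelity", 7.2, 2000),
    ("Vanguard", 8.3, 5000),
    ("Dimensional", 5.4, 12000),
]

total_weight = sum(w for _, _, w in funds)            # sum of w_i
weighted_sum = sum(x * w for _, x, w in funds)        # sum of w_i * x_i
weighted_mean = weighted_sum / total_weight

# For comparison: the simple mean ignores how many shares each fund holds.
simple_mean = sum(x for _, x, _ in funds) / len(funds)

print(f"Weighted mean return: {weighted_mean:.2f}%")
print(f"Unweighted mean return: {simple_mean:.2f}%")
```

The weighted mean (about 6.35%) is lower than the unweighted mean (about 6.97%) because the fund with the lowest return holds the most shares.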
Percentiles and Quartiles
Percentiles
The pth percentile in a data array is the value such that:
p% of the data values are less than or equal to this value.
(100 − p)% of the data values are greater than or equal to this value.
50th percentile is the median.
Quartiles
1st quartile = 25th percentile
2nd quartile = 50th percentile
Also the median
3rd quartile = 75th percentile
Calculating Percentiles
Step 1: Sort the data in order from the lowest to highest value.
Step 2: Determine the percentile location index:
\( i = \frac{p}{100}(n) \)
where p is the desired percentile and n is the number of values in the data set.
Step 3: If i is not an integer, round up to the next highest integer; the pth percentile is located at the rounded index position. If i is an integer, the pth percentile is the average of the values at location index positions i and i + 1.
Calculating Percentiles Example (1 of 2)
Moving Company Travel Distances - Miles
| 13.5 | 8.6 | 16.2 | 21.4 | 21.0 | 23.7 | 4.1 | 13.8 | 20.5 | 9.6 |
| 11.5 | 6.5 | 5.8 | 10.1 | 11.1 | 4.4 | 12.2 | 13.0 | 15.7 | 13.2 |
| 13.4 | 13.1 | 21.7 | 14.6 | 14.1 | 12.4 | 24.9 | 19.3 | 26.9 | 11.7 |
Determine the 80th Percentile.
Calculating Percentiles Example (2 of 2)
Step 1 Sort the data from lowest to highest.
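Step 2 Determine the percentile location index, i, using Equation 3.6. The 80th percentile location index is
\( i = \frac{p}{100}(n) = \frac{80}{100}(30) = 24 \)
Because i = 24 is an integer, the 80th percentile is the average of the 24th and 25th values in the sorted data: (20.5 + 21.0) / 2 = 20.75 miles.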
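The location-index procedure can be sketched in Python as follows. The function name `percentile_by_index` is illustrative; the sketch assumes p is a whole number with 0 < p < 100 and that the data set is not empty. Statistical packages often use interpolation rules that give slightly different answers.

```python
def percentile_by_index(values, p):
    """pth percentile using the location index i = (p / 100) * n."""
    data = sorted(values)                   # Step 1: sort lowest to highest
    n = len(data)
    whole, remainder = divmod(p * n, 100)   # Step 2: i = p * n / 100 in integer arithmetic
    if remainder:
        # i is not an integer: round up to 1-based position whole + 1,
        # which is index `whole` in a 0-based list.
        return data[whole]
    # i is an integer: average the values at 1-based positions i and i + 1.
    return (data[whole - 1] + data[whole]) / 2


# Moving company travel distances in miles (n = 30).
miles = [
    13.5, 8.6, 16.2, 21.4, 21.0, 23.7, 4.1, 13.8, 20.5, 9.6,
    11.5, 6.5, 5.8, 10.1, 11.1, 4.4, 12.2, 13.0, 15.7, 13.2,
    13.4, 13.1, 21.7, 14.6, 14.1, 12.4, 24.9, 19.3, 26.9, 11.7,
]

p80 = percentile_by_index(miles, 80)
print("80th percentile:", p80, "miles")
```

For p = 80 and n = 30 the index is exactly 24, so the function averages the 24th and 25th sorted values (20.5 and 21.0).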
Calculating Percentiles Using Excel 2016 - NBA Player Weights Example
Calculating Quartiles
Find the 1st quartile in an ordered array of 19 values.
Quartile location index:
\( i = \frac{25}{100}(19) = 4.75 \)
Because i is not an integer, round up and use the value at the 5th position.
1st quartile: Q1 = 51.
Calculating Quartiles Using Excel 2016 – NBA Player Weights Example
Box and Whisker Plot
A graph that is composed of two parts: a box and the whiskers
The box has a width that ranges from the first quartile (Q1) to the third quartile (Q3).
A vertical line through the box is placed at the median.
Limits are located at a value that is 1.5 multiplied by the difference between Q1 and Q3, measured below Q1 and above Q3.
The whiskers extend to the left to the lowest value within the limits and to the right to the highest value within the limits.
Constructing a Box and Whisker Plot (1 of 2)
Step 1: Sort values from lowest to highest.
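Step 2: Find Q1, the median, and Q3.
Step 3: Draw the box so that the ends are at Q1 and Q3.
Step 4: Draw a vertical line through the median.
Step 5: Calculate the interquartile range (Q3 − Q1) and compute the lower limit, Q1 − 1.5(Q3 − Q1), and the upper limit, Q3 + 1.5(Q3 − Q1).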
Step 6: Extend dashed lines from each end to the highest and lowest values within the limits.
Step 7: Identify outliers with an asterisk (*).
Constructing a Box and Whisker Plot (2 of 2)
The center box extends from Q1 to Q3.
The line within the box is the median.
The whiskers extend to the smallest and largest values within the calculated limits.
Outliers are plotted outside the calculated limits.
Box and Whisker Plot – Example Miles Between Fill-Ups
| 231 | 236 | 241 | 242 | 242 | 243 | 243 | 243 | 248 |
| 248 | 249 | 250 | 251 | 251 | 252 | 252 | 254 | 255 |
| 255 | 256 | 256 | 257 | 259 | 260 | 260 | 260 | 260 |
| 262 | 262 | 264 | 265 | 265 | 265 | 266 | 268 | 268 |
| 270 | 276 | 277 | 277 | 280 | 286 | 300 | 324 | 345 |
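The plot for this example is not reproduced in the text. The Python sketch below computes the numbers needed to draw it: the quartiles, the limits at 1.5 times the interquartile range, the whisker ends, and the outliers. It assumes the percentile location-index rule from this chapter; the helper name `percentile_by_index` is illustrative, and other software may compute quartiles slightly differently.

```python
def percentile_by_index(values, p):
    """pth percentile using the location index i = (p / 100) * n (0 < p < 100)."""
    data = sorted(values)
    whole, remainder = divmod(p * len(data), 100)
    if remainder:
        return data[whole]                        # round i up to the next position
    return (data[whole - 1] + data[whole]) / 2    # average positions i and i + 1


# Miles between fill-ups (n = 45).
miles = [
    231, 236, 241, 242, 242, 243, 243, 243, 248,
    248, 249, 250, 251, 251, 252, 252, 254, 255,
    255, 256, 256, 257, 259, 260, 260, 260, 260,
    262, 262, 264, 265, 265, 265, 266, 268, 268,
    270, 276, 277, 277, 280, 286, 300, 324, 345,
]

q1 = percentile_by_index(miles, 25)
md = percentile_by_index(miles, 50)
q3 = percentile_by_index(miles, 75)

iqr = q3 - q1                        # interquartile range
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Whiskers end at the lowest and highest values that lie within the limits.
within = [x for x in miles if lower_limit <= x <= upper_limit]
whisker_low, whisker_high = min(within), max(within)

# Values outside the limits are outliers (marked with an asterisk on the plot).
outliers = [x for x in miles if x < lower_limit or x > upper_limit]

print(f"Q1 = {q1}, median = {md}, Q3 = {q3}, IQR = {iqr}")
print(f"Limits: {lower_limit} to {upper_limit}")
print(f"Whiskers: {whisker_low} to {whisker_high}")
print("Outliers:", outliers)
```

Under these assumptions the box runs from 250 to 266 with the median line at 259, the whiskers reach 231 and 286, and the three largest values fall above the upper limit of 290.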
Box and Whisker Plot – Example ATM Usage by Location
Data Level Issues
Be aware of the level of data before computing a numerical measure.
The Mean - Data Level Requirements
Ratio or Interval Level Data
Variables Such As:
Age
Income
Interest Rate
Stock Prices
Sales
Number of Defects
Temperature
etc.
The Mean – Caution with Ordinal Data (1 of 2)
Computing a Mean for Ordinal Data – Not Recommended
Example: Ordinal Data − Education Level
1 = Grade School
2 = High School
3 = Some College
4 = College Degree
5 = Graduate Degree
Sample of n = 5 people − Education Codes
Does this imply that the mean education level in the sample is somewhere between High School and Some College?
The Mean – Caution with Ordinal Data (2 of 2)
Example: 5 Point Scale Question
I believe that I spend enough time studying for my statistics class.
Responses
| 2 |
| 4 |
| 3 |
| 5 |
| 2 |
| 1 |
| 3 |
| 3 |
| 4 |
Concerns:
Is there an equal distance between the 5 categories – is the distance in meaning between SA and A the same as between A and Neutral?
Is a response of A twice as high as a response of D?
The mean computation assumes the answer to both questions is “Yes”.
The Mean – Definitely Not with Nominal Data
What is your Marital Status?
1 = Single
2 = Married
3 = Divorced
4 = Widowed
Mean = 1.93. Does this mean that, on average, a person in the sample is almost married?
Don’t compute the mean for nominal data!
The Median - Data Level Requirements
Ratio or Interval Level Data
Variables Such As:
Age
Income
Interest Rate
Stock Prices
etc.
Okay for Ordinal Data
Not for Nominal Data
Descriptive Measures - Summary
| Descriptive Measure | Computation Method | Data Level | Advantages/Disadvantages |
| Mean | Sum of values divided by the number of values | Ratio, Interval | Numerical center of the data; sum of deviations from the mean is zero; sensitive to extreme values |
| Median | Middle value for data that have been sorted | Ratio, Interval, Ordinal | Not sensitive to extreme values; computed only from the center values; does not use information from all the data |
| Mode | Value(s) that occur most frequently in the data | Ratio, Interval, Ordinal, Nominal | May not reflect the center; may not exist; might have multiple modes |
Comparing Distributions Production Volumes Example
Production Distributions What Is the Difference?
Section 3.2 Measures of Dispersion/Variation
Descriptive Statistics Measures of Dispersion
Range
Simple range
Interquartile Range
Variance
Standard Deviation
Coefficient of Variation
Range
A measure of variation that is computed by finding the difference between the maximum and minimum values in a data set
R = Maximum Value − Minimum Value
Simplest measure of variation
Is very sensitive to extreme values
Ignores the data distribution
Measures of Dispersion - Staff Years of Experience
Raw Data
Values are the number of years of experience for a sample of staff members.
| 13 | 16 | 16 | 13 | 13 | 9 |
| 12 | 15 | 8 | 12 | 13 | 14 |
| 11 | 6 | 15 | 11 | 15 | 13 |
| 12 | 12 | 13 | 13 | 13 | 11 |
| 17 | 14 | 11 | 14 | 13 | 15 |
| 11 | 11 | 13 | 15 | 9 | 10 |
| 11 | 12 | 11 | 16 | 11 | 10 |
| 13 | 13 | 11 | 13 | 10 | 13 |
| 11 | 10 | 8 | 11 | 11 | 17 |
| 10 | 12 | 13 | 10 | 10 | 15 |
| 14 | 11 | 9 | 15 | 13 | 12 |
| 11 | 14 | 12 | 10 | 12 | 14 |
| 14 | 12 | 11 | 11 | 12 | 13 |
| 11 | 11 | 11 | 13 | 12 | 11 |
| 12 | 10 | 9 | 12 | 10 | 13 |
| 12 | 14 | 10 | 10 | 16 | 12 |
| 11 | 11 | 14 | 14 | | |
n = 100
Finding the Range
Years Experience
Range = High − Low
Range = 17 − 6 = 11
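The same range can be reproduced from the raw years-of-experience data with a short Python sketch. The list below is transcribed row by row from the raw data table; the variable names are illustrative.

```python
# Years of experience for the sample of staff members (n = 100).
years = [
    13, 16, 16, 13, 13, 9,
    12, 15, 8, 12, 13, 14,
    11, 6, 15, 11, 15, 13,
    12, 12, 13, 13, 13, 11,
    17, 14, 11, 14, 13, 15,
    11, 11, 13, 15, 9, 10,
    11, 12, 11, 16, 11, 10,
    13, 13, 11, 13, 10, 13,
    11, 10, 8, 11, 11, 17,
    10, 12, 13, 10, 10, 15,
    14, 11, 9, 15, 13, 12,
    11, 14, 12, 10, 12, 14,
    14, 12, 11, 11, 12, 13,
    11, 11, 11, 13, 12, 11,
    12, 10, 9, 12, 10, 13,
    12, 14, 10, 10, 16, 12,
    11, 11, 14, 14,
]

high, low = max(years), min(years)
data_range = high - low          # R = maximum value - minimum value

print(f"n = {len(years)}, high = {high}, low = {low}, range = {data_range}")
```

Only the two most extreme values enter the calculation, which is why the range is so sensitive to them.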
Characteristics of the Range
Easy to Compute
Uses Only Two Values (high and low)
Sensitive to Extremes
Interquartile Range
A measure of variation that is determined by computing the difference between the third and first quartiles
Interquartile Range = Third Quartile − First Quartile
Eliminates outlier problems
Eliminates some high- and low-valued observations
Interquartile Range Example
Interquartile range = 57 − 30 = 27
Population Variance
The average of the squared distances of the data values from the mean.
\( \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} \)
σ² - population variance, µ - population mean, N - population size
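A minimal Python sketch of the population variance calculation. The data for the slide's own example are not available in this text, so the sketch reuses the Rooms Rented population (N = 8) from Table 3.1 as an assumption, and cross-checks the result with the standard library's `statistics.pvariance`.

```python
from statistics import pvariance

# Rooms rented over N = 8 weeks (Table 3.1), treated as the population.
rooms_rented = [22, 13, 10, 16, 23, 13, 11, 13]

N = len(rooms_rented)
mu = sum(rooms_rented) / N                              # population mean

# Average of the squared distances of the data values from the mean.
squared_distances = [(x - mu) ** 2 for x in rooms_rented]
sigma_squared = sum(squared_distances) / N

print(f"mu = {mu}, population variance = {sigma_squared}")
print("statistics.pvariance:", pvariance(rooms_rented))
```

The result, about 20.86, is expressed in squared units (rooms squared), since each distance from the mean is squared before averaging.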
Population Variance - Example
Measures of Dispersion Production Volume Example
The departments are the same in terms of the center but differ in the degree of dispersion.
Computing the Variance Department A
Computing the Variance Department B
Comparing the Dispersion Dept. A vs Dept. B
The variance is measured in units squared.