Data Analysis


Business Statistics: A Decision-Making Approach

Tenth Edition

Chapter 3

Describing Data Using Numerical Measures

Copyright © 2018, 2014, 2011 Pearson Education, Inc. All Rights Reserved



Objectives

Compute the mean, median, mode, and weighted mean for a set of data and understand what these values represent.

Construct a box and whisker graph and interpret it.

Compute the range, interquartile range, variance, and standard deviation and know what these values mean.

Compute a z score and the coefficient of variation and understand how they are applied in decision-making situations.

Understand the Empirical Rule and Tchebysheff’s Theorem.


Section 3.1 Measures of Center and Location


Parameter and Statistic

Parameter

A measure computed from the entire population

As long as the population does not change, the value of the parameter will not change.

Statistic

A measure computed from a sample that has been selected from a population

The value of the statistic will depend on which sample is selected.


Describing Data – What to Look For

Measuring the Center

Data for a variable of interest form a distribution.

We need to be able to describe the center using a numerical measure.

Measuring the Spread

A variable of interest will take on different values, so the data are spread out around the center.

We need to be able to measure the spread.


Data Distributions with Different Centers


Data Distributions with Different Spreads


Descriptive Measures of the Center

Mean

The arithmetic average of the data

Median

The midpoint of the data

Mode

The data value occurring most frequently


Population Mean (Parameter)

µ - Population mean (pronounced mu)

N - Population size

x_i - The ith individual value of variable x

The average for all values in the population computed by dividing the sum of all values by the population size


Population Mean - Example

Each day, a local hospital counts the number of patients that are in the hospital. This is called the hospital census. The hospital is interested in only the census data for the past 10 days. This is the population of interest. The data are shown below:

216 255 330 254 348
317 292 267 310 295
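
As a quick check of the arithmetic, the population mean for these ten census values can be sketched in Python. The variable names are illustrative and do not come from the text.

```python
# Hospital census for the past 10 days (the whole population of interest).
census = [216, 255, 330, 254, 348, 317, 292, 267, 310, 295]

# Population mean: the sum of all values divided by the population size N.
N = len(census)
total = sum(census)
mu = total / N

print(f"N = {N}, sum = {total}, population mean = {mu}")
```

Because these ten days are the entire population of interest, the result is a parameter rather than a statistic.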


The Population Mean - Graphic Representation

Hospital Census – Population Data


Population Mean – Example (1 of 2)

Table 3.1 San Carlo Hotel Data

N = 8 Weeks

Week Rooms Rented Revenue Complaints
1 22 $1,870 0
2 13 $1,590 2
3 10 $1,760 1
4 16 $2,345 0
5 23 $4,563 2
6 13 $1,630 1
7 11 $2,156 0
8 13 $1,756 0

x_1 = Total number of rooms rented

x_2 = Total dollar revenue from the room rentals

x_3 = Number of customer complaints that came from guests each Sunday


Population Mean – Example (2 of 2)

Week Rooms Rented
1 22
2 13
3 10
4 16
5 23
6 13
7 11
8 13

Calculate the Population Mean for the Variable, Rooms Rented


Mean – The Balance Point


Property of the Mean

Table 3.2 Deviations Around the Mean Using Hotel Data
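
The property that deviations around the mean sum to zero can be checked with a short Python sketch using the Rooms Rented values from Table 3.1. The variable names are illustrative assumptions.

```python
# Rooms rented per week from Table 3.1 (N = 8 weeks, treated as a population).
rooms_rented = [22, 13, 10, 16, 23, 13, 11, 13]

mu = sum(rooms_rented) / len(rooms_rented)      # population mean
deviations = [x - mu for x in rooms_rented]     # x_i - mu for each week
total_deviation = sum(deviations)               # balance point: this is zero

for week, (x, d) in enumerate(zip(rooms_rented, deviations), start=1):
    print(f"week {week}: x = {x:2d}, x - mu = {d:+.3f}")
print(f"mu = {mu}, sum of deviations = {total_deviation}")
```

The positive and negative deviations cancel, which is why the mean is described as the balance point of the data.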


Sample Mean (Statistic)

x̄ - Sample mean (pronounced x-bar)

n - Sample size

The average for all values in the sample computed by dividing the sum of all sample values by the sample size


Sample Mean - Example

Number of Sales Made By a Sales Representative

n = 10 (a sample of ten days during the past three years)

20 5 10 40 50
15 30 20 20 90

Raw Data

X-Bar


Sample Mean – The Balance Point

Sales Example


Central Property of the Mean

Sales Example


When Extreme Values Are Present: Impact on the Mean - Sales Example

Original Data: x̄ = 30

New Data (one value changed): x̄ = 60

Conclusion: The Mean is Sensitive to Extreme Values.
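
A minimal Python sketch of this sensitivity, using the sales sample. Which value was changed on the slide is not recoverable from the text, so replacing 90 with 390 is purely an illustrative assumption. The median, introduced on later slides, is computed only for contrast.

```python
from statistics import mean, median

# Original sales sample, n = 10.
sales = [20, 5, 10, 40, 50, 15, 30, 20, 20, 90]

# ASSUMPTION for illustration only: the largest value (90) becomes 390.
# The slide does not say which value was actually changed.
sales_extreme = [390 if x == 90 else x for x in sales]

mean_original = mean(sales)
mean_extreme = mean(sales_extreme)
median_original = median(sales)
median_extreme = median(sales_extreme)

print(f"mean:   {mean_original} -> {mean_extreme}")
print(f"median: {median_original} -> {median_extreme}")
```

A single changed value moves the mean substantially, while the middle of the sorted data stays where it was.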


Impact of Extreme Values on the Mean – Accounting Graduates Starting Salaries Example

Impact of Extreme Values on the Mean – Accounting Graduates Starting Salaries Example Using Excel 2016

Characteristics of the Mean

Uses all the Data

Sensitive to Extreme Values

The Balance Point

Sum of Deviations from the Mean Equals Zero


Median (1 of 2)

The median (Md) is the center value that divides a data array into two halves.

Data Array

Data that have been arranged in numerical order

Median Index

i = (1/2)n

i = The index of the point in the data set corresponding to the median value

n = Sample size


Median (2 of 2)

In an ordered array (lowest to highest), the median is the “middle” number, i.e., the number that splits the distribution in half numerically.

50% of the data is above the median, 50% is below

Represented as Md

The median is not affected by extreme values.


Computing the Median (1 of 4)

Step 1: Collect the sample data.

Starting Salaries
$44,000
$52,000
$39,000
$56,000
$61,000
$46,000
$60,000

n = 7


Computing the Median (2 of 4)

Step 2: Sort data from smallest to largest.

Starting Salaries
$39,000
$44,000
$46,000
$52,000
$56,000
$60,000
$61,000

n = 7


Computing the Median (3 of 4)

Step 3: Calculate the median index.

i = (1/2)n = (1/2)(7) = 3.5

If i is not an integer, round up to the next integer. Here i = 3.5 rounds up to 4.

If i is an integer, the median is the average of the values in positions i and i + 1.

The median is the 4th value from the top or bottom of the sorted data.


Computing the Median (4 of 4)

Step 4: Find the median. The 4th value in the sorted data is $52,000, so Md = $52,000.


Median Example - Even Number of Data

20 5 10 40 50
15 30 20 20 90

Sales Data

Number of Sales Made n = 10

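Sort the Data

5 10 15 20 20 20 30 40 50 90

Median index: i = (1/2)(10) = 5, so the median is the average of the 5th and 6th values: Median = (20 + 20)/2 = 20.

When there is an even number of data points, the median is the average of the middle two values.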
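
The median index rule from the preceding slides can be sketched in Python as follows. The function name `textbook_median` is a hypothetical helper, not something defined in the text. It is applied to both the starting salaries (odd n) and the sales data (even n).

```python
import math

def textbook_median(values):
    """Median via the index rule i = n/2 on the sorted data array."""
    data = sorted(values)
    n = len(data)
    i = n / 2
    if not i.is_integer():
        # i is not an integer: round up and take that position (1-based).
        return data[math.ceil(i) - 1]
    # i is an integer: average the values at positions i and i + 1 (1-based).
    i = int(i)
    return (data[i - 1] + data[i]) / 2

salaries = [44000, 52000, 39000, 56000, 61000, 46000, 60000]  # n = 7
sales = [20, 5, 10, 40, 50, 15, 30, 20, 20, 90]               # n = 10

median_salary = textbook_median(salaries)
median_sales = textbook_median(sales)
print(median_salary, median_sales)
```

For n = 7 the index 3.5 rounds up to position 4. For n = 10 the index is 5, so positions 5 and 6 are averaged.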


Median - Impact of Extreme Values: Sales Data Example

With the extreme value included, Median = (20 + 20)/2 = 20.

The Median is Unaffected by Extreme Values.


Median - Impact of Extreme Values Accounting Graduates Starting Salaries

The only reason that the two medians are not equal is that the number of data values is different. The extreme salary did not affect the median.


Comparing the Mean and the Median

When extremely high values are present, the mean is pulled above the median; extremely low values pull it below the median.


Characteristics of the Median

Not Sensitive to Extreme Values

Uses only the Middle Value(s)


Mode (1 of 2)

Mode: The value in the data which appears most frequently

Sales Data

20 5 10 40 50
15 30 20 20 90

Mode = 20 (occurs 3 times)
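
One way to find the mode in Python is to count how often each value occurs. The sketch below uses the sales data. Returning a list of modes is a design choice made here so that data sets with several modes are handled as well.

```python
from collections import Counter

sales = [20, 5, 10, 40, 50, 15, 30, 20, 20, 90]

counts = Counter(sales)                  # value -> number of occurrences
top_count = max(counts.values())         # highest frequency in the data
modes = sorted(v for v, c in counts.items() if c == top_count)

print(f"mode(s) = {modes}, each occurring {top_count} times")
```

If every value occurred exactly once, this sketch would list all values as modes, so checking `top_count` is a simple way to detect the "no mode" case.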


The Mode

Number of Students Absent from Class Per Day

Sample Data

The Mode can be a weak measure of the center.


Mode: Example NBA Player Weights

Mode: Example NBA Player Weights Using Excel 2016

Mode (2 of 2)

The value in a data set that occurs most frequently

Is not affected by extreme values.

Can be used for both quantitative and qualitative data (ratio, interval, ordinal, nominal).

Can have more than one mode, or no mode.

Distribution with two modes - bimodal


Distribution Shapes Relationship to Mean, Median and Mode (1 of 2)

Symmetrical Distribution


Distribution Shapes Relationship to Mean, Median and Mode (2 of 2)


Weighted Mean

The mean value of data values that have been weighted according to their relative importance

Weighted Mean for a Population

Weighted Mean for a Sample

w_i - The weight of the ith data value

x_i - The ith data value


Weighted Mean Example: Financial Portfolio

Mutual Fund   Percentage Return (x)   Shares (w)
Fidelity      7.2                     2,000
Vanguard      8.3                     5,000
Dimensional   5.4                     12,000
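
The weighted mean return for this portfolio can be worked through in Python, with the number of shares as the weight w and the percentage return as x. The dictionary layout and variable names are illustrative assumptions.

```python
# fund name -> (percentage return x, shares held w)
funds = {
    "Fidelity": (7.2, 2000),
    "Vanguard": (8.3, 5000),
    "Dimensional": (5.4, 12000),
}

sum_wx = sum(x * w for x, w in funds.values())   # sum of w_i * x_i
sum_w = sum(w for _, w in funds.values())        # sum of w_i
weighted_mean = sum_wx / sum_w

print(f"sum(wx) = {sum_wx:,.0f}, sum(w) = {sum_w:,}, "
      f"weighted mean = {weighted_mean:.2f}%")
```

The fund with the most shares dominates the result, so the weighted mean sits closer to 5.4 than the unweighted average of the three returns would.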


Percentiles and Quartiles

Percentiles

The pth percentile in a data array:

p% are less than or equal to this value.

(100 − p)% are greater than or equal to this value.

50th percentile is the median.

Quartiles

1st quartile = 25th percentile

2nd quartile = 50th percentile

Also the median

3rd quartile = 75th percentile


Calculating Percentiles

Step 1: Sort the data in order from the lowest to highest value.

Step 2: Determine the percentile location index: i = (p/100)(n), where p is the desired percentile (0 ≤ p ≤ 100) and n is the number of values in the data set.

Step 3: If i is not an integer, round up to the next integer; the pth percentile is located at the rounded index position. If i is an integer, the pth percentile is the average of the values at location index positions i and i + 1.


Calculating Percentiles Example (1 of 2)

Moving Company Travel Distances - Miles

13.5 8.6 16.2 21.4 21.0 23.7 4.1 13.8 20.5 9.6
11.5 6.5 5.8 10.1 11.1 4.4 12.2 13.0 15.7 13.2
13.4 13.1 21.7 14.6 14.1 12.4 24.9 19.3 26.9 11.7

Determine the 80th Percentile.


Calculating Percentiles Example (2 of 2)

Step 1: Sort the data from lowest to highest.

Step 2: Determine the percentile location index, i, using Equation 3.6. The 80th percentile location index is i = (80/100)(30) = 24.

Step 3: Because i = 24 is an integer, the 80th percentile is the average of the 24th and 25th values: (20.5 + 21.0)/2 = 20.75.
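
The percentile location index rule can be sketched in Python as below. The function name `percentile` is a hypothetical helper. Restricting p to strictly between 0 and 100 is a simplification made here, because the slides do not say how to handle the endpoints. Note that library routines (including Excel's) generally use interpolation rules that can give slightly different answers.

```python
import math

def percentile(values, p):
    """pth percentile using the location index i = (p/100) * n."""
    if not 0 < p < 100:
        # Simplifying assumption: endpoints are not handled in this sketch.
        raise ValueError("p must be strictly between 0 and 100")
    data = sorted(values)
    i = p * len(data) / 100
    if i.is_integer():
        # Integer index: average positions i and i + 1 (1-based).
        i = int(i)
        return (data[i - 1] + data[i]) / 2
    # Non-integer index: round up and take that position (1-based).
    return data[math.ceil(i) - 1]

# Moving company travel distances in miles, n = 30.
distances = [
    13.5, 8.6, 16.2, 21.4, 21.0, 23.7, 4.1, 13.8, 20.5, 9.6,
    11.5, 6.5, 5.8, 10.1, 11.1, 4.4, 12.2, 13.0, 15.7, 13.2,
    13.4, 13.1, 21.7, 14.6, 14.1, 12.4, 24.9, 19.3, 26.9, 11.7,
]

p80 = percentile(distances, 80)
print(f"80th percentile = {p80} miles")
```

The index is computed as `p * n / 100` rather than `(p / 100) * n` so that a case such as p = 80, n = 30 yields exactly 24 in floating point.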


Calculating Percentiles Using Excel 2016 - NBA Player Weights Example


Calculating Quartiles

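Find the 1st quartile in an ordered array of 19 values.

36 40 42 46 51 56 62 65 71 74 78 82 84 87 88 90 92 95 97

Quartile location index: i = (25/100)(19) = 4.75

Because i is not an integer, round up and use the value at the 5th position.

The 1st quartile (Q1) equals 51.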

Calculating Quartiles Using Excel 2016 – NBA Player Weights Example

Box and Whisker Plot

A graph that is composed of two parts: a box and the whiskers

The box has a width that ranges from the first quartile (Q1) to the third quartile (Q3).

A vertical line through the box is placed at the median.

Limits are located at a value that is 1.5 multiplied by the difference between Q1 and Q3, below Q1 and above Q3.

The whiskers extend to the left to the lowest value within the limits and to the right to the highest value within the limits.


Constructing a Box and Whisker Plot (1 of 2)

Step 1: Sort values from lowest to highest.

Step 2: Find Q1, Q2, and Q3.

Step 3: Draw the box so that the ends are at Q1 and Q3.

Step 4: Draw a vertical line through the median.

Step 5: Calculate the interquartile range (IQR = Q3 − Q1).

Step 6: Extend dashed lines from each end to the highest and lowest values within the limits.

Step 7: Identify outliers with an asterisk (*).


Constructing a Box and Whisker Plot (2 of 2)

The center box extends from Q1 to Q3.

The line within the box is the median.

The whiskers extend to the smallest and largest values within the calculated limits.

Outliers are plotted outside the calculated limits.


Box and Whisker Plot – Example Miles Between Fill-Ups

231 236 241 242 242 243 243 243 248
248 249 250 251 251 252 252 254 255
255 256 256 257 259 260 260 260 260
262 262 264 265 265 265 266 268 268
270 276 277 277 280 286 300 324 345
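
The numbers needed to draw a box and whisker plot for these 45 values can be computed with a short Python sketch. It assumes the round-up location index rule from the percentile slides. The helper name `percentile` and the variable names are illustrative, and plotting software may use a different quartile rule.

```python
import math

def percentile(values, p):
    """pth percentile using the location index i = (p/100) * n."""
    data = sorted(values)
    i = p * len(data) / 100
    if i.is_integer():
        i = int(i)
        return (data[i - 1] + data[i]) / 2
    return data[math.ceil(i) - 1]

# Miles between fill-ups, n = 45.
miles = [
    231, 236, 241, 242, 242, 243, 243, 243, 248,
    248, 249, 250, 251, 251, 252, 252, 254, 255,
    255, 256, 256, 257, 259, 260, 260, 260, 260,
    262, 262, 264, 265, 265, 265, 266, 268, 268,
    270, 276, 277, 277, 280, 286, 300, 324, 345,
]

q1, q2, q3 = (percentile(miles, p) for p in (25, 50, 75))
iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr          # limit below Q1
upper_limit = q3 + 1.5 * iqr          # limit above Q3

inside = [x for x in miles if lower_limit <= x <= upper_limit]
whiskers = (min(inside), max(inside))  # whisker ends: extremes within limits
outliers = [x for x in miles if x < lower_limit or x > upper_limit]

print(f"Q1 = {q1}, median = {q2}, Q3 = {q3}, IQR = {iqr}")
print(f"limits = ({lower_limit}, {upper_limit}), whiskers = {whiskers}")
print(f"outliers = {outliers}")
```

The box spans Q1 to Q3, the whiskers stop at the most extreme values inside the limits, and anything beyond the limits is marked as an outlier.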

Box and Whisker Plot – Example ATM Usage by Location

Data Level Issues

Be aware of the level of data before computing a numerical measure.


The Mean - Data Level Requirements

Ratio or Interval Level Data

Variables Such As:

Age

Income

Interest Rate

Stock Prices

Sales

Number of Defects

Temperature

etc.


The Mean – Caution with Ordinal Data (1 of 2)

Computing a Mean for Ordinal Data – Not Recommended

Example: Ordinal Data − Education Level

1 = Grade School

2 = High School

3 = Some College

4 = College Degree

5 = Graduate Degree

Sample of n = 5 people − Education Codes: 1 4 3 4 2

Mean = 14/5 = 2.8

Does this imply that the mean education level in the sample is somewhere between High School and Some College?


The Mean – Caution with Ordinal Data (2 of 2)

Example: 5 Point Scale Question

I believe that I spend enough time studying for my statistics class.

Responses (n = 9)

2 4 3 5 2 1 3 3 4

Mean = 27/9 = 3

Concerns:

Is there an equal distance between the 5 categories? Is the distance in meaning between Strongly Agree (SA) and Agree (A) the same as between Agree and Neutral?

Is a response of A (Agree) twice as high as a response of D (Disagree)?

The mean computation assumes the answer to both questions is “Yes”.


The Mean – Definitely Not with Nominal Data

What is your Marital Status?

1 = Single

2 = Married

3 = Divorced

4 = Widowed

Mean = 1.93. Does this mean that, on average, a person in the sample is almost married?

Don’t compute the mean for nominal data!


The Median - Data Level Requirements

Ratio or Interval Level Data

Variables Such As:

Age

Income

Interest Rate

Stock Prices

etc.

Okay for Ordinal Data

Not for Nominal Data


Descriptive Measures - Summary

Descriptive Measure | Computation Method | Data Level | Advantages/Disadvantages
Mean | Sum of values divided by the number of values | Ratio, Interval | Numerical center of the data; sum of deviations from the mean is zero; sensitive to extreme values
Median | Middle value for data that have been sorted | Ratio, Interval, Ordinal | Not sensitive to extreme values; computed only from the center values; does not use information from all the data
Mode | Value(s) that occur most frequently in the data | Ratio, Interval, Ordinal, Nominal | May not reflect the center; may not exist; might have multiple modes


Comparing Distributions Production Volumes Example


Production Distributions What Is the Difference?


Section 3.2 Measures of Dispersion/Variation


Descriptive Statistics Measures of Dispersion

Range

Simple range

Interquartile Range

Variance

Standard Deviation

Coefficient of Variation


Range

A measure of variation that is computed by finding the difference between the maximum and minimum values in a data set

R = Maximum Value − Minimum Value

Simplest measure of variation

Is very sensitive to extreme values

Ignores the data distribution


Measures of Dispersion - Staff Years of Experience

Raw Data

Values are the number of years of experience for a sample of staff members.

13 16 16 13 13 9
12 15 8 12 13 14
11 6 15 11 15 13
12 12 13 13 13 11
17 14 11 14 13 15
11 11 13 15 9 10
11 12 11 16 11 10
13 13 11 13 10 13
11 10 8 11 11 17
10 12 13 10 10 15
14 11 9 15 13 12
11 14 12 10 12 14
14 12 11 11 12 13
11 11 11 13 12 11
12 10 9 12 10 13
12 14 10 10 16 12
11 11 14 14

n = 100


Finding the Range

Years Experience

Range = High − Low

Range = 17 − 6 = 11


Characteristics of the Range

Easy to Compute

Uses Only Two Values (high and low)

Sensitive to Extremes


Interquartile Range

A measure of variation that is determined by computing the difference between the third and first quartiles

Interquartile Range = Q3 − Q1

Eliminates outlier problems

Eliminates some high- and low-valued observations


Interquartile Range Example

Interquartile range = 57 − 30 = 27


Population Variance

The average of the squared distances of the data values from the mean.

μ – population mean, N – population size


Population Variance - Example


Measures of Dispersion Production Volume Example

The departments are the same in terms of the center but differ in the degree of dispersion.


Computing the Variance Department A


Computing the Variance Department B


Comparing the Dispersion Dept. A vs Dept. B

The variance is measured in units squared.


Population Standard Deviation

The most commonly used measure of variation

The positive square root of the variance

Has the same units as the original data
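
As a sketch of the computation, the population variance and standard deviation can be worked out in Python. The example reuses the hospital census population from the earlier population mean slide, which is a choice made here for illustration. The variable names are assumptions.

```python
import math

# Hospital census for the past 10 days (treated as the full population).
census = [216, 255, 330, 254, 348, 317, 292, 267, 310, 295]

N = len(census)
mu = sum(census) / N                                  # population mean

# Population variance: average squared distance from the mean (divide by N).
variance = sum((x - mu) ** 2 for x in census) / N

# Population standard deviation: positive square root of the variance.
sigma = math.sqrt(variance)

print(f"mu = {mu}, variance = {variance:.2f} (patients squared), "
      f"sigma = {sigma:.2f} patients")
```

The variance is in squared units, while the standard deviation is back in the units of the original data, which makes it easier to interpret.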


Standard Deviation (The Square Root of the Variance)

Population Standard Deviation

Dept. A: σ = √0.40 = 0.632

Dept. B: σ = √80 = 8.94

The standard deviation is a measure of the typical deviation from the mean.


Population Standard Deviation – Example Using Excel 2016

The Convention Center hosted 20 events last year. This is the population.

The mean attendance is 381.55

The standard deviation is 269.00


Sample Standard Deviation

s² – sample variance

s – sample standard deviation


Sample Standard Deviation Restaurant Sales Example

Restaurant Sales Values (n = 7)

$30 $19.50 $22.40 $27 $17.50 $25.40 $23.50

s = $4.31
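
The sample standard deviation for the seven restaurant sales values can be reproduced with the Python sketch below. It spells out the n − 1 divisor and then compares the result with `statistics.stdev` from the standard library, which uses the same divisor. The variable names are illustrative.

```python
import math
from statistics import stdev

# Restaurant sales values in dollars, n = 7.
sales = [30, 19.50, 22.40, 27, 17.50, 25.40, 23.50]

n = len(sales)
x_bar = sum(sales) / n                                # sample mean

# Sample variance divides the sum of squared deviations by n - 1.
sum_sq_dev = sum((x - x_bar) ** 2 for x in sales)
s = math.sqrt(sum_sq_dev / (n - 1))

print(f"x_bar = {x_bar:.2f}, s = {s:.2f}, library stdev = {stdev(sales):.2f}")
```

Dividing by n − 1 rather than n is what distinguishes the sample formula from the population formula.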


Sample Standard Deviation – Example Using Excel 2016 (1 of 2)

Sample of starting salaries for college graduates

n = 7


Sample Standard Deviation – Example Using Excel 2016 (2 of 2)

Fitness Club Membership Ages


Interpreting the Standard Deviation Health Club Example

Standard deviation is a measure of distance from the mean.


Section 3.3 Using the Mean and Standard Deviation Together

Coefficient of Variation (CV)

The ratio of the standard deviation to the mean, expressed as a percentage. The coefficient of variation is used to measure variation relative to the mean.

Measures relative variation

Always expressed as a percentage (%)

Shows variation relative to mean


Coefficient of Variation

Is used to compare two or more sets of data measured in different units

Population CV: CV = (σ/μ)(100)%

Sample CV: CV = (s/x̄)(100)%


Comparing Coefficients of Variation

Stock A:

Average Price = $50 = x̄

Standard Deviation = $5 = s

CV = (s/x̄)(100) = (5/50)(100) = 10%

Stock B:

Average Price = $100 = x̄

Standard Deviation = $5 = s

CV = (s/x̄)(100) = (5/100)(100) = 5%

Both stocks have the same standard deviation, but stock B is less variable relative to its price.
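
The stock comparison can be expressed as a small Python sketch. The function name `coefficient_of_variation` is a hypothetical helper, not something defined in the text.

```python
def coefficient_of_variation(std_dev, mean_value):
    """CV as a percentage: (standard deviation / mean) * 100."""
    return 100 * std_dev / mean_value

cv_a = coefficient_of_variation(std_dev=5, mean_value=50)    # Stock A
cv_b = coefficient_of_variation(std_dev=5, mean_value=100)   # Stock B

print(f"Stock A CV = {cv_a}%, Stock B CV = {cv_b}%")
```

Because the CV is unit-free, the same function can be used to compare data sets measured in different units.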


The Empirical Rule (1 of 2)

If the data distribution is bell shaped, then the interval

μ ± 1σ contains approximately 68% of the values

μ ± 2σ contains approximately 95% of the values

μ ± 3σ contains virtually all of the data values.


The Empirical Rule (2 of 2)


The Empirical Rule - Example

The travel distances for the Uber service in a major city have a bell-shaped distribution with a mean of 15.1 miles and a standard deviation of 3.1 miles per trip. Use the Empirical Rule to describe the distribution.

Assuming a bell-shaped distribution, approximately 68% of the trips should be within μ ± 1σ = 15.1 ± (1)(3.1), that is, between 12.0 miles and 18.2 miles.

Assuming a bell-shaped distribution, approximately 95% of the trips should be within μ ± 2σ = 15.1 ± (2)(3.1), that is, between 8.9 miles and 21.3 miles.
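
The Empirical Rule intervals for this example can be tabulated with a few lines of Python. The rounding to one decimal place and the variable names are choices made for this sketch.

```python
mu, sigma = 15.1, 3.1   # mean and standard deviation of trip distance (miles)

# Approximate share of values within k standard deviations (bell-shaped data).
coverage = {1: "about 68%", 2: "about 95%", 3: "virtually all"}

intervals = {
    k: (round(mu - k * sigma, 1), round(mu + k * sigma, 1))
    for k in coverage
}

for k, (low, high) in intervals.items():
    print(f"mu +/- {k} sigma: {low} to {high} miles ({coverage[k]} of trips)")
```

These percentages only apply when the distribution is reasonably bell shaped.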


Tchebysheff’s Theorem

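Regardless of how data are distributed, at least (1 − 1/k²) of the values will fall within k standard deviations of the mean.

Examples:

k = 1: at least (1 − 1/1²) = 0% of the values fall within μ ± 1σ

k = 2: at least (1 − 1/2²) = 75% of the values fall within μ ± 2σ

k = 3: at least (1 − 1/3²) = 89% of the values fall within μ ± 3σ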
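
The Tchebysheff bound 1 − 1/k² can be evaluated for any k with a short Python sketch. The function name `chebyshev_bound` is hypothetical, and clamping the result at zero for k below 1 is a choice made here, since the bound carries no information in that range.

```python
def chebyshev_bound(k):
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 0:
        raise ValueError("k must be positive")
    # For k < 1 the formula is negative, so the guaranteed fraction is 0.
    return max(0.0, 1 - 1 / k ** 2)

bounds = {k: chebyshev_bound(k) for k in (1, 2, 3)}

for k, fraction in bounds.items():
    print(f"k = {k}: at least {fraction:.0%} within mu +/- {k} sigma")
```

Unlike the Empirical Rule, these are guaranteed minimums for any distribution, which is why they are lower than the bell-shaped percentages.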


Standardized Data Values

The number of standard deviations a value is from the mean

Standardized data values are also referred to as z scores.

Population z score: z = (x − μ)/σ

Sample z score: z = (x − x̄)/s

x – data value

μ – population mean

σ – population standard deviation

x̄ – sample mean

s – sample standard deviation


Converting Data to Standardized Values

Step 1: Collect the population or sample values for the quantitative variable of interest.

Step 2: Compute the population mean and standard deviation or the sample mean and standard deviation.

Step 3: Convert the values to standardized z-values.


Standardized Value Calculation - Example

IQ scores have a distribution thought to be bell shaped with a mean equal to 100 and a standard deviation equal to 15. Suppose a person has an IQ of 121. Calculate the standardized IQ.

z = (x − μ)/σ = (121 − 100)/15 = 1.40

A person with an IQ of 121 has an IQ that is 1.40 standard deviations higher than the population mean IQ.
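
The standardization step can be sketched in Python. The function name `z_score` is a hypothetical helper, and the same function works for a sample if the sample mean and sample standard deviation are passed instead.

```python
def z_score(x, mean_value, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean_value) / std_dev

z = z_score(x=121, mean_value=100, std_dev=15)
print(f"z = {z:.2f}")
```

A positive z score means the value is above the mean, and a negative z score means it is below.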


Data Standardization – Example University Entrance Test

How did you do on a relative basis at each university?


Entrance Test Scores

Entrance Score Was Relatively Better at University A than at University B


University Entrance Score Analysis


Coefficient of Variation (CV) – Example University Entrance Scores

The Coefficient of Variation measures the relative dispersion in a set of data. The CV is used to compare two or more sets of data with respect to variation.

Population CV = (σ/μ)(100)

University A: μ = 70, σ = 10, CV = (10/70)(100) = 14.3%

University B: μ = 750, σ = 70, CV = (70/750)(100) = 9.3%

University A had more variable entrance scores.


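Equations and Worked Calculations

Population Mean (Parameter)

\[ \mu = \frac{\sum_{i=1}^{N} x_i}{N} \]

\( x_i \) - the ith individual value of variable x

Population Mean - Example (hospital census, N = 10)

\[ \mu = \frac{\sum x_i}{N} = \frac{216 + 255 + 330 + \cdots + 310 + 295}{10} = \frac{2{,}884}{10} = 288.4 \]

Population Mean – Example (San Carlo Hotel, variables \( x_1, x_2, x_3 \); Rooms Rented)

\[ \mu = \frac{\sum x}{N} = \frac{22 + 13 + 10 + 16 + 23 + 13 + 11 + 13}{8} = \frac{121}{8} = 15.125 \]

Sample Mean (Statistic)

\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \]

Sample Mean - Example (sales data, n = 10)

\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{20 + 5 + 10 + 40 + 50 + \cdots + 90}{10} = \frac{300}{10} = 30.0 \]

Impact of Extreme Values on the Mean - Sales Example

Original data: \[ \bar{x} = \frac{\sum x}{n} = 30 \]

New data (one value changed): \[ \bar{x} = \frac{\sum x}{n} = 60 \]

Median Index

\[ i = \frac{1}{2} n \]

Computing the Median (starting salaries, n = 7)

\[ i = \frac{1}{2}(7) = 3.5 \quad \text{rounded up to} \quad i = 4 \]

Median Example - Even Number of Data (sales data, n = 10)

\[ i = \frac{1}{2}(10) = 5, \qquad \text{Median} = \frac{20 + 20}{2} = 20 \]

Median - Impact of Extreme Values (sales data)

\[ \text{Median} = \frac{20 + 20}{2} = 20 \]

The Mode (sample data)

1 1 1 1 1 1 2 2 3 3 3 4 4 5

Weighted Mean

Population: \[ \mu_w = \frac{\sum w_i x_i}{\sum w_i} \]

Sample: \[ \bar{x}_w = \frac{\sum w_i x_i}{\sum w_i} \]

\( w_i \) - the weight of the ith data value; \( x_i \) - the ith data value

Weighted Mean Example: Financial Portfolio

\[ \mu_w = \frac{\sum w x}{\sum w} = \frac{(7.2)(2{,}000) + (8.3)(5{,}000) + (5.4)(12{,}000)}{2{,}000 + 5{,}000 + 12{,}000} = 6.35\% \]

Percentile Location Index

\[ i = \frac{p}{100}(n), \quad \text{where } 0 \le p \le 100 \]

Calculating Percentiles Example (80th percentile, n = 30)

\[ i = \frac{p}{100}(n) = \frac{80}{100}(30) = 24, \qquad \text{80th percentile} = \frac{20.5 + 21}{2} = 20.75 \]

Calculating Quartiles (ordered array of 19 values)

36 40 42 46 51 56 62 65 71 74 78 82 84 87 88 90 92 95 97

\[ i = \frac{q}{100}(n) = \frac{25}{100}(19) = 4.75, \qquad Q_1 = 51 \]

Box and Whisker Plot

\[ \mathrm{IQR} = Q_3 - Q_1 \]

The Mean – Caution with Ordinal Data

Education codes 1 4 3 4 2: \[ \bar{x} = \frac{\sum x}{n} = \frac{14}{5} = 2.8 \] (????)

5 point scale responses, n = 9: \[ \bar{x} = \frac{\sum x}{n} = \frac{27}{9} = 3 \] (????)

Interquartile Range

\[ \text{Interquartile Range} = Q_3 - Q_1 \]

Population Variance

\[ \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} \]

Computing the Variance

Department A: \[ \sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{2}{5} = 0.40 \]

Department B: \[ \sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{400}{5} = 80 \]

Population Standard Deviation

\[ \sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}} \]

Department A: \[ \sigma^2 = 0.40, \qquad \sigma = \sqrt{0.40} = 0.632 \]

Department B: \[ \sigma^2 = 80, \qquad \sigma = \sqrt{80} = 8.94 \]

Sample Variance and Sample Standard Deviation

\[ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}, \qquad s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \]

Sample Standard Deviation: Restaurant Sales Example (n = 7)

\[ \bar{x} = \frac{\sum x}{n} = \frac{165.3}{7} = 23.61 \]

\[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{(30 - 23.61)^2 + (19.5 - 23.61)^2 + \cdots + (23.5 - 23.61)^2}{7 - 1}} = 4.31 \]

Fitness Club Membership Ages

\[ \bar{x} = 34.84, \qquad s = 11.18 \]

Coefficient of Variation

Population: \[ CV = \frac{\sigma}{\mu}(100)\% \]

Sample: \[ CV = \frac{s}{\bar{x}}(100)\% \]

Comparing Coefficients of Variation

Stock A: \[ CV = \frac{s}{\bar{x}}(100) = \frac{5}{50}(100) = 10\% \]

Stock B: \[ CV = \frac{s}{\bar{x}}(100) = \frac{5}{100}(100) = 5\% \]

The Empirical Rule

\[ \mu \pm 1\sigma, \qquad \mu \pm 2\sigma, \qquad \mu \pm 3\sigma \]

The Empirical Rule - Example

\[ \mu = 15.1 \text{ miles}, \qquad \sigma = 3.1 \text{ miles} \]

\[ \mu \pm (1)(\sigma) = 15.1 \pm (1)(3.1): \quad 12.0 \text{ miles to } 18.2 \text{ miles} \]

\[ \mu \pm (2)(\sigma) = 15.1 \pm (2)(3.1): \quad 8.9 \text{ miles to } 21.3 \text{ miles} \]

Tchebysheff’s Theorem

\[ \left( 1 - \frac{1}{k^2} \right) \]

\[ k = 1: \ \text{at least } \left( 1 - \frac{1}{1^2} \right) = 0\% \text{ within } \mu \pm 1\sigma \]

\[ k = 2: \ \text{at least } \left( 1 - \frac{1}{2^2} \right) = 75\% \text{ within } \mu \pm 2\sigma \]

\[ k = 3: \ \text{at least } \left( 1 - \frac{1}{3^2} \right) = 89\% \text{ within } \mu \pm 3\sigma \]

Standardized Data Values

Population z score: \[ z = \frac{x - \mu}{\sigma} \]

Sample z score: \[ z = \frac{x - \bar{x}}{s} \]

Standardized Value Calculation - Example

\[ \mu = 100, \quad \sigma = 15, \qquad z = \frac{x - \mu}{\sigma} = \frac{121 - 100}{15} = 1.40 \]

Data Standardization – Example University Entrance Test

\[ z = \frac{x - \mu}{\sigma} \]

Coefficient of Variation (CV) – Example University Entrance Scores

\[ CV = \frac{\sigma}{\mu}(100) \]

University A: \[ \mu = 70, \quad \sigma = 10, \qquad CV = \frac{10}{70}(100) = 14.3\% \]

University B: \[ \mu = 750, \quad \sigma = 70, \qquad CV = \frac{70}{750}(100) = 9.3\% \]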