QNT/351 Descriptive Statistics – Real Estate Data Part 1

PRINTED BY: [email protected]. Printing is for personal, private use only. No part of this book may be reproduced or transmitted without publisher's prior permission. Violators will be prosecuted.

THE KENTUCKY DERBY is held the first Saturday in May at Churchill Downs in Louisville, Kentucky. The race track is one and one-quarter miles. The table in Exercise 82 shows the winners since 1990, their margin of victory, the winning time, and the payoff on a $2 bet. Determine the mean and median for the variables winning time and payoff on a $2 bet. (See Exercise 82 and LO 3-1.)

Learning Objectives

When you have completed this chapter, you will be able to:

LO 3-1. Compute and interpret the mean, the median, and the mode.

LO 3-2. Compute a weighted mean.

LO 3-3. Compute and interpret the geometric mean.

LO 3-4. Compute and interpret the range, variance, and standard deviation.

LO 3-5. Explain and apply Chebyshev’s theorem and the Empirical Rule.

LO 3-6. Compute the mean and standard deviation of grouped data.

Did you ever meet the “average” American man? Well, his name is Robert (that is the nominal level of measurement) and he is 31 years old (that is the ratio level), is 69.5 inches tall (again the ratio level of measurement), weighs 172 pounds, wears a size 9½ shoe, has a 34-inch waist, and wears a size 40 suit. In addition, the average man eats 4 pounds of potato chips, watches 1,456 hours of TV, and eats 26 pounds of bananas each year, and also sleeps 7.7 hours per night.

The average American woman is 5′ 4″ tall and weighs 140 pounds, while the average American model is 5′ 11″ tall and weighs 117 pounds. On any given day, almost half of the women in the United States are on a diet. Idolized in the 1950s, Marilyn Monroe would be considered overweight by today’s standards. She fluctuated between a size 14 and a size 18 dress, and was a healthy and attractive woman.

INTRODUCTION

Chapter 2 began our study of descriptive statistics. To summarize raw data into a meaningful form, we organized qualitative data into a frequency table and portrayed the results in a bar chart. In a similar fashion, we organized quantitative data into a frequency distribution and portrayed the results in a histogram. We also looked at other graphical techniques such as pie charts to portray qualitative data and frequency polygons to portray quantitative data.

This chapter is concerned with two numerical ways of describing quantitative variables, namely, measures of location and measures of dispersion. Measures of location are often referred to as averages. The purpose of a measure of location is to pinpoint the center of a distribution of data. An average is a measure of location that shows the central value of the data. Averages appear daily on TV, on various websites, in the newspaper, and in other journals. Here are some examples:

The average U.S. home changes ownership every 11.8 years.

An American receives an average of 568 pieces of mail per year.

The average American home has more TV sets than people. There are 2.73 TV sets and 2.55 people in the typical home.

The average American couple spends $20,398 for their wedding, while their budget is 50% less. This does not include the cost of a honeymoon or engagement ring.

The average price of a theater ticket in the United States is $7.50, according to the National Association of Theatre Owners.

If we consider only measures of location in a set of data, or if we compare several sets of data using central values, we may draw an erroneous conclusion. In addition to measures of location, we should consider the dispersion, often called the variation or the spread, in the data. As an illustration, suppose the average annual income of executives for Internet-related companies is $80,000, and the average income for executives in pharmaceutical firms is also $80,000. If we looked only at the average incomes, we might wrongly conclude that the distributions of the two salaries are the same. However, we need to examine the dispersion or spread of the distributions of salary. A look at the salary ranges indicates that this conclusion of equal distributions is not correct. The salaries for the executives in the Internet firms range from $70,000 to $90,000, but salaries for the executives in pharmaceutical firms range from $40,000 to $120,000. Thus, we conclude that although the average salaries are the same for the two industries, there is much more spread or dispersion in salaries for the pharmaceutical executives. To describe the dispersion, we will consider the range, the variance, and the standard deviation.
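As a quick numerical check of the salary illustration, the sketch below computes the range (maximum minus minimum) from the endpoints quoted above. The function and variable names are illustrative assumptions, not part of the text.

```python
# Salary endpoints quoted in the illustration above (in dollars).
internet_low, internet_high = 70_000, 90_000
pharma_low, pharma_high = 40_000, 120_000


def value_range(low, high):
    """Range = maximum value minus minimum value."""
    return high - low


internet_range = value_range(internet_low, internet_high)
pharma_range = value_range(pharma_low, pharma_high)

print(internet_range)  # 20000
print(pharma_range)    # 80000
```

Both groups share the same average, yet the pharmaceutical range is four times as wide, which is the point the illustration makes.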

LO3-1

Compute and interpret the mean, the median, and the mode.

MEASURES OF LOCATION

We begin by discussing measures of location. There is not just one measure of location; in fact, there are many. We will consider five: the arithmetic mean, the median, the mode, the weighted mean, and the geometric mean. The arithmetic mean is the most widely used and widely reported measure of location. We study the mean as both a population parameter and a sample statistic.

The Population Mean

Many studies involve all the values in a population. For example, there are 12 sales associates employed at the Reynolds Road Carpet Outlet. The mean amount of commission they earned last month was $1,345. This is a population value because we considered the commission of all the sales associates. Other examples of a population mean would be:

The mean closing price for Johnson & Johnson stock for the last 5 days is $64.75.

The mean number of hours of overtime worked last week by the six welders in the welding department of Butts Welding Inc. is 6.45 hours.

Caryn Tirsch began a website last month devoted to organic gardening. The mean number of hits on her site for the 31 days in July was 84.36.

For raw data (that is, data that have not been grouped in a frequency distribution), the population mean is the sum of all the values in the population divided by the number of values in the population. To find the population mean, we use the following formula.

Population mean = (Sum of all the values in the population) / (Number of values in the population)

Instead of writing out in words the full directions for computing the population mean (or any other measure), it is more convenient to use the shorthand symbols of mathematics. The mean of the population using mathematical symbols is:

$\mu = \frac{\Sigma x}{N}$   (3–1)

where:

μ represents the population mean. It is the Greek lowercase letter “mu.”

N is the number of values in the population.

x represents any particular value.

Σ is the Greek capital letter “sigma” and indicates the operation of adding.

Σx is the sum of the x values in the population.

Any measurable characteristic of a population is called a parameter. The mean of a population is an example of a parameter.

PARAMETER A characteristic of a population.

EXAMPLE

There are 42 exits on I-75 through the state of Kentucky. Listed below are the distances between exits (in miles).

Why is this information a population? What is the mean number of miles between exits?

SOLUTION

This is a population because we are considering all the exits on I-75 in Kentucky. We add the distances between each of the 42 exits. The total distance is 192 miles. To find the arithmetic mean, we divide this total by 42. So the arithmetic mean is 4.57 miles, found by 192/42. From formula (3–1):

$\mu = \frac{\Sigma x}{N} = \frac{192}{42} = 4.57$

How do we interpret the value of 4.57? It is the typical number of miles between exits. Because we considered all the exits on I-75 in Kentucky, this value is a population parameter.
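The arithmetic in this solution can be reproduced with a minimal sketch. It uses only the totals stated in the solution (192 miles over 42 values); the function name `population_mean` is an illustrative assumption.

```python
def population_mean(total, count):
    """Population mean: the sum of all values divided by the number of values."""
    return total / count


total_miles = 192  # sum of the distances, as stated in the solution
n_values = 42      # number of values, as stated in the solution

mu = population_mean(total_miles, n_values)
print(round(mu, 2))  # 4.57
```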

The Sample Mean

As explained in Chapter 1, we often select a sample from the population to estimate a specific characteristic of the population. Smucker’s quality assurance department needs to be assured that a jar of orange marmalade labeled as containing 12 ounces actually contains that amount. It would be very expensive and time-consuming to check the weight of each jar. Therefore, a sample of 20 jars is selected, the mean of the sample is determined, and that value is used to estimate the amount in each jar.

For raw data (that is, ungrouped data), the mean is the sum of all the sampled values divided by the total number of sampled values. To find the mean for a sample:

Sample mean = (Sum of all the values in the sample) / (Number of values in the sample)

The mean of a sample and the mean of a population are computed in the same way, but the shorthand notation used is different. The formula for the mean of a sample is:

$\bar{x} = \frac{\Sigma x}{n}$   (3–2)

where:

x̄ represents the sample mean. It is read “x bar.”

n is the number of values in the sample.

x represents any particular value.

Σ is the Greek capital letter “sigma” and indicates the operation of adding.

Σx is the sum of the x values in the sample.

The mean of a sample, or any other measure based on sample data, is called a statistic. If the mean weight of a sample of 10 jars of Smucker’s orange marmalade is 11.5 ounces, this is an example of a statistic.

STATISTIC A characteristic of a sample.

EXAMPLE

Verizon is studying the number of minutes used by clients in a particular cell phone rate plan. A random sample of 12 clients showed the following number of minutes used last month.

What is the arithmetic mean number of minutes used?

SOLUTION

Using formula (3–2), the sample mean is:

$\bar{x} = \frac{\Sigma x}{n} = \frac{1170}{12} = 97.5$

The arithmetic mean number of minutes used last month by the sample of cell phone users is 97.5 minutes.

Properties of the Arithmetic Mean

The arithmetic mean is a widely used measure of location. It has several important properties:

1. To compute a mean, the data must be measured at the interval or ratio level. Recall from Chapter 1 that ratio-level data include such data as ages, incomes, and weights, with the distance between numbers being constant.

2. All the values are included in computing the mean.

3. The mean is unique. That is, there is only one mean in a set of data. Later in the chapter, we will discover a measure of location that might appear twice, or more than twice, in a set of data.

4. The sum of the deviations of each value from the mean is zero. Expressed symbolically:

$\Sigma (x - \bar{x}) = 0$

As an example, the mean of 3, 8, and 4 is 5. Then:

$\Sigma (x - \bar{x}) = (3 - 5) + (8 - 5) + (4 - 5) = -2 + 3 - 1 = 0$

Thus, we can consider the mean as a balance point for a set of data. To illustrate, we have a long board with the numbers 1, 2, 3, . . . , 9 evenly spaced on it. Suppose three bars of equal weight were placed on the board at numbers 3, 4, and 8, and the balance point was set at 5, the mean of the three numbers. We would find that the board is balanced perfectly! The deviations below the mean (−3) are equal to the deviations above the mean (+3). Shown schematically:
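The same balance can be verified numerically. The following minimal sketch uses the three values from the text (3, 8, and 4) and confirms that the deviations from the mean sum to zero; the variable names are illustrative.

```python
values = [3, 8, 4]

mean = sum(values) / len(values)
deviations = [x - mean for x in values]

print(mean)             # 5.0
print(deviations)       # [-2.0, 3.0, -1.0]
print(sum(deviations))  # 0.0
```

With data that are not whole numbers, floating-point rounding can leave a tiny nonzero remainder, so a tolerance-based comparison such as `math.isclose(total, 0.0, abs_tol=1e-9)` is the safer check in general.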

The mean does have a weakness. Recall that the mean uses the value of every item in a sample, or population, in its computation. If one or two of these values are either extremely large or extremely small compared to the majority of data, the mean might not be an appropriate average to represent the data. For example, suppose the annual incomes of a sample of financial planners at Merrill Lynch are $62,900, $61,600, $62,500, $60,800, and $1,200,000. The mean income is $289,560. Obviously, it is not representative of this group because all but one financial planner has an income in the $60,000 to $63,000 range. One income ($1.2 million) is unduly affecting the mean.
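To see how strongly a single extreme value pulls the mean, the sketch below recomputes the mean of the Merrill Lynch incomes with and without the $1.2 million income. The incomes come from the text; the comparison without the extreme value is an added illustration.

```python
incomes = [62_900, 61_600, 62_500, 60_800, 1_200_000]

# Mean of all five incomes, as reported in the text.
mean_all = sum(incomes) / len(incomes)

# Mean after setting aside the single extreme income (illustration only).
typical = [x for x in incomes if x < 1_000_000]
mean_without_extreme = sum(typical) / len(typical)

print(mean_all)              # 289560.0
print(mean_without_extreme)  # 61950.0
```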

3-1

1. The annual incomes of a sample of middle-management employees at Westinghouse are $62,900, $69,100, $58,300, and $76,800.

(a) Give the formula for the sample mean.

(b) Find the sample mean.

(c) Is the mean you computed in (b) a statistic or a parameter? Why?

(d) What is your best estimate of the population mean?

2. All the students in advanced Computer Science 411 are a population. Their course grades are 92, 96, 61, 86, 79, and 84.

(a) Give the formula for the population mean.

(b) Compute the mean course grade.

(c) Is the mean you computed in (b) a statistic or a parameter? Why?

EXERCISES

The answers to the odd-numbered exercises are in Appendix D.

For DATA FILE, please visit www.mhhe.com/lind16e

1. Compute the mean of the following population values: 6, 3, 5, 7, 6.

2. Compute the mean of the following population values: 7, 5, 7, 3, 7, 4.

3. a. Compute the mean of the following sample values: 5, 9, 4, 10.

b. Show that Σ(x − x̄) = 0.

4. a. Compute the mean of the following sample values: 1.3, 7.0, 3.6, 4.1, 5.0.

b. Show that Σ(x − x̄) = 0.

5. Compute the mean of the following sample values: 16.25, 12.91, 14.58.

6. Suppose you go to the grocery store and spend $61.85 for the purchase of 14 items. What is the mean price per item?

For Exercises 7–10, (a) compute the arithmetic mean and (b) indicate whether it is a statistic or a parameter.

7. There are 10 salespeople employed by Midtown Ford. The numbers of new cars sold last month by the respective salespeople were 15, 23, 4, 19, 18, 10, 10, 8, 28, 19.

8. The accounting department at a mail-order company counted the following numbers of incoming calls per day to the company’s toll-free number during the first 7 days in May: 14, 24, 19, 31, 36, 26, 17.

9. The Cambridge Power and Light Company selected a random sample of 20 residential customers. Following are the amounts, to the nearest dollar, the customers were charged for electrical service last month:

10. The Human Relations Director at Ford began a study of the overtime hours in the Inspection Department. A sample of 15 workers showed they worked the following number of overtime hours last month.

11. AAA Heating and Air Conditioning completed 30 jobs last month with a mean revenue of $5,430 per job. The president wants to know the total revenue for the month. Based on the limited information, can you compute the total revenue? What is it?

12. A large pharmaceutical company hires business administration graduates to sell its products. The company is growing rapidly and dedicates only 1 day of sales training for new salespeople. The company’s goal for new salespeople is $10,000 per month. The goal is based on the current mean sales for the entire company, which is $10,000 per month. After reviewing the retention rates of new employees, the company finds that only 1 in 10 new employees stays longer than 3 months. Comment on using the current mean sales per month as a sales goal for new employees. Why do new employees leave the company?

The Median

We have stressed that, for data containing one or two very large or very small values, the arithmetic mean may not be representative. The center for such data can be better described by a measure of location called the median.

To illustrate the need for a measure of location other than the arithmetic mean, suppose you are seeking to buy a condominium in Palm Aire. Your real estate agent says that the typical price of the units currently available is $110,000. Would you still want to look? If you had budgeted your maximum purchase price at $75,000, you might think they are out of your price range. However, checking the prices of the individual units might change your mind. They are $60,000, $65,000, $70,000, and $80,000, and a superdeluxe penthouse costs $275,000. The arithmetic mean price is $110,000, as the real estate agent reported, but one price ($275,000) is pulling the arithmetic mean upward, causing it to be an unrepresentative average. It does seem that a price around $70,000 is a more typical or representative average, and it is. In cases such as this, the median provides a more valid measure of location.

MEDIAN The midpoint of the values after they have been ordered from the minimum to the maximum values.

The median price of the units available is $70,000. To determine this, we order the prices from the minimum value ($60,000) to the maximum value ($275,000) and select the middle value ($70,000). For the median, the data must be at least an ordinal level of measurement.

Note that there is the same number of prices below the median of $70,000 as above it. The median is, therefore, unaffected by extremely low or high prices. Had the highest price been $90,000, or $300,000, or even $1 million, the median price would still be $70,000. Likewise, had the lowest price been $20,000 or $50,000, the median price would still be $70,000.
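The robustness described above can be checked with a short sketch. It uses the five condominium prices from the text and Python's standard `statistics.median`; the loops simply substitute the alternative highest and lowest prices mentioned in the paragraph.

```python
from statistics import median

prices = [60_000, 65_000, 70_000, 80_000, 275_000]

mean_price = sum(prices) / len(prices)
median_price = median(prices)
print(mean_price)    # 110000.0
print(median_price)  # 70000

# Replace the highest price with each alternative mentioned in the text.
medians_high = [median(prices[:-1] + [top]) for top in (90_000, 300_000, 1_000_000)]

# Replace the lowest price with each alternative mentioned in the text.
medians_low = [median([low] + prices[1:]) for low in (20_000, 50_000)]

print(medians_high)  # [70000, 70000, 70000]
print(medians_low)   # [70000, 70000]
```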

In the previous illustration, there is an odd number of observations (five). How is the median determined for an even number of observations? As before, the observations are ordered. Then, by convention, to obtain a unique value we calculate the mean of the two middle observations. So for an even number of observations, the median may not be one of the given values.
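The convention for odd and even counts can be written out step by step. This is a minimal sketch, not a library routine: the function name `median_by_convention` is illustrative, the odd-sized list repeats the condominium prices from the text, and the even-sized list is hypothetical data chosen only to show the averaging step.

```python
def median_by_convention(values):
    """Order the values, then take the middle one (odd count)
    or the mean of the two middle ones (even count)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_prices = [60_000, 65_000, 70_000, 80_000, 275_000]  # from the text
even_values = [30, 12, 20, 15]                          # hypothetical data

print(median_by_convention(odd_prices))   # 70000
print(median_by_convention(even_values))  # 17.5
```

In the even case the result, 17.5, is not one of the observed values, which is the situation the paragraph above describes.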

EXAMPLE

Facebook is a popular social networking website. Users can add friends and send them messages, and update their personal profiles to notify friends about themselves and their activities. A sample of 10 adults revealed they spent the following number of hours last month using Facebook.

Find the median number of hours.

SOLUTION

Note that the number of adults sampled is even (10). The first step, as before, is to order the hours using Facebook from the minimum value to the maximum value. Then identify the two middle times. The arithmetic mean of the two middle observations gives us the median hours. Arranging the values from minimum to maximum:

The median is found by averaging the two middle values. The middle values are 5 hours and 7 hours, and the mean of these two values is 6. We conclude that the typical adult Facebook user spends 6 hours per month at the website. Notice that the median is not one of the values. Also, half of the times are below the median and half are above it.

The major properties of the median are:

1. It is not affected by extremely large or small values. Therefore, the median is a valuable measure of location when such values do occur.

2. It can be computed for ordinal-level data or higher. Recall from Chapter 1 that ordinal-level data can be ranked from low to high.

The Mode

The mode is another measure of location.

MODE The value of the observation that appears most frequently.

The mode is especially useful in summarizing nominal-level data. As an example of its use for nominal-level data, a company has developed five bath oils. The bar chart in Chart 3–1 shows the results of a marketing survey designed to find which bath oil consumers prefer. The largest number of respondents favored Lamoure, as evidenced by the highest bar. Thus, Lamoure is the mode.

CHART 3–1 Number of Respondents Favoring Various Bath Oils

EXAMPLE

Recall the data regarding the distance in miles between exits on I-75 in Kentucky. The information is repeated below.

What is the modal distance?

SOLUTION

The first step is to organize the distances into a frequency table. This will help us determine the distance that occurs most frequently.

The distance that occurs most often is 1 mile. This happens eight times—that is, there are eight exits that are 1 mile apart. So the modal distance between exits is 1 mile.

Which of the three measures of location (mean, median, or mode) best represents the central location of these data? Is the mode the best measure of location to represent the Kentucky data? No. The mode assumes only the nominal scale of measurement and the variable miles is measured using the ratio scale. We calculated the mean to be 4.57 miles. See page 53. Is the mean the best measure of location to represent these data? Probably not. There are several cases in which the distance between exits is large. These values are affecting the mean, making it too large and not representative of the distances between exits. What about the median? The median distance is 3 miles. That is, half of the distances between exits are 3 miles or less. In this case, the median of 3 miles between exits is probably a more representative measure of the distance between exits.

In summary, we can determine the mode for all levels of data—nominal, ordinal, interval, and ratio. The mode also has the advantage of not being affected by extremely high or low values.

The mode does have disadvantages, however, that cause it to be used less frequently than the mean or median. For many sets of data, there is no mode because no value appears more than once. For example, there is no mode for this set of price data because every value occurs once: $19, $21, $23, $20, and $18. Conversely, for some data sets there is more than one mode. Suppose the ages of the individuals in a stock investment club are 22, 26, 27, 27, 31, 35, and 35. Both the ages 27 and 35 are modes. Thus, this grouping of ages is referred to as bimodal (having two modes). One would question the use of two modes to represent the location of this set of age data.
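Both situations described above (no mode, and more than one mode) can be reproduced with a frequency count. The sketch below uses the price and age data from the text; the helper name `modes` is an illustrative assumption, and it follows the text's convention of reporting no mode when no value appears more than once.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), or an empty list when
    no value appears more than once."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)


prices = [19, 21, 23, 20, 18]
ages = [22, 26, 27, 27, 31, 35, 35]

print(modes(prices))  # []
print(modes(ages))    # [27, 35]
```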

3-2

1. A sample of single persons in Towson, Texas, receiving Social Security payments revealed these monthly benefits: $852, $598, $580, $1,374, $960, $878, and $1,130.

(a) What is the median monthly benefit?

(b) How many observations are below the median? Above it?

2. The number of work stoppages in the automobile industry for selected months are 6, 0, 10, 14, 8, and 0.

(a) What is the median number of stoppages?

(b) How many observations are below the median? Above it?

(c) What is the modal number of work stoppages?

EXERCISES

13. What would you report as the modal value for a set of observations if there were a total of:

a. 10 observations and no two values were the same?

b. 6 observations and they were all the same?

c. 6 observations and the values were 1, 2, 3, 3, 4, and 4?

For Exercises 14–16, determine the (a) mean, (b) median, and (c) mode.

14. The following is the number of oil changes for the last 7 days at the Jiffy Lube located at the corner of Elm Street and Pennsylvania Avenue.

15. The following is the percent change in net income from last year to this year for a sample of 12 construction companies in Denver.

16. The following are the ages of the 10 people in the video arcade at the Southwyck Shopping Mall at 10 a.m.

17. Several indicators of long-term economic growth in the United States and their annual percent change are listed below.

a. What is the median percent change?

b. What is the modal percent change?

18. Sally Reynolds sells real estate along the coastal area of Northern California. Below are her total annual commissions between 2002 and 2012. Find the mean, median, and mode of the commissions she earned for the 11 years.

19. The accounting firm of Rowatti and Koppel specializes in income tax returns for self-employed professionals, such as physicians, dentists, architects, and lawyers. The firm employs 11 accountants who prepare the returns. For last year, the number of returns prepared by each accountant was:

Find the mean, median, and mode for the number of returns prepared by each accountant. If you could report only one, which measure of location would you recommend reporting?

20. The demand for the video games provided by Mid-Tech Video Games Inc. has exploded in the last several years. Hence, the owner needs to hire several new technical people to keep up with the demand. Mid-Tech gives each applicant a special test that Dr. McGraw, the designer of the test, believes is closely related to the ability to create video games. For the general population, the mean on this test is 100. Below are the scores on this test for the applicants.

The president is interested in the overall quality of the job applicants based on this test. Compute the mean and the median scores for the 10 applicants. What would you report to the president? Does it seem that the applicants are better than the general population?

The Relative Positions of the Mean, Median, and Mode

Refer to the histogram in Chart 3–2. It is a symmetric distribution, which is also mound-shaped. This distribution has the same shape on either side of the center. If the histogram were folded in half, the two halves would be identical. For a symmetric, mound-shaped distribution, the mode, median, and mean are located at the center and are equal. They are all equal to 30 years in Chart 3–2. We should point out that there are symmetric distributions that are not mound-shaped.

CHART 3–2 A Symmetric Distribution

The number of years corresponding to the highest point of the curve is the mode (30 years). Because the distribution is symmetrical, the median corresponds to the point where the distribution is cut in half (30 years). Also, because the arithmetic mean is the balance point of a distribution (as shown on page 55), and the distribution is symmetric, the arithmetic mean is 30. Logically, any of the three measures would be appropriate to represent the distribution’s center.

If a distribution is nonsymmetrical, or skewed, the relationship among the three measures changes. In a positively skewed distribution, such as the distribution of weekly income in Chart 3–3, the arithmetic mean is the largest of the three measures. Why? Because the mean is influenced more than the median or mode by a few extremely high values. The median is generally the next largest measure in a positively skewed frequency distribution. The mode is the smallest of the three measures.

If the distribution is highly skewed, the mean would not be a good measure to use. The median and mode would be more representative.

CHART 3–3 A Positively Skewed Distribution

Conversely, if a distribution is negatively skewed, such as the distribution of tensile strength in Chart 3–4, the mean is the lowest of the three measures. The mean is, of course, influenced by a few extremely low observations. The median is greater than the arithmetic mean, and the modal value is the largest of the three measures. Again, if the distribution is highly skewed, the mean should not be used to represent the data.
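The ordering of the three measures under skewness can be illustrated with two small data sets. Both lists below are hypothetical, constructed only to have a long right tail and a long left tail respectively; they are not the data behind Charts 3–3 and 3–4. The helper name `describe` is also an assumption.

```python
from statistics import mean, median, mode


def describe(values):
    """Return (mean, median, mode) for a data set with a single mode."""
    return mean(values), median(values), mode(values)


right_skewed = [300, 320, 320, 350, 400, 900]  # hypothetical, a few high values
left_skewed = [100, 600, 650, 680, 680, 700]   # hypothetical, a few low values

r_mean, r_median, r_mode = describe(right_skewed)
l_mean, l_median, l_mode = describe(left_skewed)

print(f"mean={r_mean:.2f} median={r_median} mode={r_mode}")  # mean=431.67 median=335.0 mode=320
print(f"mean={l_mean:.2f} median={l_median} mode={l_mode}")  # mean=568.33 median=665.0 mode=680
```

In the first list the mean is the largest of the three measures and the mode the smallest; in the second the order is reversed, matching the description of positive and negative skewness.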

CHART 3–4 A Negatively Skewed Distribution

3-3

The weekly sales from a sample of Hi-Tec electronic supply stores were organized into a frequency distribution. The mean of weekly sales was computed to be $105,900, the median $105,000, and the mode $104,500.

(a) Sketch the sales in the form of a smoothed frequency polygon. Note the location of the mean, median, and mode on the X-axis.

(b) Is the distribution symmetrical, positively skewed, or negatively skewed? Explain.

EXERCISES

21. The unemployment rate in the state of Alaska by month is given in the table below:

a. What is the arithmetic mean of the Alaska unemployment rates?

b. Find the median and the mode for the unemployment rates.

c. Compute the arithmetic mean and median for just the winter (Dec–Mar) months. Is it much different?

22. Big Orange Trucking is designing an information system for use in “in-cab” communications. It must summarize data from eight sites throughout a region to describe typical conditions. Compute an appropriate measure of central location for the variables wind direction, temperature, and pavement.

Software Solution We can use a statistical software package to find many measures of location.

E X A M P L E

Table 2–4 on page 26 shows the profit on the sales of 180 vehicles at Applewood Auto Group. Determine the mean and the median profit.

S O L U T I O N

The mean, median, and modal amounts of profit are reported in the following output (highlighted in the screen shot). (Reminder: The instructions to create the output appear in the Software Commands in Appendix C.) There are 180 vehicles in the study, so using a calculator would be tedious and prone to error.

The mean profit is $1,843.17 and the median is $1,882.50. These two values are less than $40 apart, so either value is reasonable. We can also see from the Excel output that there were 180 vehicles sold and their total profit was $331,770.00. We will describe the meaning of standard error, standard deviation, and other measures reported on the output later in this chapter and in later chapters.

What can we conclude? The typical profit on a vehicle is about $1,850. Management at Applewood might use this value for revenue projections. For example, if the dealership could increase the number sold in a month from 180 to 200, this would result in an additional estimated $37,000 of revenue, found by 20($1,850).

LO3-2

Compute a weighted mean.

THE WEIGHTED MEAN The weighted mean is a convenient way to compute the arithmetic mean when there are several observations of the same value. To explain, suppose the nearby Wendy’s Restaurant sold medium, large, and Biggie-sized soft drinks for $0.90, $1.25, and $1.50, respectively. Of the last 10 drinks sold, 3 were medium, 4 were large, and 3 were Biggie-sized. To find the mean price of the last 10 drinks sold, we could use formula (3–2).

\[ \bar{x} = \frac{\$0.90 + \$0.90 + \$0.90 + \$1.25 + \$1.25 + \$1.25 + \$1.25 + \$1.50 + \$1.50 + \$1.50}{10} = \frac{\$12.20}{10} = \$1.22 \]

The mean selling price of the last 10 drinks is $1.22.

An easier way to find the mean selling price is to determine the weighted mean. That is, we multiply each observation by the number of times it occurs. We will refer to the weighted mean as \(\bar{x}_w\). This is read “x bar sub w.”

\[ \bar{x}_w = \frac{3(\$0.90) + 4(\$1.25) + 3(\$1.50)}{10} = \frac{\$12.20}{10} = \$1.22 \]

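In this case, the weights are frequency counts. However, any measure of importance could be used as a weight. In general, the weighted mean of a set of numbers designated \(x_1, x_2, x_3, \ldots, x_n\) with the corresponding weights \(w_1, w_2, w_3, \ldots, w_n\) is computed by:

\[ \bar{x}_w = \frac{w_1 x_1 + w_2 x_2 + w_3 x_3 + \cdots + w_n x_n}{w_1 + w_2 + w_3 + \cdots + w_n} \qquad (3\text{–}3) \]

This may be shortened to:

\[ \bar{x}_w = \frac{\sum (w x)}{\sum w} \]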

Note that the denominator of a weighted mean is always the sum of the weights.
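
The weighted mean calculation can be sketched in a few lines of Python. This is only an illustration and is not part of the original text: the function name `weighted_mean` is hypothetical, and the prices and counts are the soft drink figures given above.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    # Numerator: each value multiplied by its weight, then summed.
    weighted_sum = sum(w * x for x, w in zip(values, weights))
    # Denominator: always the sum of the weights.
    return weighted_sum / total_weight


# Soft drink prices and the number of drinks sold at each price.
prices = [0.90, 1.25, 1.50]
counts = [3, 4, 3]

mean_price = weighted_mean(prices, counts)
print(f"Weighted mean price: ${mean_price:.2f}")
```

With these inputs the sketch should report the same $1.22 found by listing all 10 prices individually.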

E X A M P L E

The Carter Construction Company pays its hourly employees $16.50, $19.00, or $25.00 per hour. There are 26 hourly employees, 14 of whom are paid at the $16.50 rate, 10 at the $19.00 rate, and 2 at the $25.00 rate. What is the mean hourly rate paid the 26 employees?

S O L U T I O N

To find the mean hourly rate, we multiply each of the hourly rates by the number of employees earning that rate. From formula (3–3), the mean hourly rate is

\[ \bar{x}_w = \frac{14(\$16.50) + 10(\$19.00) + 2(\$25.00)}{26} = \frac{\$471.00}{26} = \$18.1154 \]

The weighted mean hourly wage is rounded to $18.12.

3-4

Springers sold 95 Antonelli men’s suits for the regular price of $400. For the spring sale, the suits were reduced to $200 and 126 were sold. At the final clearance, the price was reduced to $100 and the remaining 79 suits were sold.

(a) What was the weighted mean price of an Antonelli suit?

(b) Springers paid $200 a suit for the 300 suits. Comment on the store’s profit per suit if a salesperson receives a $25 commission for each one sold.

E X E R C I S E S

23. In June, an investor purchased 300 shares of Oracle (an information technology company) stock at $20 per share. In August, she purchased an additional 400 shares at $25 per share. In November, she purchased an additional 400 shares, but the stock declined to $23 per share. What is the weighted mean price per share?

24. The Bookstall Inc. is a specialty bookstore concentrating on used books sold via the Internet. Paperbacks are $1.00 each, and hardcover books are $3.50. Of the 50 books sold last Tuesday morning, 40 were paperback and the rest were hardcover. What was the weighted mean price of a book?

25. The Loris Healthcare System employs 200 persons on the nursing staff. Fifty are nurse’s aides, 50 are practical nurses, and 100 are registered nurses. Nurse’s aides receive $8 an hour, practical nurses $15 an hour, and registered nurses $24 an hour. What is the weighted mean hourly wage?

26. Andrews and Associates specialize in corporate law. They charge $100 an hour for researching a case, $75 an hour for consultations, and $200 an hour for writing a brief. Last week one of the associates spent 10 hours consulting with her client, 10 hours researching the case, and 20 hours writing the brief. What was the weighted mean hourly charge for her legal services?

LO3-3

Compute and interpret the geometric mean.

THE GEOMETRIC MEAN

The geometric mean is useful in finding the average change of percentages, ratios, indexes, or growth rates over time. It has a wide application in business and economics because we are often interested in finding the percentage changes in sales, salaries, or economic figures, such as the gross domestic product, which compound or build on each other. The geometric mean of a set of n positive numbers is defined as the nth root of the product of n values. The formula for the geometric mean is written:

\[ GM = \sqrt[n]{(x_1)(x_2) \cdots (x_n)} \]

The geometric mean will always be less than or equal to (never more than) the arithmetic mean. Also, all the data values must be positive.

As an example of the geometric mean, suppose you receive a 5% increase in salary this year and a 15% increase next year. The average annual percent increase is 9.886%, not 10.0%. Why is this so? We begin by calculating the geometric mean. Recall, for example, that a 5% increase in salary is 105%. We will write it as 1.05.

\[ GM = \sqrt{(1.05)(1.15)} = \sqrt{1.2075} = 1.09886 \]

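This can be verified by assuming that your monthly earnings were $3,000 to start and you received two increases of 5% and 15%.

Raise 1 = $3,000(.05) = $150.00
Raise 2 = $3,150(.15) = $472.50
Total = $622.50

Your total salary increase is $622.50. This is equivalent to:

$3,000.00(.09886) = $296.58
$3,296.58(.09886) = $325.90
Total = $622.48, which is about $622.50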

The following example shows the geometric mean of several percentages.

E X A M P L E

The return on investment earned by Atkins Construction Company for four successive years was 30%, 20%, −40%, and 200%. What is the geometric mean rate of return on investment?

S O L U T I O N

The number 1.3 represents the 30% return on investment, which is the “original” investment of 1.0 plus the “return” of 0.3. The number 0.6 represents the loss of 40%, which is the original investment of 1.0 less the loss of 0.4. This calculation assumes the total return each period is reinvested or becomes the base for the next period. In other words, the base for the second period is 1.3 and the base for the third period is (1.3)(1.2) and so forth.

Then the geometric mean rate of return is 29.4%, found by

\[ GM = \sqrt[4]{(1.3)(1.2)(0.6)(3.0)} = \sqrt[4]{2.808} = 1.294 \]

The geometric mean is the fourth root of 2.808. So, the average rate of return (compound annual growth rate) is 29.4%.

Notice also that if you compute the arithmetic mean [(30 + 20 − 40 + 200)/4 = 52.5], you would have a much larger number, which would overstate the true rate of return!
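
The comparison between the geometric and arithmetic means can be checked with a short Python sketch. It is an illustration only: the function name `geometric_mean_return` is hypothetical, and the inputs are the four Atkins Construction returns from the example.

```python
import math


def geometric_mean_return(percent_returns):
    """Average compound rate, in percent, for a list of percent returns."""
    # Convert each percent return to a growth factor: 30% -> 1.30, -40% -> 0.60.
    factors = [1 + r / 100 for r in percent_returns]
    if any(f <= 0 for f in factors):
        raise ValueError("every growth factor must be positive")
    # The nth root of the product of the n growth factors.
    gm_factor = math.prod(factors) ** (1 / len(factors))
    # Subtract the original 1.0 and express the result as a percent.
    return (gm_factor - 1) * 100


returns = [30, 20, -40, 200]  # four successive years
gm = geometric_mean_return(returns)
am = sum(returns) / len(returns)

print(f"Geometric mean return:  {gm:.1f}%")
print(f"Arithmetic mean return: {am:.1f}%")
```

The sketch should reproduce the 29.4% and 52.5% figures above. Calling the same function with the salary increases of 5 and 15 should give roughly 9.886.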

A second application of the geometric mean is to find an average percentage change over a period of time. For example, if you earned $45,000 in 2000 and $100,000 in 2012, what is your annual rate of increase over the period? It is 6.88%. The rate of increase is determined from the following formula.

\[ GM = \sqrt[n]{\frac{\text{Value at end of period}}{\text{Value at start of period}}} - 1 \qquad (3\text{–}5) \]

In the above box, n is the number of periods. An example will show the details of finding the average annual percent increase.

E X A M P L E

During the decade of the 1990s, and into the 2000s, Las Vegas, Nevada, was the fastest-growing city in the United States. The population increased from 258,295 in 1990 to 584,539 in 2011. This is an increase of 326,244 people, or a 126.3% increase over the period. The population has more than doubled. What is the average annual percent increase?

S O L U T I O N

There are 21 years between 1990 and 2011, so n = 21. Then formula (3–5) for the geometric mean as applied to this problem is:

\[ GM = \sqrt[21]{\frac{584{,}539}{258{,}295}} - 1 = 1.0397 - 1 = 0.0397 \]

To summarize, the steps to compute the geometric mean are:

1. Divide the value at the end of the period by the value at the beginning of the period.

2. Find the nth root of the ratio, where n is the number of periods.

3. Subtract one.

The value of .0397 indicates that the average annual growth over the period was 3.97%. To put it another way, the population of Las Vegas increased at a rate of 3.97% per year from 1990 to 2011.
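
The three steps above translate directly into code. The following Python sketch is illustrative only; the function name `average_annual_growth` is hypothetical, and the inputs are the Las Vegas population figures from the example.

```python
def average_annual_growth(start_value, end_value, periods):
    """Geometric mean rate of change per period, returned as a decimal."""
    if start_value <= 0 or end_value <= 0 or periods <= 0:
        raise ValueError("values and number of periods must be positive")
    ratio = end_value / start_value   # step 1: end value over beginning value
    root = ratio ** (1 / periods)     # step 2: nth root of the ratio
    return root - 1                   # step 3: subtract one


las_vegas = average_annual_growth(258_295, 584_539, 21)
print(f"Las Vegas, 1990 to 2011: {las_vegas:.2%} per year")
```

The same function applied to the earlier salary illustration, `average_annual_growth(45_000, 100_000, 12)`, should return about 0.0688, or 6.88% per year.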

3-5

1. The percent increases in sales for the last 4 years at Combs Cosmetics were 4.91, 5.75, 8.12, and 21.60.

(a) Find the geometric mean percent increase.

(b) Find the arithmetic mean percent increase.

(c) Is the arithmetic mean equal to or greater than the geometric mean?

2. Production of Cablos trucks increased from 23,000 units in 1993 to 120,520 in 2013. Find the geometric mean annual percent increase.

E X E R C I S E S

27. Compute the geometric mean of the following percent increases: 8, 12, 14, 26, and 5.

28. Compute the geometric mean of the following percent increases: 2, 8, 6, 4, 10, 6, 8, and 4.

29. Listed below is the percent increase in sales for the MG Corporation over the last 5 years. Determine the geometric mean percent increase in sales over the period.

30. In 1996, a total of 14,968,000 taxpayers in the United States filed their individual tax returns electronically. By the year 2010, the number increased to 99,000,000. What is the geometric mean annual increase for the period?

31. The Consumer Price Index is reported monthly by the U.S. Bureau of Labor Statistics. It reports the change in prices for a market basket of goods from one period to another. The index for 2000 was 172.2. By 2012, it increased to 229.6. What was the geometric mean annual increase for the period?

32. JetBlue Airways is an American low-cost airline headquartered in New York City. Its main base is John F. Kennedy International Airport. JetBlue’s revenue in 2002 was $635.2 million. By 2012, revenue had increased to $3,788.0 million. What was the geometric mean annual increase for the period?

33. In 1985, there were 340,213 cell phone subscribers in the United States. By 2012, the number of cell phone subscribers increased to 327,577,529. What is the geometric mean annual increase for the period?

34. The information below shows the cost for a year of college in public and private colleges in 2002–03 and 2012–13. What is the geometric mean annual increase for the period for the two types of colleges? Compare the rates of increase.

LO3-4

Compute and interpret the range, variance, and standard deviation.

WHY STUDY DISPERSION? A measure of location, such as the mean, median, or mode, only describes the center of the data. It is valuable from that standpoint, but it does not tell us anything about the spread of the data. For example, if your nature guide told you that the river ahead averaged 3 feet in depth, would you want to wade across on foot without additional information? Probably not. You would want to know something about the variation in the depth. Is the maximum depth of the river 3.25 feet and the minimum 2.75 feet? If that is the case, you would probably agree to cross. What if you learned the river depth ranged from 0.50 foot to 5.5 feet? Your decision would probably be not to cross. Before making a decision about crossing the river, you want information on both the typical depth and the dispersion in the depth of the river.

A small value for a measure of dispersion indicates that the data are clustered closely, say, around the arithmetic mean. The mean is therefore considered representative of the data. Conversely, a large measure of dispersion indicates that the mean is not reliable. Refer to Chart 3–5. The 100 employees of Hammond Iron Works Inc., a steel fabricating company, are organized into a histogram based on the number of years of employment with the company. The mean is 4.9 years, but the spread of the data is from 6 months to 16.8 years. The mean of 4.9 years is not very representative of all the employees.

The U.S. Postal Service has tried to become more “user friendly” in the last several years. A recent survey showed that customers were interested in more consistency in the time it takes to make a delivery. Under the old conditions, a local letter might take only one day to deliver, or it might take several. “Just tell me how many days ahead I need to mail the birthday card to Mom so it gets there on her birthday, not early, not late,” was a common complaint. The level of consistency is measured by the standard deviation of the delivery times.

CHART 3–5 Histogram of Years of Employment at Hammond Iron Works Inc.

A second reason for studying the dispersion in a set of data is to compare the spread in two or more distributions. Suppose, for example, that the new Vision Quest LCD computer monitor is assembled in Baton Rouge and also in Tucson. The arithmetic mean hourly output in both the Baton Rouge plant and the Tucson plant is 50. Based on the two means, you might conclude that the distributions of the hourly outputs are identical. Production records for 9 hours at the two plants, however, reveal that this conclusion is not correct (see Chart 3–6). Baton Rouge production varies from 48 to 52 assemblies per hour. Production at the Tucson plant is more erratic, ranging from 40 to 60 per hour. Therefore, the hourly output for Baton Rouge is clustered near the mean of 50; the hourly output for Tucson is more dispersed.

CHART 3–6 Hourly Production of Computer Monitors at the Baton Rouge and Tucson Plants

We will consider several measures of dispersion. The range is based on the maximum and minimum values in the data set; that is, only two values are considered. The variance and the standard deviation use all the values in a data set and are based on deviations from the arithmetic mean.

Range The simplest measure of dispersion is the range. It is the difference between the maximum and minimum values in a data set. In the form of an equation:

\[ \text{Range} = \text{Maximum value} - \text{Minimum value} \]

The range is widely used in statistical process control (SPC) applications because it is very easy to calculate and understand.

E X A M P L E

Refer to Chart 3–6 above. Find the range in the number of computer monitors produced per hour for the Baton Rouge and the Tucson plants. Interpret the two ranges.

S O L U T I O N

The range of the hourly production of computer monitors at the Baton Rouge plant is 4, found by the difference between the maximum hourly production of 52 and the minimum of 48. The range in the hourly production for the Tucson plant is 20 computer monitors, found by 60 − 40. We therefore conclude that (1) there is less dispersion in the hourly production in the Baton Rouge plant than in the Tucson plant because the range of 4 computer monitors is less than a range of 20 computer monitors and (2) the production is clustered more closely around the mean of 50 at the Baton Rouge plant than at the Tucson plant (because a range of 4 is less than a range of 20). Thus, the mean production in the Baton Rouge plant (50 computer monitors) is a more representative measure of location than the mean of 50 computer monitors for the Tucson plant.

Variance A defect of the range is that it is based on only two values, the maximum and the minimum; it does not take into consideration all of the values. The variance does. It measures the mean amount by which the values in a population, or sample, vary from their mean. In terms of a definition:

VARIANCE The arithmetic mean of the squared deviations from the mean.

The following example illustrates how the variance is used to measure dispersion.

E X A M P L E

The chart below shows the number of cappuccinos sold at the Starbucks in the Orange County airport and the Ontario, California, airport between 4 and 5 p.m. for a sample of 5 days last month.

Determine the mean, median, range, and variance for each location. Comment on the similarities and differences in these measures.

S O L U T I O N

The mean, median, and range for each of the airport locations are reported below as part of an Excel spreadsheet.

Notice that all three of the measures are exactly the same. Does this indicate that there is no difference in the two sets of data? We get a clearer picture if we calculate the variance. First, for Orange County:

The variance is 400. That is, the average squared deviation from the mean is 400.

The following shows the detail of determining the variance for the number of cappuccinos sold at the Ontario Airport.

So the mean, median, and range of the cappuccinos sold are the same at the two airports, but the variances are different. The variance at Orange County is 400, but it is 370 at Ontario.

Let’s interpret and compare the results of our measures for the two Starbucks airport locations. The mean and median of the two locations are exactly the same, 50 cappuccinos sold. These measures of location suggest the two distributions are the same. The range for both locations is also the same, 60. However, recall that the range provides limited information about the dispersion because it is based on only two of the observations.

The variances are not the same for the two airports. The variance is based on the differences between each observation and the arithmetic mean. It shows the closeness or clustering of the data relative to the mean or center of the distribution. Compare the variance for Orange County of 400 to the variance for Ontario of 370. Based on the variance, we conclude that the dispersion for the sales distribution of the Ontario Starbucks is more concentrated—that is, nearer the mean of 50—than for the Orange County location.

The variance has an important advantage over the range. It uses all the values in the computation. Recall that the range uses only the highest and the lowest values.
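
The comparison of the two airports can be sketched in Python. The sales table did not survive in this copy, so the daily values below are an assumption: they are illustrative figures chosen to match the reported mean and median of 50, range of 60, and variances of 400 and 370. The helper name `summarize` is also hypothetical. Following the definition above, the variance here divides by the number of observations.

```python
import statistics

# ASSUMED data: the original table is missing from this copy. These values are
# illustrative and were chosen to match the statistics reported in the text.
orange_county = [20, 40, 50, 60, 80]
ontario = [20, 45, 50, 55, 80]


def summarize(sales):
    """Return the mean, median, range, and variance of a list of counts."""
    return {
        "mean": statistics.mean(sales),
        "median": statistics.median(sales),
        "range": max(sales) - min(sales),
        # pvariance divides the sum of squared deviations by the number of values.
        "variance": statistics.pvariance(sales),
    }


for name, sales in [("Orange County", orange_county), ("Ontario", ontario)]:
    print(name, summarize(sales))
```

Under these assumed values, the first three measures agree for both locations and only the variance separates them, which is the point of the example.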

3-6

The weights of containers being shipped to Ireland are (in thousands of pounds):

(a) What is the range of the weights?

(b) Compute the arithmetic mean weight.

(c) Compute the variance of the weights.

E X E R C I S E S

For DATA FILE, please visit www.mhhe.com/lind16e

For Exercises 35–38, calculate the (a) range, (b) arithmetic mean, (c) variance, and (d) interpret the statistics.

35. There were five customer service representatives on duty at the Electronic Super Store during last weekend’s sale. The numbers of HDTVs these representatives sold are 5, 8, 4, 10, and 3.

36. The Department of Statistics at Western State University offers eight sections of basic statistics. Following are the numbers of students enrolled in these sections: 34, 46, 52, 29, 41, 38, 36, and 28.

37. Dave’s Automatic Door installs automatic garage door openers. The following list indicates the number of minutes needed to install 10 door openers: 28, 32, 24, 46, 44, 40, 54, 38, 32, and 42.

38. All eight companies in the aerospace industry were surveyed as to their return on investment last year. The results are (in percent) 10.6, 12.6, 14.8, 18.2, 12.0, 14.8, 12.2, and 15.6.

39. Ten young adults living in California rated the taste of a newly developed sushi pizza topped with tuna, rice, and kelp on a scale of 1 to 50, with 1 indicating they did not like the taste and 50 that they did. The ratings were:

In a parallel study, 10 young adults in Iowa rated the taste of the same pizza. The ratings were:

As a market researcher, compare the potential markets for sushi pizza.

40. The personnel files of all eight employees at the Pawnee location of Acme Carpet Cleaners Inc. revealed that during the last 6-month period they lost the following number of days due to illness:

All eight employees during the same period at the Chickpee location of Acme Carpets revealed they lost the following number of days due to illness:

As the director of human relations, compare the two locations. What would you recommend?

Population Variance In the previous example, we developed the concept of variance as a measure of dispersion. Similar to the mean, we can calculate the variance of a population or the variance of a sample. The formula to compute the population variance is:

\[ \sigma^2 = \frac{\sum (x - \mu)^2}{N} \qquad (3\text{–}7) \]

where:

σ² is the population variance (σ is the lowercase Greek letter sigma). It is read as “sigma squared.”

x is the value of a particular observation in the population.

μ is the arithmetic mean of the population.

N is the number of observations in the population.

The process for computing the variance is implied by the formula.

1. Begin by finding the mean.

2. Find the difference between each observation and the mean, and square that difference.

3. Sum all the squared differences.

4. Divide the sum of the squared differences by the number of items in the population.

So the population variance is the mean of the squared difference between each value and the mean. For populations whose values are near the mean, the variance will be small. For populations whose values are dispersed from the mean, the population variance will be large.

The variance overcomes the weakness of the range by using all the values in the population, whereas the range uses only the maximum and minimum values. We overcome the issue where Σ(x − μ) = 0 by squaring the differences. Squaring the differences will always result in nonnegative values. The following is another example that illustrates the calculation and interpretation of the variance.

E X A M P L E

The number of traffic citations issued last year by month in Beaufort County, South Carolina, is reported below.

Determine the population variance.

S O L U T I O N

Because we are studying all the citations for a year, the data comprise a population. To determine the population variance, we use formula (3–7). The table below details the calculations.

1. We begin by determining the arithmetic mean of the population. The total number of citations issued for the year is 348, so the mean number issued per month is 29.

2. Next we find the difference between each observation and the mean. This is shown in the third column of the table. Recall that earlier in the chapter (page 54) we indicated that the sum of the differences between each value and the mean is 0. In this example, the sum of the differences between the mean and the number of citations each month is 0.

3. The next step is to square the difference for each month. That is shown in the fourth column of the table. All the squared differences will be positive. Note that squaring a negative value, or multiplying a negative value by itself, always results in a positive value.

4. The squared differences are totaled. The total of the fourth column is 1,488. That is the term Σ(x − μ)².

5. Finally, we divide the sum of the squared differences by N, the number of observations in the population.

\[ \sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{1{,}488}{12} = 124 \]

So, the population variance for the number of citations is 124.
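
The five steps can be followed line by line in a short Python sketch. The monthly table is missing from this copy, so the counts below are an assumption: illustrative values chosen so that the total is 348 and the squared differences sum to 1,488, as the solution reports. The function name `population_variance` is hypothetical.

```python
# ASSUMED monthly counts (January through December): the original table is
# missing from this copy. These values are illustrative and were chosen to
# match the totals reported in the solution.
citations = [19, 17, 22, 18, 28, 34, 45, 39, 38, 44, 34, 10]


def population_variance(values):
    """Population variance, following the five steps in the solution."""
    n = len(values)
    mean = sum(values) / n                    # step 1: arithmetic mean
    deviations = [x - mean for x in values]   # step 2: value minus mean
    squared = [d ** 2 for d in deviations]    # step 3: square each difference
    total = sum(squared)                      # step 4: total the squares
    return total / n                          # step 5: divide by N


print("Total citations:", sum(citations))
print("Population variance:", population_variance(citations))
```

With the assumed counts, the total is 348 and the variance is 124, in agreement with the solution.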

Like the range, the variance can be used to compare the dispersion in two or more sets of observations. For example, the variance for the number of citations issued in Beaufort County was just computed to be 124. If the variance in the number of citations issued in Marlboro County, South Carolina, is 342.9, we conclude that (1) there is less dispersion in the distribution of the number of citations issued in Beaufort County than in Marlboro County (because 124 is less than 342.9) and (2) the number of citations in Beaufort County is more closely clustered around the mean of 29 than for the number of citations issued in Marlboro County. Thus the mean number of citations issued in Beaufort County is a more representative measure of location than the mean number of citations in Marlboro County.

Population Standard Deviation

When we compute the variance, it is important to understand the unit of measure and what happens when the differences in the numerator are squared. That is, in the previous example, the number of monthly citations is the variable. When we calculate the variance, the unit of measure for the variance is citations squared. Using “squared citations” as a unit of measure is cumbersome.

There is a way out of this difficulty. By taking the square root of the population variance, we can transform it to the same unit of measurement used for the original data. The square root of 124 citations squared is 11.14 citations. The units are now simply citations. The square root of the population variance is the population standard deviation.
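
As a quick check of the square root step, here is a minimal Python sketch. It uses only the variance of 124 reported in the citations example; the variable names are illustrative.

```python
import math

variance = 124                 # unit: citations squared
std_dev = math.sqrt(variance)  # unit: citations, the same as the original data

print(f"Population standard deviation: {std_dev:.2f} citations")
```

The result rounds to 11.14 citations, matching the value above.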

3-7

The Philadelphia office of Price Waterhouse Coopers LLP hired five accounting trainees this year. Their monthly starting salaries were $3,536; $3,173; $3,448; $3,121; and $3,622.

(a) Compute the population mean.

(b) Compute the population variance.

(c) Compute the population standard deviation.

(d) The Pittsburgh office hired six trainees. Their mean monthly salary was $3,550, and the standard deviation was $250. Compare the two groups.

E X E R C I S E S

41. Consider these five values a population: 8, 3, 7, 3, and 4.

a. Determine the mean of the population.

b. Determine the variance.

42. Consider these six values a population: 13, 3, 8, 10, 8, and 6.

a. Determine the mean of the population.

b. Determine the variance.

43. The annual report of Dennis Industries cited these primary earnings per common share for the past 5 years: $2.68, $1.03, $2.26, $4.30, and $3.58. If we assume these are population values, what is:

a. The arithmetic mean primary earnings per share of common stock?

b. The variance?

44. Referring to Exercise 43, the annual report of Dennis Industries also gave these returns on stockholder equity for the same 5-year period (in percent): 13.2, 5.0, 10.2, 17.5, and 12.9.

a. What is the arithmetic mean return?

b. What is the variance?

45. Plywood Inc. reported these returns on stockholder equity for the past 5 years: 4.3, 4.9, 7.2, 6.7, and 11.6. Consider these as population values.

a. Compute the range, the arithmetic mean, the variance, and the standard deviation.

b. Compare the return on stockholder equity for Plywood Inc. with that for Dennis Industries cited in Exercise 44.

46. The annual incomes of the five vice presidents of TMV Industries are $125,000; $128,000; $122,000; $133,000; and $140,000. Consider this a population.

a. What is the range?

b. What is the arithmetic mean income?

c. What is the population variance? The standard deviation?

d. The annual incomes of officers of another firm similar to TMV Industries were also studied. The mean was $129,000 and the standard deviation $8,612. Compare the means and dispersions in the two firms.

Sample Variance and Standard Deviation

The formula for the population mean is \(\mu = \sum x / N\). We just changed the symbols for the sample mean; that is, \(\bar{x} = \sum x / n\). Unfortunately, the conversion from the population variance to the sample variance is not as direct. It requires a change in the denominator. Instead of substituting n (number in the sample) for N (number in the population), the denominator is n – 1. Thus the formula for the sample variance is:

\[ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \qquad (3\text{–}9) \]

where:

s2 is the sample variance.

x is the value of each observation in the sample.

is the mean of the sample.

n is the number of observations in the sample.

Why is this change made in the denominator? Although the use of n is logical since is used to estimate μ, it

tends to underestimate the population variance, σ2. The use of (n – 1) in the denominator provides the

appropriate correction for this tendency. Because the primary use of sample statistics like s2 is to estimate

population parameters like σ2, (n – 1) is preferred to n in defining the sample variance. We will also use this convention when computing the sample standard deviation.
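One way to see the effect of the n − 1 denominator is to list every possible sample from a small population and average the resulting variances. The sketch below is only an illustration: the four-value population and the choice of sampling with replacement are assumptions, not data from the text.

```python
from itertools import product
from statistics import mean, pvariance, variance

# Hypothetical population, chosen only to keep the arithmetic small.
population = [2, 4, 6, 8]
sigma_squared = pvariance(population)  # population variance, denominator N

# Every possible sample of size 2, drawn with replacement (16 samples).
samples = list(product(population, repeat=2))

# Average of the sample variances when dividing by n - 1 ...
avg_with_n_minus_1 = mean(variance(s) for s in samples)
# ... and when dividing by n (pvariance treats each sample as a population).
avg_with_n = mean(pvariance(s) for s in samples)

print(sigma_squared, avg_with_n_minus_1, avg_with_n)
```

In this small case the n − 1 version averages out to the population variance of 5, while the version that divides by n averages only 2.5, which is the underestimate described above.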

EXAMPLE

The hourly wages for a sample of part-time employees at Home Depot are $12, $20, $16, $18, and $19. What is the sample variance?

SOLUTION

The sample variance is computed by using formula (3–9).
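The arithmetic behind formula (3–9) can be sketched in a few lines of Python. The wage values come from the example; the variable names are illustrative only.

```python
from statistics import mean, variance

wages = [12, 20, 16, 18, 19]  # hourly wages in dollars, from the example

x_bar = mean(wages)                                # 85 / 5 = 17
squared_devs = [(x - x_bar) ** 2 for x in wages]   # 25, 9, 1, 1, 4
s_squared = sum(squared_devs) / (len(wages) - 1)   # 40 / 4 = 10

# statistics.variance uses the same n - 1 denominator.
assert s_squared == variance(wages)
print(s_squared)
```

The result is 10, in units of dollars squared.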

The sample standard deviation is used as an estimator of the population standard deviation. As noted previously, the population standard deviation is the square root of the population variance. Likewise, the sample standard deviation is the square root of the sample variance. The sample standard deviation is determined by:

\[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \qquad \text{(3–10)} \]

EXAMPLE

The sample variance in the previous example involving hourly wages was computed to be 10. What is the sample standard deviation?

SOLUTION

The sample standard deviation is $3.16, found by \(\sqrt{10}\). Note again that the sample variance is in terms of dollars squared, but taking the square root of 10 gives us $3.16, which is in the same units (dollars) as the original data.
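As a quick check, the same value can be obtained either from the variance or directly from the data. This is a minimal sketch using Python's standard library; the variable names are illustrative.

```python
import math
from statistics import stdev

wages = [12, 20, 16, 18, 19]   # same sample as the variance example
s_squared = 10                 # sample variance in dollars squared (from the text)

s = math.sqrt(s_squared)       # back in dollars
print(round(s, 2))             # 3.16
print(round(stdev(wages), 2))  # same value, computed directly from the data
```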

Software Solution

On page 63, we used Excel to determine the mean, median, and mode of profit for the Applewood Auto Group data. You will also note that it lists the sample variance and sample standard deviation. Excel, like most other statistical software, assumes the data are from a sample.

3-8

The years of service for a sample of seven employees at a State Farm Insurance claims office in Cleveland, Ohio, are 4, 2, 5, 4, 5, 2, and 6. What is the sample variance? Compute the sample standard deviation.

EXERCISES

For DATA FILE, please visit www.mhhe.com/lind16e

For Exercises 47–52, do the following:

a. Compute the sample variance.

b. Determine the sample standard deviation.

47. Consider these values a sample: 7, 2, 6, 2, and 3.

48. The following five values are a sample: 11, 6, 10, 6, and 7.

49. Dave’s Automatic Door, referred to in Exercise 37, installs automatic garage door openers. Based on a sample, following are the times, in minutes, required to install 10 door openers: 28, 32, 24, 46, 44, 40, 54, 38, 32, and 42.

50. The sample of eight companies in the aerospace industry, referred to in Exercise 38, was surveyed as to their return on investment last year. The results are 10.6, 12.6, 14.8, 18.2, 12.0, 14.8, 12.2, and 15.6.

51. The Houston, Texas, Motel Owner Association conducted a survey regarding weekday motel rates in the area. Listed below is the room rate for business-class guests for a sample of 10 motels.

52. A consumer watchdog organization is concerned about credit card debt. A survey of 10 young adults with credit card debt of more than $2,000 showed they paid an average of just over $100 per month against their balances. Listed below are the amounts each young adult paid last month.

LO3-5

Explain and apply Chebyshev’s theorem and the Empirical Rule.

INTERPRETATION AND USES OF THE STANDARD DEVIATION

The standard deviation is commonly used as a measure to compare the spread in two or more sets of observations. For example, the standard deviation of the biweekly amounts invested in the Dupree Paint Company profit-sharing plan is computed to be $7.51. Suppose these employees are located in Georgia. If the standard deviation for a group of employees in Texas is $10.47, and the means are about the same, it indicates that the amounts invested by the Georgia employees are not dispersed as much as those in Texas (because $7.51 < $10.47).

Most colleges report the “average class size.” This information can be misleading because average class size can be found in several ways. If we find the number of students in each class at a particular university, the result is the mean number of students per class. If we compile a list of the class sizes for each student and find the mean class size, we might find the mean to be quite different. One school found the mean number of students in each of its 747 classes to be 40. But when it found the mean from a list of the class sizes of each student, it was 147. Why the disparity? Because there are few students in the small classes and a larger number of students in the larger classes, which has the effect of increasing the mean class size when it is calculated this way. A school could reduce this mean class size for each student by reducing the number of students in each class. That is, cut out the large freshman lecture classes.
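The two ways of averaging class size can be sketched with a small made-up example. The class sizes below are hypothetical and are not the school's data; they only show why the per-student mean is pulled toward the large classes.

```python
# Hypothetical class sizes: three small seminars and one large lecture.
class_sizes = [10, 10, 10, 300]

# Mean per class: every class counts once.
mean_per_class = sum(class_sizes) / len(class_sizes)

# Mean per student: each student reports the size of his or her own class,
# so a class of size c appears c times in the list.
sizes_reported = [c for c in class_sizes for _ in range(c)]
mean_per_student = sum(sizes_reported) / len(sizes_reported)

print(mean_per_class)              # 82.5
print(round(mean_per_student, 1))  # 273.6
```

Because 300 of the 330 students sit in the large lecture, the student-weighted mean is far above the per-class mean.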

Chebyshev’s Theorem

We have stressed that a small standard deviation for a set of values indicates that these values are located close to the mean. Conversely, a large standard deviation reveals that the observations are widely scattered about the mean. The Russian mathematician P. L. Chebyshev (1821–1894) developed a theorem that allows us to determine the minimum proportion of the values that lie within a specified number of standard deviations of the mean. For example, according to Chebyshev’s theorem, at least three out of every four, or 75%, of the values must lie between the mean plus two standard deviations and the mean minus two standard deviations. This relationship applies regardless of the shape of the distribution. Further, at least eight of nine values, or 88.9%, will lie between plus three standard deviations and minus three standard deviations of the mean. At least 24 of 25 values, or 96%, will lie between plus and minus five standard deviations of the mean.

Chebyshev’s theorem states:

CHEBYSHEV’S THEOREM For any set of observations (sample or population), the proportion of the values that lie within k standard deviations of the mean is at least \(1 - 1/k^2\), where k is any value greater than 1.

EXAMPLE

The arithmetic mean biweekly amount contributed by the Dupree Paint employees to the company’s profit-sharing plan is $51.54, and the standard deviation is $7.51. At least what percent of the contributions lie within plus 3.5 standard deviations and minus 3.5 standard deviations of the mean?

SOLUTION

About 92%, found by

\[ 1 - \frac{1}{k^2} = 1 - \frac{1}{(3.5)^2} = 1 - \frac{1}{12.25} \approx 0.92 \]
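Chebyshev's bound is simple enough to wrap in a small helper. The function name below is hypothetical; the sketch only evaluates 1 − 1/k² for the values of k mentioned in this section.

```python
def chebyshev_min_proportion(k: float) -> float:
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2

for k in (2, 3, 3.5, 5):
    print(k, round(chebyshev_min_proportion(k), 3))
```

For k = 2, 3, and 5 this reproduces the 75%, 88.9%, and 96% figures quoted earlier, and for k = 3.5 it gives about 0.918, which rounds to the 92% in the solution.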

The Empirical Rule

Chebyshev’s theorem applies to any set of values; that is, the distribution of values can have any shape. However, for a symmetrical, bell-shaped distribution such as the one in Chart 3–7, we can be more precise in explaining the dispersion about the mean. These relationships involving the standard deviation and the mean are described by the Empirical Rule, sometimes called the Normal Rule.

EMPIRICAL RULE For a symmetrical, bell-shaped frequency distribution, approximately 68% of the observations will lie within plus and minus one standard deviation of the mean; about 95% of the observations will lie within plus and minus two standard deviations of the mean; and practically all (99.7%) will lie within plus and minus three standard deviations of the mean.

These relationships are portrayed graphically in Chart 3–7 for a bell-shaped distribution with a mean of 100 and a standard deviation of 10.

CHART 3–7 A Symmetrical, Bell-Shaped Curve Showing the Relationships between the Standard Deviation and the Percentage of Observations

Applying the Empirical Rule, if a distribution is symmetrical and bell-shaped, practically all of the observations lie between the mean plus and minus three standard deviations. Thus, if \(\bar{x}\) = 100 and s = 10, practically all the observations lie between 100 − 3(10) and 100 + 3(10), or 70 and 130. The estimated range is therefore 60, found by 130 − 70.

Conversely, if we know that the range is 60 and the distribution is bell-shaped, we can approximate the standard deviation by dividing the range by 6. For this illustration: range ÷ 6 = 60 ÷ 6 = 10, the standard deviation.

EXAMPLE

A sample of the rental rates at University Park Apartments approximates a symmetrical, bell-shaped distribution. The sample mean is $500; the standard deviation is $20. Using the Empirical Rule, answer these questions:

1. About 68% of the monthly rentals are between what two amounts?

2. About 95% of the monthly rentals are between what two amounts?

3. Almost all of the monthly rentals are between what two amounts?

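SOLUTION

1. About 68% are between $480 and $520, found by \(\bar{x} \pm 1s = \$500 \pm 1(\$20)\).

2. About 95% are between $460 and $540, found by \(\bar{x} \pm 2s = \$500 \pm 2(\$20)\).

3. Almost all (99.7%) are between $440 and $560, found by \(\bar{x} \pm 3s = \$500 \pm 3(\$20)\).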
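The three intervals follow the same pattern, so they can be generated together. The helper below is a hypothetical sketch, and it assumes the distribution really is symmetrical and bell-shaped.

```python
def empirical_rule_intervals(mean: float, sd: float) -> dict:
    """Approximate intervals for a symmetrical, bell-shaped distribution."""
    coverage = {1: "68%", 2: "95%", 3: "99.7%"}
    return {pct: (mean - k * sd, mean + k * sd) for k, pct in coverage.items()}

# Rental example: mean $500, standard deviation $20.
rentals = empirical_rule_intervals(500, 20)
for pct, (low, high) in rentals.items():
    print(pct, low, high)
```

Dividing the width of the widest interval by 6, (560 − 440) ÷ 6, recovers the standard deviation of $20, which is the range-based approximation described earlier.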

3-9

The Pitney Pipe Company is one of several domestic manufacturers of PVC pipe. The quality control department sampled 600 10-foot lengths. At a point 1 foot from the end of the pipe, they measured the outside diameter. The mean was 14.0 inches and the standard deviation 0.1 inch.

(a) If the shape of the distribution is not known, at least what percent of the observations will be between 13.85 inches and 14.15 inches?

(b) If we assume that the distribution of diameters is symmetrical and bell-shaped, about 95% of the observations will be between what two values?

EXERCISES

53. According to Chebyshev’s theorem, at least what percent of any set of observations will be within 1.8 standard deviations of the mean?

54. The mean income of a group of sample observations is $500; the standard deviation is $40. According to Chebyshev’s theorem, at least what percent of the incomes will lie between $400 and $600?

55. The distribution of the weights of a sample of 1,400 cargo containers is symmetric and bell-shaped. According to the Empirical Rule, what percent of the weights will lie:

a. Between ?

b. Between ?

56. The following graph portrays the distribution of the number of Biggie-sized soft drinks sold at a nearby Wendy’s for the last 141 days. The mean number of drinks sold per day is 91.9 and the standard deviation is 4.67.

If we use the Empirical Rule, sales will be between what two values on 68% of the days? Sales will be between what two values on 95% of the days?

LO3-6

Compute the mean and standard deviation of grouped data.

THE MEAN AND STANDARD DEVIATION OF GROUPED DATA

In most instances, measures of location, such as the mean, and measures of dispersion, such as the standard deviation, are determined by using the individual values. Statistical software packages make it easy to calculate these values, even for large data sets. However, sometimes we are given only the frequency distribution and wish to estimate the mean or standard deviation. In the following discussion, we show how we can estimate the mean and standard deviation from data organized into a frequency distribution. We should stress that a mean or a standard deviation from grouped data is an estimate of the corresponding actual values.

Buster Posey of the San Francisco Giants had the highest batting average, at .336, during the 2012 Major League Baseball season. Tony Gwynn hit .394 in the strike-shortened season of 1994, and Ted Williams hit .406 in 1941. No one has hit over .400 since 1941. The mean batting average has remained constant at about .260 for more than 100 years, but the standard deviation declined from .049 to .031. This indicates less dispersion in the batting averages today and helps explain the lack of any .400 hitters in recent times.

Arithmetic Mean of Grouped Data

To approximate the arithmetic mean of data organized into a frequency distribution, we begin by assuming the observations in each class are represented by the midpoint of the class. The mean of a sample of data organized in a frequency distribution is computed by:

\[ \bar{x} = \frac{\sum fM}{n} \qquad \text{(3–11)} \]

where:

\(\bar{x}\) is the designation for the sample mean.

M is the midpoint of each class.

f is the frequency in each class.

fM is the frequency in each class times the midpoint of the class.

\(\sum fM\) is the sum of these products.

n is the total number of frequencies.

EXAMPLE

The computations for the arithmetic mean of data grouped into a frequency distribution will be shown based on the Applewood Auto Group profit data. Recall in Chapter 2, in Table 2–7 on page 30, we constructed a frequency distribution for the vehicle profit. The information is repeated below. Determine the arithmetic mean profit per vehicle.

SOLUTION

The mean vehicle profit can be estimated from data grouped into a frequency distribution. To find the estimated mean, assume the midpoint of each class is representative of the data values in that class. Recall that the midpoint of a class is halfway between the lower class limits of two consecutive classes. To find the midpoint of a particular class, we add the lower limits of two consecutive classes and divide by 2. Hence, the midpoint of the first class is $400, found by ($200 + $600)/2. We assume the value of $400 is representative of the eight values in that class. To put it another way, we assume the sum of the eight values in this class is $3,200, found by 8($400). We continue the process of multiplying the class midpoint by the class frequency for each class and then sum these products. The results are summarized in Table 3–1.

TABLE 3–1 Profit on 180 Vehicles Sold Last Month at Applewood Auto Group

Solving for the arithmetic mean using formula (3–11), we get:

\[ \bar{x} = \frac{\sum fM}{n} = \frac{\$333{,}200}{180} = \$1{,}851.11 \]

We conclude that the mean profit per vehicle is about $1,851.
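The grouped-mean calculation can be sketched in Python. Only the first two class frequencies (8 and 11), the midpoint rule, and the total of 180 vehicles are stated in the text above; the remaining frequencies in this sketch are assumed values chosen to be consistent with the reported mean, so treat them as illustrative rather than as the contents of Table 3–1.

```python
# Class lower limits and frequencies. Only the first two frequencies (8 and 11)
# and the total of 180 are stated in the text; the rest are ASSUMED values
# chosen to be consistent with the reported mean of about $1,851.11.
lower_limits = [200, 600, 1000, 1400, 1800, 2200, 2600, 3000]
class_width = 400
frequencies = [8, 11, 23, 38, 45, 32, 19, 4]

# Midpoint of each class: halfway between consecutive lower limits.
midpoints = [lo + class_width / 2 for lo in lower_limits]   # 400, 800, ...

n = sum(frequencies)
sum_fM = sum(f * m for f, m in zip(frequencies, midpoints))
grouped_mean = sum_fM / n

print(n, sum_fM, round(grouped_mean, 2))
```

With these inputs the sum of the products is 333,200 and the estimated mean is about $1,851.11, in line with the conclusion above.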

Standard Deviation of Grouped Data

To calculate the standard deviation of data grouped into a frequency distribution, we need to adjust formula (3–10) slightly. We weight each of the squared differences by the number of frequencies in each class. The formula is:

\[ s = \sqrt{\frac{\sum f(M - \bar{x})^2}{n - 1}} \qquad \text{(3–12)} \]

where:

s is the symbol for the sample standard deviation.

M is the midpoint of the class.

f is the class frequency.

n is the number of observations in the sample.

\(\bar{x}\) is the designation for the sample mean.

EXAMPLE

Refer to the frequency distribution for the Applewood Auto Group profit data reported in Table 3–1. Compute the standard deviation of the vehicle profits.

SOLUTION

Following the same practice used earlier for computing the mean of data grouped into a frequency distribution, f is the class frequency, M the class midpoint, and n the number of observations.

To find the standard deviation:

Step 1: Subtract the mean from the class midpoint. That is, find \((M - \bar{x})\): $400 − $1,851 = −$1,451 for the first class, $800 − $1,851 = −$1,051 for the second class, and so on.

Step 2: Square the difference between the class midpoint and the mean. For the first class, it would be ($400 − $1,851)² = 2,105,401; for the second class, ($800 − $1,851)² = 1,104,601; and so on.

Step 3: Multiply the squared difference between the class midpoint and the mean by the class frequency. For the first class, the value is 8($400 − $1,851)² = 16,843,208; for the second, 11($800 − $1,851)² = 12,150,611; and so on.

Step 4: Sum the \(f(M - \bar{x})^2\). The total is 76,169,920. To find the standard deviation, we insert these values in formula (3–12).

\[ s = \sqrt{\frac{\sum f(M - \bar{x})^2}{n - 1}} = \sqrt{\frac{76{,}169{,}920}{180 - 1}} = \$652.33 \]

The mean and the standard deviation calculated from the data grouped into a frequency distribution are usually close to the values calculated from raw data. The grouped data result in some loss of information. For the vehicle profit example, the mean profit reported in the Excel output on page 63 is $1,843.17 and the standard deviation is $643.63. The respective values estimated from data grouped into a frequency distribution are $1,851.11 and $652.33. The difference in the means is $7.94, or about 0.4%. The standard deviations differ by $8.70, or 1.4%. Based on the percentage difference, the estimates are very close to the actual values.
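The four steps can be sketched in Python as follows. As before, only the first two class frequencies and the total of 180 come from the text; the other frequencies are assumptions made for illustration. The code also carries the unrounded mean, so its intermediate sum may differ slightly from the hand-computed total.

```python
import math

# Midpoints and frequencies for the eight classes. Frequencies beyond the
# first two are ASSUMED, so intermediate totals may differ slightly from
# those reported in the text.
midpoints = [400, 800, 1200, 1600, 2000, 2400, 2800, 3200]
frequencies = [8, 11, 23, 38, 45, 32, 19, 4]

n = sum(frequencies)
x_bar = sum(f * m for f, m in zip(frequencies, midpoints)) / n

# Steps 1-4: deviation, square, weight by class frequency, then sum.
weighted_sq_devs = sum(f * (m - x_bar) ** 2 for f, m in zip(frequencies, midpoints))

# Formula (3-12): divide by n - 1 and take the square root.
grouped_sd = math.sqrt(weighted_sq_devs / (n - 1))

print(round(x_bar, 2), round(grouped_sd, 2))
```

Under these assumptions the sketch gives a mean of about $1,851.11 and a standard deviation of about $652.33, matching the grouped estimates quoted above.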

3-10

The net incomes of a sample of large importers of antiques were organized into the following table:

(a) What is the table called?

(b) Based on the distribution, what is the estimate of the arithmetic mean net income?


(c) Based on the distribution, what is the estimate of the standard deviation?

EXERCISES

57. When we compute the mean of a frequency distribution, why do we refer to this as an estimated mean?

58. Estimate the mean and the standard deviation of the following frequency distribution.

59. Estimate the mean and the standard deviation of the following frequency distribution.

60. SCCoast, an Internet provider in the Southeast, developed the following frequency distribution on the age of Internet users. Estimate the mean and the standard deviation.

61. The IRS was interested in the number of individual tax forms prepared by small accounting firms. The IRS randomly sampled 50 public accounting firms with 10 or fewer employees in the Dallas–Fort Worth area. The following frequency table reports the results of the study. Estimate the mean and the standard deviation.

62. Advertising expenses are a significant component of the cost of goods sold. Listed below is a frequency distribution showing the advertising expenditures for 60 manufacturing companies located in the Southwest. Estimate the mean and the standard deviation of advertising expenses.

ETHICS AND REPORTING RESULTS

In Chapter 1, we discussed the ethical and unbiased reporting of statistical results. While you are learning about how to organize, summarize, and interpret data using statistics, it is also important to understand statistics so that you can be an intelligent consumer of information.

In this chapter, we learned how to compute numerical descriptive statistics. Specifically, we showed how to compute and interpret measures of location for a data set: the mean, median, and mode. We also discussed the advantages and disadvantages for each statistic. For example, if a real estate developer tells a client that the average home in a particular subdivision sold for $150,000, we assume that $150,000 is a representative selling price for all the homes. But suppose that the client also asks what the median sales price is, and the median is $60,000. Why was the developer only reporting the mean price? This information is extremely important to a person’s decision making when buying a home. Knowing the advantages and disadvantages of the mean, median, and mode is important as we report statistics and as we use statistical information to make decisions.

We also learned how to compute measures of dispersion: range, variance, and standard deviation. Each of these statistics also has advantages and disadvantages. Remember that the range provides information about the overall spread of a distribution. However, it does not provide any information about how the data are clustered or concentrated around the center of the distribution. As we learn more about statistics, we need to remember that when we use statistics we must maintain an independent and principled point of view. Any statistical report requires objective and honest communication of the results.

CHAPTER SUMMARY

I. A measure of location is a value used to describe the center of a set of data.

A. The arithmetic mean is the most widely reported measure of location.

1. It is calculated by adding the values of the observations and dividing by the total number of observations.

a. The formula for a population mean of ungrouped or raw data is \(\mu = \sum x / N\)

b. The formula for the mean of a sample is \(\bar{x} = \sum x / n\)

c. The formula for the sample mean of data in a frequency distribution is \(\bar{x} = \sum fM / n\)

2. The major characteristics of the arithmetic mean are:

a. At least the interval scale of measurement is required.

b. All the data values are used in the calculation.

c. A set of data has only one mean. That is, it is unique.

d. The sum of the deviations from the mean equals 0.

B. The median is the value in the middle of a set of ordered data.

1. To find the median, sort the observations from minimum to maximum and identify the middle value.

2. The major characteristics of the median are:

a. At least the ordinal scale of measurement is required.

b. It is not influenced by extreme values.

c. Fifty percent of the observations are larger than the median.

d. It is unique to a set of data.

C. The mode is the value that occurs most often in a set of data.

1. The mode can be found for nominal-level data.

2. A set of data can have more than one mode.

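D. The weighted mean is found by multiplying each observation by its corresponding weight, summing the products, and dividing by the sum of the weights.

1. The formula for determining the weighted mean is \(\bar{x}_w = \dfrac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n}\)

E. The geometric mean is the nth root of the product of n positive values.

1. The formula for the geometric mean is \(GM = \sqrt[n]{(x_1)(x_2)\cdots(x_n)}\)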

2. The geometric mean is also used to find the rate of change from one period to another.

3. The geometric mean is always equal to or less than the arithmetic mean.

II. The dispersion is the variation or spread in a set of data.

A. The range is the difference between the maximum and minimum values in a set of data.

1. The formula for the range is: Range = Maximum value − Minimum value

2. The major characteristics of the range are:

a. Only two values are used in its calculation.


b. It is influenced by extreme values.

c. It is easy to compute and to understand.

B. The variance is the mean of the squared deviations from the arithmetic mean.

1. The formula for the population variance is \(\sigma^2 = \dfrac{\sum (x - \mu)^2}{N}\)

2. The formula for the sample variance is \(s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1}\)

3. The major characteristics of the variance are:

a. All observations are used in the calculation.

b. The units are somewhat difficult to work with; they are the original units squared.

C. The standard deviation is the square root of the variance.

1. The major characteristics of the standard deviation are:

a. It is in the same units as the original data.

b. It is the square root of the average squared distance from the mean.

c. It cannot be negative.

d. It is the most widely reported measure of dispersion.

2. The formula for the sample standard deviation is \(s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}\)

3. The formula for the standard deviation of grouped data is \(s = \sqrt{\dfrac{\sum f(M - \bar{x})^2}{n - 1}}\)

III. We use the standard deviation to describe a frequency distribution by applying Chebyshev’s theorem or the Empirical Rule.

A. Chebyshev’s theorem states that regardless of the shape of the distribution, at least \(1 - 1/k^2\) of the observations will be within k standard deviations of the mean, where k is greater than 1.

B. The Empirical Rule states that for a bell-shaped distribution about 68% of the values will be within one standard deviation of the mean, 95% within two, and virtually all within three.
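Several of the summary formulas can be tried out with Python's standard library. The sketch below uses hypothetical prices and quantities (not data from the chapter), and `weighted_mean` is an illustrative helper rather than a library function.

```python
from statistics import geometric_mean, mean

def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Hypothetical data for illustration only.
prices = [2.0, 4.0, 8.0]
units_sold = [5, 3, 2]

w_mean = weighted_mean(prices, units_sold)   # (10 + 12 + 16) / 10 = 3.8
a_mean = mean(prices)                        # arithmetic mean
g_mean = geometric_mean(prices)              # cube root of 2 * 4 * 8
data_range = max(prices) - min(prices)       # maximum minus minimum

print(w_mean, a_mean, g_mean, data_range)
```

For these values the geometric mean (about 4) is below the arithmetic mean (about 4.67), consistent with the property listed in the summary.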

PRONUNCIATION KEY

SYMBOL          MEANING                                                          PRONUNCIATION
μ               Population mean                                                  mu
Σ               Operation of adding                                              sigma
Σx              Adding a group of values                                         sigma x
\(\bar{x}\)     Sample mean                                                      x bar
\(\bar{x}_w\)   Weighted mean                                                    x bar sub w
GM              Geometric mean                                                   GM
ΣfM             Adding the product of the frequencies and the class midpoints   sigma f M
σ²              Population variance                                              sigma squared
σ               Population standard deviation                                    sigma

CHAPTER EXERCISES

For DATA FILE, please visit www.mhhe.com/lind16e

63. The accounting firm of Crawford and Associates has five senior partners. Yesterday the senior partners saw six, four, three, seven, and five clients, respectively.

a. Compute the mean and median number of clients seen by the partners.

b. Is the mean a sample mean or a population mean?

c. Verify that \(\sum (x - \mu) = 0\).

64. Owens Orchards sells apples in a large bag by weight. A sample of seven bags contained the following numbers of apples: 23, 19, 26, 17, 21, 24, 22.

a. Compute the mean and median number of apples in a bag.

b. Verify that \(\sum (x - \bar{x}) = 0\).

65. A sample of households that subscribe to United Bell Phone Company for landline phone service revealed the following number of calls received per household last week. Determine the mean and the median number of calls received.

66. The Citizens Banking Company is studying the number of times the ATM located in a Loblaws Supermarket at the foot of Market Street is used per day. Following are the number of times the machine was used daily over each of the last 30 days. Determine the mean number of times the machine was used per day.
