Statistics Problems - 10


LEARNING OBJECTIVES When you have completed this chapter, you will be able to:

LO2-1 Summarize qualitative variables with frequency and relative frequency tables.

LO2-2 Display a frequency table using a bar or pie chart.

LO2-3 Summarize quantitative variables with frequency and relative frequency distributions.

LO2-4 Display a frequency distribution using a histogram or frequency polygon.

MERRILL LYNCH recently completed a study of online investment portfolios for a sample of clients. For the 70 participants in the study, organize these data into a frequency distribution. (See Exercise 43 and LO2-3.)

CHAPTER 2 DESCRIBING DATA: FREQUENCY TABLES, FREQUENCY DISTRIBUTIONS, AND GRAPHIC PRESENTATION



INTRODUCTION

The United States automobile retailing industry is highly competitive. It is dominated by megadealerships that own and operate 50 or more franchises, employ over 10,000 people, and generate several billion dollars in annual sales. Many of the top dealerships are publicly owned with shares traded on the New York Stock Exchange or NASDAQ. In 2014, the largest megadealership was AutoNation (ticker symbol AN), followed by Penske Auto Group (PAG), Group 1 Automotive, Inc. (ticker symbol GPI), and the privately owned Van Tuyl Group.

These large corporations use statistics and analytics to summarize and analyze data and information to support their decisions. As an example, we will look at the Applewood Auto Group. It owns four dealerships and sells a wide range of vehicles. These include the popular Korean brands Kia and Hyundai, BMW and Volvo sedans and luxury SUVs, and a full line of Ford and Chevrolet cars and trucks.

Ms. Kathryn Ball is a member of the senior management team at Applewood Auto Group, which has its corporate offices adjacent to Kane Motors. She is responsible for tracking and analyzing vehicle sales and the profitability of those vehicles. Kathryn would like to summarize the profit earned on the vehicles sold with tables, charts, and graphs that she would review monthly. She wants to know the profit per vehicle sold, as well as the lowest and highest amount of profit. She is also interested in describing the demographics of the buyers. What are their ages? How many vehicles have they previously purchased from one of the Applewood dealerships? What type of vehicle did they purchase?

The Applewood Auto Group operates four dealerships:

• Tionesta Ford Lincoln sells Ford and Lincoln cars and trucks.
• Olean Automotive Inc. has the Nissan franchise as well as the General Motors brands of Chevrolet, Cadillac, and GMC Trucks.
• Sheffield Motors Inc. sells Buick, GMC trucks, Hyundai, and Kia.
• Kane Motors offers the Chrysler, Dodge, and Jeep line as well as BMW and Volvo.

Every month, Ms. Ball collects data from each of the four dealerships and enters them into an Excel spreadsheet. Last month the Applewood Auto Group sold 180 vehicles at the four dealerships. A copy of the first few observations appears to the left. The variables collected include:

• Age: the age of the buyer at the time of the purchase.
• Profit: the amount earned by the dealership on the sale of each vehicle.
• Location: the dealership where the vehicle was purchased.
• Vehicle type: SUV, sedan, compact, hybrid, or truck.
• Previous: the number of vehicles previously purchased at any of the four Applewood dealerships by the consumer.

The entire data set is available at the McGraw-Hill website (www.mhhe.com/lind17e) and in Appendix A.4 at the end of the text.


CONSTRUCTING FREQUENCY TABLES

Recall from Chapter 1 that techniques used to describe a set of data are called descriptive statistics. Descriptive statistics organize data to show the general pattern of the data, to identify where values tend to concentrate, and to expose extreme or unusual data values. The first technique we discuss is a frequency table.


FREQUENCY TABLE A grouping of qualitative data into mutually exclusive and collectively exhaustive classes showing the number of observations in each class.


In Chapter 1, we distinguished between qualitative and quantitative variables. To review, a qualitative variable is nonnumeric, that is, it can only be classified into distinct categories. Examples of qualitative data include political affiliation (Republican, Democrat, Independent, or other), state of birth (Alabama, . . . , Wyoming), and method of payment for a purchase at Barnes & Noble (cash, digital wallet, debit, or credit). On the other hand, quantitative variables are numerical in nature. Examples of quantitative data relating to college students include the price of their textbooks, their age, and the number of credit hours they are registered for this semester.

In the Applewood Auto Group data set, there are five variables for each vehicle sale: age of the buyer, amount of profit, dealer that made the sale, type of vehicle sold, and number of previous purchases by the buyer. The dealer and the type of vehicle are qualitative variables. The amount of profit, the age of the buyer, and the number of previous purchases are quantitative variables.

Suppose Ms. Ball wants to summarize last month’s sales by location. The first step is to sort the vehicles sold last month according to their location and then tally, or count, the number sold at each of the four locations: Tionesta, Olean, Sheffield, or Kane. The four locations are used to develop a frequency table with four mutually exclusive (distinctive) classes. Mutually exclusive classes means that a particular vehicle can be assigned to only one class. In addition, the frequency table must be collectively exhaustive. That is, every vehicle sold last month is accounted for in the table. If every vehicle is included in the frequency table, the table will be collectively exhaustive and the total number of vehicles will be 180. How do we obtain these counts? Excel provides a tool called a Pivot Table that will quickly and accurately establish the four classes and do the counting. The Excel results follow in Table 2–1. The table shows a total of 180 vehicles and, of the 180 vehicles, 52 were sold at Kane Motors.

TABLE 2–1 Frequency Table for Vehicles Sold Last Month at Applewood Auto Group by Location

Location     Number of Cars
Kane               52
Olean              40
Sheffield          45
Tionesta           43
Total             180
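The tallying step can also be sketched in a few lines of Python, as an alternative to an Excel Pivot Table. The raw sale records are not reproduced in this section, so the `sales` list below is a stand-in rebuilt from the counts in Table 2–1, and the variable names are illustrative assumptions.

```python
from collections import Counter

# Stand-in for the 180 raw sale records: one location label per vehicle sold.
# Rebuilt from the counts in Table 2-1, not read from the actual data file.
sales = ["Kane"] * 52 + ["Olean"] * 40 + ["Sheffield"] * 45 + ["Tionesta"] * 43

# Tally: each vehicle is counted in exactly one class (mutually exclusive).
frequency_table = Counter(sales)

# Collectively exhaustive: the class counts add up to every vehicle sold.
total = sum(frequency_table.values())

for location in sorted(frequency_table):
    print(f"{location:<10}{frequency_table[location]:>5}")
print(f"{'Total':<10}{total:>5}")
```

If the total printed here differed from the number of sale records, some vehicle would have been left out of every class, and the table would not be collectively exhaustive.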

Relative Class Frequencies

You can convert class frequencies to relative class frequencies to show the fraction of the total number of observations in each class. A relative frequency captures the relationship between a class frequency and the total number of observations. In the vehicle sales example, we may want to know the percentage of total cars sold at each of the four locations. To convert a frequency table to a relative frequency table, each of the class frequencies is divided by the total number of observations. Again, this is easily accomplished using Excel. The fraction of vehicles sold last month at the Kane location is 0.289, found by 52 divided by 180. The relative frequency for each location is shown in Table 2–2.

TABLE 2–2 Relative Frequency Table of Vehicles Sold by Location Last Month at Applewood Auto Group

Location     Number of Cars     Relative Frequency     Found by
Kane               52                 .289              52/180
Olean              40                 .222              40/180
Sheffield          45                 .250              45/180
Tionesta           43                 .239              43/180
Total             180                1.000
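As a minimal sketch of the conversion, the following Python divides each class frequency from Table 2–1 by the total. The dictionary and variable names are assumptions made for illustration.

```python
# Class frequencies from Table 2-1.
counts = {"Kane": 52, "Olean": 40, "Sheffield": 45, "Tionesta": 43}

total = sum(counts.values())  # total number of observations

# Relative frequency = class frequency / total number of observations.
relative = {location: n / total for location, n in counts.items()}

for location, share in relative.items():
    print(f"{location:<10}{counts[location]:>5}{share:>8.3f}")
print(f"{'Total':<10}{total:>5}{sum(relative.values()):>8.3f}")
```

The relative frequencies always sum to 1, which is a quick check that no class was dropped.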


GRAPHIC PRESENTATION OF QUALITATIVE DATA

The most common graphic form to present a qualitative variable is a bar chart. In most cases, the horizontal axis shows the variable of interest. The vertical axis shows the frequency or fraction of each of the possible outcomes. A distinguishing feature of a bar chart is there is distance or a gap between the bars. That is, because the variable of interest is qualitative, the bars are not adjacent to each other. Thus, a bar chart graphically describes a frequency table using a series of uniformly wide rectangles, where the height of each rectangle is the class frequency.


BAR CHART A graph that shows qualitative classes on the horizontal axis and the class frequencies on the vertical axis. The class frequencies are proportional to the heights of the bars.

PIE CHART A chart that shows the proportion or percentage that each class represents of the total number of frequencies.

We use the Applewood Auto Group data as an example (Chart 2–1). The variables of interest are the location where the vehicle was sold and the number of vehicles sold at each location. We label the horizontal axis with the four locations and scale the vertical axis with the number sold. The variable location is of nominal scale, so the order of the locations on the horizontal axis does not matter. In Chart 2–1, the locations are listed alphabetically. The locations could also be in order of decreasing or increasing frequencies.

The height of the bars, or rectangles, corresponds to the number of vehicles at each location. There were 52 vehicles sold last month at the Kane location, so the height of the Kane bar is 52; the height of the bar for the Olean location is 40. 

[Bar chart. Vertical axis: Number of Vehicles Sold (0 to 50). Horizontal axis: Location (Kane, Olean, Sheffield, Tionesta).]

CHART 2–1 Number of Vehicles Sold by Location
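The idea behind the chart can be sketched as a plain-text bar chart in Python. This is only an illustration, not how the chart in the text was produced: the bars are drawn horizontally for convenience, one `#` per vehicle, and the function name `text_bar_chart` is hypothetical.

```python
# Class frequencies from Table 2-1, listed alphabetically as in Chart 2-1.
counts = {"Kane": 52, "Olean": 40, "Sheffield": 45, "Tionesta": 43}

def text_bar_chart(counts):
    """Return one line per class; the bar length equals the class frequency."""
    lines = []
    for label, frequency in counts.items():
        lines.append(f"{label:<10}|{'#' * frequency} {frequency}")
    return lines

chart = text_bar_chart(counts)

# A blank line between bars mirrors the gap that separates qualitative classes.
print("\n\n".join(chart))
```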

Another useful type of chart for depicting qualitative information is a pie chart.

We explain the details of constructing a pie chart using the information in Table 2–3, which shows the frequency and percent of cars sold by the Applewood Auto Group for each vehicle type.


The first step to develop a pie chart is to mark the percentages 0, 5, 10, 15, and so on evenly around the circumference of a circle (see Chart 2–2). To plot the 40% of total sales represented by sedans, draw a line from the center of the circle to 0 and another line from the center of the circle to 40%. The area in this “slice” represents the number of sedans sold as a percentage of the total sales. Next, add the SUV’s percentage of total sales, 30%, to the sedan’s percentage of total sales, 40%. The result is 70%. Draw a line from the center of the circle to 70%, so the area between 40 and 70 shows the sales of SUVs as a percentage of total sales. Continuing, add the 15% of total sales for compact vehicles, which gives us a total of 85%. Draw a line from the center of the circle to 85, so the “slice” between 70% and 85% represents the number of compact vehicles sold as a percentage of the total sales. The remaining 10% for truck sales and 5% for hybrid sales are added to the chart using the same method.

TABLE 2–3 Vehicle Sales by Type at Applewood Auto Group

Vehicle Type     Number Sold     Percent Sold
Sedan                 72               40
SUV                   54               30
Compact               27               15
Truck                 18               10
Hybrid                 9                5
Total                180              100
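The running totals used to place each slice can be sketched in Python. The percentages come from Table 2–3. The central angle of each slice (its share of the 360 degrees in a circle) is an added detail that the construction above does not state explicitly, and the variable names are assumptions.

```python
from itertools import accumulate

# Percent of total sales by vehicle type, from Table 2-3.
percents = {"Sedan": 40, "SUV": 30, "Compact": 15, "Truck": 10, "Hybrid": 5}

# Running totals give the point on the circumference where each slice ends.
ends = list(accumulate(percents.values()))   # 40, 70, 85, 95, 100
starts = [0] + ends[:-1]                     # 0, 40, 70, 85, 95

slices = {}
for name, start, end in zip(percents, starts, ends):
    angle = (end - start) * 360 / 100        # central angle in degrees
    slices[name] = (start, end, angle)
    print(f"{name:<8} from {start:>3}% to {end:>3}%  angle {angle:5.1f} degrees")
```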

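[Pie chart. Slices: Sedan (0% to 40%), SUV (40% to 70%), Compact (70% to 85%), Truck (85% to 95%), Hybrid (95% to 100%). The circumference is also marked at 25%, 50%, and 75%.]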

CHART 2–2 Pie Chart of Vehicles by Type

Because each slice of the pie represents the relative frequency of each vehicle type as a percentage of the total sales, we can easily compare them:

• The largest percentage of sales is for sedans.
• Sedans and SUVs together account for 70% of vehicle sales.
• Hybrids account for 5% of vehicle sales, in spite of being on the market for only a few years.

We can use Excel software to quickly count the number of cars for each vehicle type and create the frequency table, bar chart, and pie chart shown in the following summary. The Excel tool is called a Pivot Table. The instructions to produce these descriptive statistics and charts are given in Appendix C.


Pie and bar charts both serve to illustrate frequency and relative frequency tables. When is a pie chart preferred to a bar chart? In most cases, pie charts are used to show and compare the relative differences in the percentage of observations for each value or class of a qualitative variable. Bar charts are preferred when the goal is to compare the number or frequency of observations for each value or class of a qualitative variable. The following Example/Solution shows another application of bar and pie charts.

EXAMPLE

SkiLodges.com is test marketing its new website and is interested in how easy its website design is to navigate. It randomly selected 200 regular Internet users and asked them to perform a search task on the website. Each person was asked to rate the relative ease of navigation as poor, good, excellent, or awesome. The results are shown in the following table:

Awesome      102
Excellent     58
Good          30
Poor          10

1. What type of measurement scale is used for ease of navigation?
2. Draw a bar chart for the survey results.
3. Draw a pie chart for the survey results.

SOLUTION

The data are measured on an ordinal scale. That is, the scale is ranked in relative ease of navigation when moving from “awesome” to “poor.” The interval between each rating is unknown so it is impossible, for example, to conclude that a rating of good is twice the value of a poor rating.

We can use a bar chart to graph the data. The vertical scale shows the relative frequency and the horizontal scale shows the values of the ease-of-navigation variable.


A pie chart can also be used to graph these data. The pie chart emphasizes that more than half of the respondents rate the relative ease of using the website awesome.

[Bar chart: Ease of Navigation of SkiLodges.com website. Vertical axis: Relative Frequency % (0 to 60). Horizontal axis: Ease of Navigation (Awesome, Excellent, Good, Poor).]

[Pie chart: Ease of Navigation of SkiLodges.com website. Awesome 51%, Excellent 29%, Good 15%, Poor 5%.]

SELF-REVIEW 2–1

The answers are in Appendix E.

DeCenzo Specialty Food and Beverage Company has been serving a cola drink with an additional flavoring, Cola-Plus, that is very popular among its customers. The company is interested in customer preferences for Cola-Plus versus Coca-Cola, Pepsi, and a lemon-lime beverage. They ask 100 randomly sampled customers to take a taste test and select the beverage they prefer most. The results are shown in the following table:

Beverage       Number
Cola-Plus         40
Coca-Cola         25
Pepsi             20
Lemon-Lime        15
Total            100

(a) Is the data qualitative or quantitative? Why?
(b) What is the table called? What does it show?
(c) Develop a bar chart to depict the information.
(d) Develop a pie chart using the relative frequencies.

EXERCISES

The answers to the odd-numbered exercises are at the end of the book in Appendix D.

1. A pie chart shows the relative market share of cola products. The “slice” for Pepsi-Cola has a central angle of 90 degrees. What is its market share?

2. In a marketing study, 100 consumers were asked to select the best digital music player from the iPod, the iRiver, and the Magic Star MP3. To summarize the consumer responses with a frequency table, how many classes would the frequency table have?

3. A total of 1,000 residents in Minnesota were asked which season they preferred. One hundred liked winter best, 300 liked spring, 400 liked summer, and 200 liked fall. Develop a frequency table and a relative frequency table to summarize this information.

4. Two thousand frequent business travelers are asked which midwestern city they prefer: Indianapolis, Saint Louis, Chicago, or Milwaukee. One hundred liked Indianapolis best, 450 liked Saint Louis, 1,300 liked Chicago, and the remainder preferred Milwaukee. Develop a frequency table and a relative frequency table to summarize this information.

5. Wellstone Inc. produces and markets replacement covers for cell phones in five different colors: bright white, metallic black, magnetic lime, tangerine orange, and fusion red. To estimate the demand for each color, the company set up a kiosk in the Mall of America for several hours and asked randomly selected people which cover color was their favorite. The results follow:

Bright white         130
Metallic black       104
Magnetic lime        325
Tangerine orange     455
Fusion red           286

a. What is the table called?
b. Draw a bar chart for the table.
c. Draw a pie chart.
d. If Wellstone Inc. plans to produce 1 million cell phone covers, how many of each color should it produce?

6. A small business consultant is investigating the performance of several companies. The fourth-quarter sales for last year (in thousands of dollars) for the selected companies were:

Company                                   Fourth-Quarter Sales ($ thousands)
Hoden Building Products                        $ 1,645.2
J & R Printing Inc.                              4,757.0
Long Bay Concrete Construction                   8,913.0
Mancell Electric and Plumbing                      627.1
Maxwell Heating and Air Conditioning            24,612.0
Mizelle Roofing & Sheet Metals                     191.9

The consultant wants to include a chart in his report comparing the sales of the six companies. Use a bar chart to compare the fourth-quarter sales of these corporations and write a brief report summarizing the bar chart.


CONSTRUCTING FREQUENCY DISTRIBUTIONS

In Chapter 1 and earlier in this chapter, we distinguished between qualitative and quantitative data. In the previous section, using the Applewood Auto Group data, we summarized two qualitative variables: the location of the sale and the type of vehicle sold. We created frequency and relative frequency tables and depicted the results in bar and pie charts.

The Applewood Auto Group data also includes several quantitative variables: the age of the buyer, the profit earned on the sale of the vehicle, and the number of previous purchases. Suppose Ms. Ball wants to summarize last month’s sales by profit earned for each vehicle. We can describe profit using a frequency distribution.


FREQUENCY DISTRIBUTION A grouping of quantitative data into mutually exclusive and collectively exhaustive classes showing the number of observations in each class.

How do we develop a frequency distribution? The following example shows the steps to construct a frequency distribution. Remember, our goal is to construct tables, charts, and graphs that will quickly summarize the data by showing the location, extreme values, and shape of the data’s distribution.

EXAMPLE

Ms. Kathryn Ball of the Applewood Auto Group wants to summarize the quantitative variable profit with a frequency distribution and display the distribution with charts and graphs. With this information, Ms. Ball can easily answer the following questions: What is the typical profit on each sale? What is the largest or maximum profit on any sale? What is the smallest or minimum profit on any sale? Around what value do the profits tend to cluster?

TABLE 2–4 Profit on Vehicles Sold Last Month by the Applewood Auto Group

$1,387  $2,148  $2,201  $  963  $  820  $2,230  $3,043  $2,584  $2,370
 1,754   2,207     996   1,298   1,266   2,341   1,059   2,666   2,637
 1,817   2,252   2,813   1,410   1,741   3,292   1,674   2,991   1,426
 1,040   1,428     323   1,553   1,772   1,108   1,807     934   2,944
 1,273   1,889     352   1,648   1,932   1,295   2,056   2,063   2,147
 1,529   1,166     482   2,071   2,350   1,344   2,236   2,083   1,973
 3,082   1,320   1,144   2,116   2,422   1,906   2,928   2,856   2,502
 1,951   2,265   1,485   1,500   2,446   1,952   1,269   2,989     783
 2,692   1,323   1,509   1,549     369   2,070   1,717     910   1,538
 1,206   1,760   1,638   2,348     978   2,454   1,797   1,536   2,339
 1,342   1,919   1,961   2,498   1,238   1,606   1,955   1,957   2,700
   443   2,357   2,127     294   1,818   1,680   2,199   2,240   2,222
   754   2,866   2,430   1,115   1,824   1,827   2,482   2,695   2,597
 1,621     732   1,704   1,124   1,907   1,915   2,701   1,325   2,742
   870   1,464   1,876   1,532   1,938   2,084   3,210   2,250   1,837
 1,174   1,626   2,010   1,688   1,940   2,639     377   2,279   2,842
 1,412   1,762   2,165   1,822   2,197     842   1,220   2,626   2,434
 1,809   1,915   2,231   1,897   2,646   1,963   1,401   1,501   1,640
 2,415   2,119   2,389   2,445   1,461   2,059   2,175   1,752   1,821
 1,546   1,766     335   2,886   1,731   2,338   1,118   2,058   2,487

Maximum: $3,292. Minimum: $294.

SOLUTION

To begin, we need the profits for each of the 180 vehicle sales listed in Table 2–4. This information is called raw or ungrouped data because it is simply a listing of the individual, observed profits. It is possible to search the list and find the smallest or minimum profit ($294) and the largest or maximum profit ($3,292), but that is about all. It is difficult to determine a typical profit or to visualize where the profits tend to cluster. The raw data are more easily interpreted if we summarize the data with a frequency distribution. The steps to create this frequency distribution follow.

Step 1: Decide on the number of classes. A useful recipe to determine the number of classes (k) is the “2 to the k rule.” This guide suggests you select the smallest number (k) for the number of classes such that \(2^k\) (in words, 2 raised to the power of k) is greater than the number of observations (n). In the Applewood Auto Group example, there were 180 vehicles sold. So n = 180. If we try k = 7, which means we would use 7 classes, \(2^7 = 128\), which is less than 180. Hence, 7 is too few classes. If we let k = 8, then \(2^8 = 256\), which is greater than 180. So the recommended number of classes is 8.
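The rule in Step 1 can be sketched as a short Python function. The function name `classes_by_2k_rule` is an illustrative assumption.

```python
def classes_by_2k_rule(n):
    """Smallest k such that 2**k is greater than the number of observations n."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k

n = 180                      # vehicles sold last month
k = classes_by_2k_rule(n)
print(f"n = {n}: 2^{k - 1} = {2 ** (k - 1)} is too small, 2^{k} = {2 ** k} exceeds n, so k = {k}")
```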

Step 2: Determine the class interval. Generally, the class interval is the same for all classes. The classes all taken together must cover at least the distance from the minimum value in the data up to the maximum value. Expressing these words in a formula:

\[
i \ge \frac{\text{Maximum Value} - \text{Minimum Value}}{k}
\]

where i is the class interval, and k is the number of classes.

For the Applewood Auto Group, the minimum value is $294 and the maximum value is $3,292. If we need 8 classes, the interval should be:

\[
i \ge \frac{\text{Maximum Value} - \text{Minimum Value}}{k} = \frac{\$3{,}292 - \$294}{8} = \$374.75
\]

In practice, this interval size is usually rounded up to some convenient number, such as a multiple of 10 or 100. The value of $400 is a reasonable choice.
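A minimal sketch of Step 2 in Python follows. The `round_to` parameter, which rounds the raw interval up to the next multiple of 100, is an assumption chosen to reproduce the $400 used in the text. Rounding to a multiple of 10 would be equally consistent with the guideline.

```python
import math

def class_interval(minimum, maximum, k, round_to=100):
    """Return the raw interval and the interval rounded up to a multiple of round_to."""
    raw = (maximum - minimum) / k
    rounded = math.ceil(raw / round_to) * round_to
    return raw, rounded

raw, interval = class_interval(minimum=294, maximum=3292, k=8)
print(f"raw interval = ${raw:,.2f}, rounded up to ${interval:,}")
```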

Step 3: Set the individual class limits. State clear class limits so you can put each observation into only one category. This means you must avoid overlapping or unclear class limits. For example, classes such as “$1,300–$1,400” and “$1,400–$1,500” should not be used because it is not clear whether the value $1,400 is in the first or second class. In this text, we will generally use the format $1,300 up to $1,400 and $1,400 up to $1,500 and so on. With this format, it is clear that $1,399 goes into the first class and $1,400 in the second.

Because we always round the class interval up to get a convenient class size, we cover a larger than necessary range. For example, using 8 classes with an interval of $400 in the Applewood Auto Group example results in a range of 8($400) = $3,200. The actual range is $2,998, found by ($3,292 − $294). Comparing that value to $3,200, we have an excess of $202. Because we need to cover only the range (Maximum − Minimum), it is natural to put approximately equal amounts of the excess in each of the two tails. Of course, we also should select convenient class limits. A guideline is to make the lower limit of the first class a multiple of the class interval. Sometimes this is not possible, but the lower limit should at least be rounded. So here are the classes we could use for these data.


Classes

$  200 up to $  600
   600 up to  1,000
 1,000 up to  1,400
 1,400 up to  1,800
 1,800 up to  2,200
 2,200 up to  2,600
 2,600 up to  3,000
 3,000 up to  3,400
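The arithmetic behind these limits can be checked with a short Python sketch. The starting limit of $200 is taken from the text as a given, not derived, and the variable names are assumptions.

```python
minimum, maximum = 294, 3292     # smallest and largest profit
k, interval = 8, 400             # number of classes and class interval
first_lower = 200                # rounded lower limit chosen in the text

covered = k * interval           # range covered by the classes: 3,200
actual_range = maximum - minimum # 2,998
excess = covered - actual_range  # 202, to be split between the two tails

# Each class runs from its lower limit "up to" (not including) its upper limit.
classes = [(first_lower + j * interval, first_lower + (j + 1) * interval)
           for j in range(k)]

below = minimum - first_lower    # unused room below the minimum
above = classes[-1][1] - maximum # unused room above the maximum

for lower, upper in classes:
    print(f"${lower:>5,} up to ${upper:>5,}")
print(f"excess = {excess} (below: {below}, above: {above})")
```

With a lower limit of $200, the excess of $202 splits into $94 below the minimum and $108 above the maximum, which is roughly even.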

Step 4: Tally the vehicle profit into the classes and determine the number of observations in each class. To begin, the profit from the sale of the first vehicle in Table 2–4 is $1,387. It is tallied in the $1,000 up to $1,400 class. The second profit in the first row of Table 2–4 is $2,148. It is tallied in the $1,800 up to $2,200 class. The other profits are tallied in a similar manner. When all the profits are tallied, the table would appear as follows (each group of tally marks, ||||, stands for five observations):

Profit                   Frequency
$  200 up to $  600      |||| |||
   600 up to  1,000      |||| |||| |
 1,000 up to  1,400      |||| |||| |||| |||| |||
 1,400 up to  1,800      |||| |||| |||| |||| |||| |||| |||| |||
 1,800 up to  2,200      |||| |||| |||| |||| |||| |||| |||| |||| ||||
 2,200 up to  2,600      |||| |||| |||| |||| |||| |||| ||
 2,600 up to  3,000      |||| |||| |||| ||||
 3,000 up to  3,400      ||||

The number of observations in each class is called the class frequency. In the $200 up to $600 class there are 8 observations, and in the $600 up to $1,000 class there are 11 observations. Therefore, the class frequency in the first class is 8 and the class frequency in the second class is 11. There are a total of 180 observations in the entire set of data. So the sum of all the frequencies should be equal to 180. The results of the frequency distribution are in Table 2–5.
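The tally in Step 4 can be sketched in Python using the 180 profits from Table 2–4. Locating the class by integer division is an implementation choice made here for brevity, not a method described in the text. It works because every class has the same $400 interval.

```python
# The 180 profits from Table 2-4, in dollars, row by row.
profits = [
    1387, 2148, 2201,  963,  820, 2230, 3043, 2584, 2370,
    1754, 2207,  996, 1298, 1266, 2341, 1059, 2666, 2637,
    1817, 2252, 2813, 1410, 1741, 3292, 1674, 2991, 1426,
    1040, 1428,  323, 1553, 1772, 1108, 1807,  934, 2944,
    1273, 1889,  352, 1648, 1932, 1295, 2056, 2063, 2147,
    1529, 1166,  482, 2071, 2350, 1344, 2236, 2083, 1973,
    3082, 1320, 1144, 2116, 2422, 1906, 2928, 2856, 2502,
    1951, 2265, 1485, 1500, 2446, 1952, 1269, 2989,  783,
    2692, 1323, 1509, 1549,  369, 2070, 1717,  910, 1538,
    1206, 1760, 1638, 2348,  978, 2454, 1797, 1536, 2339,
    1342, 1919, 1961, 2498, 1238, 1606, 1955, 1957, 2700,
     443, 2357, 2127,  294, 1818, 1680, 2199, 2240, 2222,
     754, 2866, 2430, 1115, 1824, 1827, 2482, 2695, 2597,
    1621,  732, 1704, 1124, 1907, 1915, 2701, 1325, 2742,
     870, 1464, 1876, 1532, 1938, 2084, 3210, 2250, 1837,
    1174, 1626, 2010, 1688, 1940, 2639,  377, 2279, 2842,
    1412, 1762, 2165, 1822, 2197,  842, 1220, 2626, 2434,
    1809, 1915, 2231, 1897, 2646, 1963, 1401, 1501, 1640,
    2415, 2119, 2389, 2445, 1461, 2059, 2175, 1752, 1821,
    1546, 1766,  335, 2886, 1731, 2338, 1118, 2058, 2487,
]

first_lower, interval, k = 200, 400, 8
frequencies = [0] * k

for profit in profits:
    # "lower up to upper" means lower <= profit < upper, so floor division
    # by the interval gives the index of the one class the profit belongs to.
    index = (profit - first_lower) // interval
    frequencies[index] += 1

for j, frequency in enumerate(frequencies):
    lower = first_lower + j * interval
    print(f"${lower:>5,} up to ${lower + interval:>5,} {frequency:>4}")
print(f"{'Total':<21}{sum(frequencies):>4}")
```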

TABLE 2–5 Frequency Distribution of Profit for Vehicles Sold Last Month at Applewood Auto Group

Profit                   Frequency
$  200 up to $  600           8
   600 up to  1,000          11
 1,000 up to  1,400          23
 1,400 up to  1,800          38
 1,800 up to  2,200          45
 2,200 up to  2,600          32
 2,600 up to  3,000          19
 3,000 up to  3,400           4
Total                       180

Now that we have organized the data into a frequency distribution (see Table 2–5), we can summarize the profits of the vehicles for the Applewood Auto Group. Observe the following:

1. The profits from vehicle sales range between $200 and $3,400.

2. The vehicle profits are classified using a class interval of $400. The class interval is determined by subtracting consecutive lower or upper class limits. For example, the lower limit of the first class is $200, and the lower limit of the second class is $600. The difference is the class interval of $400.

3. The profits are concentrated between $1,000 and $3,000. The profit on 157 vehicles, or 87%, was within this range.

4. For each class, we can determine the typical profit or class midpoint. It is halfway between the lower or upper limits of two consecutive classes. It is computed by adding the lower or upper limits of consecutive classes and dividing by 2. Referring to Table 2–5, the lower class limit of the first class is $200, and the next class limit is $600. The class midpoint is $400, found by ($600 + $200)/2. The midpoint best represents, or is typical of, the profits of the vehicles in that class. Applewood sold 8 vehicles with a typical profit of $400.

5. The largest concentration, or highest frequency, of vehicles sold is in the $1,800 up to $2,200 class. There are 45 vehicles in this class. The class midpoint is $2,000. So we say that the typical profit in the class with the highest frequency is $2,000.
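The observations above can be reproduced from Table 2–5 with a short Python sketch. The variable names are assumptions.

```python
# Lower class limits and class frequencies from Table 2-5.
lower_limits = [200, 600, 1000, 1400, 1800, 2200, 2600, 3000]
frequencies = [8, 11, 23, 38, 45, 32, 19, 4]

# Class interval: difference between consecutive lower limits.
interval = lower_limits[1] - lower_limits[0]

# Class midpoint: halfway between a class's lower limit and the next lower limit.
midpoints = [(lower + (lower + interval)) / 2 for lower in lower_limits]

# Vehicles with a profit of $1,000 up to $3,000.
total = sum(frequencies)
in_range = sum(f for lower, f in zip(lower_limits, frequencies)
               if lower >= 1000 and lower + interval <= 3000)
share = in_range / total

# Class with the highest frequency and its midpoint.
top = frequencies.index(max(frequencies))

print(f"class interval: ${interval}")
print(f"midpoint of the first class: ${midpoints[0]:,.0f}")
print(f"${1000:,} up to ${3000:,}: {in_range} vehicles, {share:.0%} of {total}")
print(f"highest frequency: {frequencies[top]} vehicles, midpoint ${midpoints[top]:,.0f}")
```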

By presenting this information to Ms. Ball, we give her a clear picture of the distribution of the vehicle profits for last month.

We admit that arranging the information on profits into a frequency distribution does result in the loss of some detailed information. That is, by organizing the data into a frequency distribution, we cannot pinpoint the exact profit on any vehicle, such as $1,387, $2,148, or $2,201. Further, we cannot tell that the actual minimum profit for any vehicle sold is $294 or that the maximum profit was $3,292. However, the lower limit of the first class and the upper limit of the last class convey essentially the same meaning. Likely, Ms. Ball will make the same judgment if she knows the smallest profit is about $200 as she will if she knows the exact profit is $294. The advantages of summarizing the 180 profits into a more understandable and organized form more than offset this disadvantage.

TABLE 2–6 Adjusted Gross Income for Individuals Filing Income Tax Returns

Adjusted Gross Income                  Number of Returns (in thousands)
No adjusted gross income                       178.2
$        1 up to       5,000                 1,204.6
     5,000 up to      10,000                 2,595.5
    10,000 up to      15,000                 3,142.0
    15,000 up to      20,000                 3,191.7
    20,000 up to      25,000                 2,501.4
    25,000 up to      30,000                 1,901.6
    30,000 up to      40,000                 2,502.3
    40,000 up to      50,000                 1,426.8
    50,000 up to      75,000                 1,476.3
    75,000 up to     100,000                   338.8
   100,000 up to     200,000                   223.3
   200,000 up to     500,000                    55.2
   500,000 up to   1,000,000                    12.0
 1,000,000 up to   2,000,000                     5.1
 2,000,000 up to  10,000,000                     3.4
10,000,000 or more                               0.6

STATISTICS IN ACTION

In 1788, James Madison, John Jay, and Alexander Hamilton anonymously published a series of essays entitled The Federalist. These Federalist papers were an attempt to convince the people of New York that they should ratify the Constitution. In the course of history, the authorship of most of these papers became known, but 12 remained contested. Through the use of statistical analysis, and particularly studying the frequency distributions of various words, we can now conclude that James Madison is the likely author of the 12 papers. In fact, the statistical evidence that Madison is the author is overwhelming.

When we summarize raw data with frequency distributions, equal class intervals are preferred. However, in certain situations unequal class intervals may be necessary to avoid a large number of classes with very small frequencies. Such is the case in Table 2–6. The U.S. Internal Revenue Service uses unequal-sized class intervals for adjusted gross income on individual tax returns to summarize the number of individual tax returns. If we use our method to find equal class intervals, the \(2^k\) rule results in 25 classes, and a class interval of $400,000, assuming $0 and $10,000,000 as the minimum and maximum values for adjusted gross income. Using equal class intervals, the first 13 classes in Table 2–6 would be combined into one class of about 99.9% of all tax returns and 24 classes for the 0.1% of the returns with an adjusted gross income above $400,000. Using equal class intervals does not provide a good understanding of the raw data. In this case, good judgment in the use of unequal class intervals, as demonstrated in Table 2–6, is required to show the distribution of the number of tax returns filed, especially for incomes under $500,000.
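The figures quoted for Table 2–6 can be checked with a short Python sketch. The counts are the table's values in thousands of returns, and the variable names are assumptions.

```python
# Number of returns (in thousands) for the 17 classes of Table 2-6, top to bottom.
returns_thousands = [
    178.2, 1204.6, 2595.5, 3142.0, 3191.7, 2501.4, 1901.6, 2502.3, 1426.8,
    1476.3, 338.8, 223.3, 55.2, 12.0, 5.1, 3.4, 0.6,
]

# Total number of returns (observations).
n = round(sum(returns_thousands) * 1000)

# 2-to-the-k rule: smallest k with 2**k greater than n.
k = 1
while 2 ** k <= n:
    k += 1

# Equal class interval, assuming $0 and $10,000,000 as minimum and maximum.
interval = (10_000_000 - 0) / k

# Share of returns in the first 13 classes (adjusted gross income under $500,000).
share_first_13 = sum(returns_thousands[:13]) / sum(returns_thousands)

print(f"n = {n:,} returns, k = {k} classes, interval = ${interval:,.0f}")
print(f"first 13 classes hold {share_first_13:.1%} of all returns")
```

With equal intervals of $400,000, nearly every return would fall into the first class, which is why the unequal intervals of Table 2–6 are more informative.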

SELF-REVIEW 2–2

In the first quarter of last year, the 11 members of the sales staff at Master Chemical Company earned the following commissions:

$1,650  $1,475  $1,510  $1,670  $1,595  $1,760  $1,540  $1,495  $1,590  $1,625  $1,510

(a) What are the values such as $1,650 and $1,475 called?
(b) Using $1,400 up to $1,500 as the first class, $1,500 up to $1,600 as the second class, and so forth, organize the quarterly commissions into a frequency distribution.
(c) What are the numbers in the right column of your frequency distribution called?
(d) Describe the distribution of quarterly commissions, based on the frequency distribution. What is the largest concentration of commissions earned? What is the smallest, and the largest? What is the typical amount earned?

Relative Frequency Distribution

It may be desirable, as we did earlier with qualitative data, to convert class frequencies to relative class frequencies to show the proportion of the total number of observations in each class. In our vehicle profits, we may want to know what percentage of the vehicle profits are in the $1,000 up to $1,400 class. To convert a frequency distribution to a relative frequency distribution, each of the class frequencies is divided by the total number of observations. From the distribution of vehicle profits, Table 2–5, the relative frequency for the $1,000 up to $1,400 class is 0.128, found by dividing 23 by 180. That is, profit on 12.8% of the vehicles sold is between $1,000 and $1,400. The relative frequencies for the remaining classes are shown in Table 2–7.

TABLE 2–7 Relative Frequency Distribution of Profit for Vehicles Sold Last Month at Applewood Auto Group

Profit                    Frequency   Relative Frequency   Found by
$  200 up to $  600            8             .044            8/180
   600 up to  1,000           11             .061           11/180
 1,000 up to  1,400           23             .128           23/180
 1,400 up to  1,800           38             .211           38/180
 1,800 up to  2,200           45             .250           45/180
 2,200 up to  2,600           32             .178           32/180
 2,600 up to  3,000           19             .106           19/180
 3,000 up to  3,400            4             .022            4/180
Total                        180            1.000
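The conversion in Table 2–7 can also be sketched in a few lines of Python. This is only an illustration, not the Excel procedure the text uses; the names `classes`, `frequencies`, and `relative_frequencies` are ours, and the numbers are copied from Table 2–7.

```python
# Class limits and class frequencies copied from Table 2-7.
classes = [(200, 600), (600, 1000), (1000, 1400), (1400, 1800),
           (1800, 2200), (2200, 2600), (2600, 3000), (3000, 3400)]
frequencies = [8, 11, 23, 38, 45, 32, 19, 4]


def relative_frequencies(freqs):
    """Divide each class frequency by the total number of observations."""
    total = sum(freqs)
    return [f / total for f in freqs]


rel = relative_frequencies(frequencies)

# Print one row per class: limits, frequency, relative frequency.
for (lower, upper), f, r in zip(classes, frequencies, rel):
    print(f"${lower:>5,} up to ${upper:>5,}  {f:>3}  {r:.3f}")
print(f"Total {sum(frequencies)}  {sum(rel):.3f}")
```

Rounded to three decimals, the values agree with the Relative Frequency column, and they sum to 1 because every observation falls in exactly one class.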

There are many software packages that perform statistical calculations. Throughout this text, we will show the output from Microsoft Excel, MegaStat (a Microsoft Excel add-in), and Minitab (a statistical software package). Because Excel is most readily available, it is used most frequently.

Within the earlier Graphic Presentation of Qualitative Data section, we used the Pivot Table tool in Excel to create a frequency table. To create the table to the left, we use the same Excel tool to compute frequency and relative frequency distributions for the profit variable in the Applewood Auto Group data. The necessary steps are given in the Software Commands section in Appendix C.

SELF-REVIEW 2–3

Barry Bonds of the San Francisco Giants established a new single-season Major League Baseball home run record by hitting 73 home runs during the 2001 season. Listed below is the sorted distance of each of the 73 home runs.

320 320 347 350 360 360 360 361 365 370 370 375 375 375 375 380 380 380 380 380 380 390 390 391 394 396 400 400 400 400 405 410 410 410 410 410 410 410 410 410 410 410 411 415 415 416 417 417 420 420 420 420 420 420 420 420 429 430 430 430 430 430 435 435 436 440 440 440 440 440 450 480 488

(a) For this data, show that seven classes would be used to create a frequency distribution using the 2^k rule.
(b) Show that a class interval of 30 would summarize the data in seven classes.
(c) Construct frequency and relative frequency distributions for the data with seven classes and a class interval of 30. Start the first class with a lower limit of 300.
(d) How many home runs traveled a distance of 360 up to 390 feet?
(e) What percentage of the home runs traveled a distance of 360 up to 390 feet?
(f) What percentage of the home runs traveled a distance of 390 feet or more?

EXERCISES

This icon indicates that the data are available at the text website: www.mhhe.com/Lind17e. You will be able to download the data directly into Excel or Minitab from this site.

7. A set of data consists of 38 observations. How many classes would you recommend for the frequency distribution?

8. A set of data consists of 45 observations between $0 and $29. What size would you recommend for the class interval?

9. A set of data consists of 230 observations between $235 and $567. What class interval would you recommend?

10. A set of data contains 53 observations. The minimum value is 42 and the maximum value is 129. The data are to be organized into a frequency distribution.
a. How many classes would you suggest?
b. What would you suggest as the lower limit of the first class?

11. Wachesaw Manufacturing Inc. produced the following number of units in the last 16 days.

27 27 27 28 27 25 25 28 26 28 26 28 31 30 26 26

The information is to be organized into a frequency distribution.
a. How many classes would you recommend?
b. What class interval would you suggest?
c. What lower limit would you recommend for the first class?
d. Organize the information into a frequency distribution and determine the relative frequency distribution.
e. Comment on the shape of the distribution.

12. The Quick Change Oil Company has a number of outlets in the metropolitan Seattle area. The daily number of oil changes at the Oak Street outlet in the past 20 days are:

65 98 55 62 79 59 51 90 72 56 70 62 66 80 94 79 63 73 71 85

The data are to be organized into a frequency distribution.
a. How many classes would you recommend?
b. What class interval would you suggest?
c. What lower limit would you recommend for the first class?
d. Organize the number of oil changes into a frequency distribution.
e. Comment on the shape of the frequency distribution. Also determine the relative frequency distribution.

13. The manager of the BiLo Supermarket in Mt. Pleasant, Rhode Island, gathered the following information on the number of times a customer visits the store during a month. The responses of 51 customers were:

5 3 3 1 4 4 5 6 4 2 6 6 6 7 1 1 14 1 2 4 4 4 5 6 3 5 3 4 5 6 8 4 7 6 5 9 11 3 12 4 7 6 5 15 1 1 10 8 9 2 12

a. Starting with 0 as the lower limit of the first class and using a class interval of 3, organize the data into a frequency distribution.
b. Describe the distribution. Where do the data tend to cluster?
c. Convert the distribution to a relative frequency distribution.

14. The food services division of Cedar River Amusement Park Inc. is studying the amount of money spent per day on food and drink by families who visit the amusement park. A sample of 40 families who visited the park yesterday revealed they spent the following amounts:

$77 $18 $63 $84 $38 $54 $50 $59 $54 $56 $36 $26 $50 $34 $44 41 58 58 53 51 62 43 52 53 63 62 62 65 61 52 60 60 45 66 83 71 63 58 61 71

a. Organize the data into a frequency distribution, using seven classes and 15 as the lower limit of the first class. What class interval did you select?
b. Where do the data tend to cluster?
c. Describe the distribution.
d. Determine the relative frequency distribution.

GRAPHIC PRESENTATION OF A DISTRIBUTION

Sales managers, stock analysts, hospital administrators, and other busy executives often need a quick picture of the distributions of sales, stock prices, or hospital costs. These distributions can often be depicted by the use of charts and graphs. Three charts that will help portray a frequency distribution graphically are the histogram, the frequency polygon, and the cumulative frequency polygon.

LO2-4 Display a distribution using a histogram or frequency polygon.

Histogram

A histogram for a frequency distribution based on quantitative data is similar to the bar chart showing the distribution of qualitative data. The classes are marked on the horizontal axis and the class frequencies on the vertical axis. The class frequencies are represented by the heights of the bars. However, there is one important difference based on the nature of the data. Quantitative data are usually measured using scales that are continuous, not discrete. Therefore, the horizontal axis represents all possible values, and the bars are drawn adjacent to each other to show the continuous nature of the data.

HISTOGRAM A graph in which the classes are marked on the horizontal axis and the class frequencies on the vertical axis. The class frequencies are represented by the heights of the bars, and the bars are drawn adjacent to each other.

EXAMPLE

Below is the frequency distribution of the profits on vehicle sales last month at the Applewood Auto Group.

Profit                    Frequency
$  200 up to $  600            8
   600 up to  1,000           11
 1,000 up to  1,400           23
 1,400 up to  1,800           38
 1,800 up to  2,200           45
 2,200 up to  2,600           32
 2,600 up to  3,000           19
 3,000 up to  3,400            4
Total                        180

Construct a histogram. What observations can you reach based on the information presented in the histogram?

SOLUTION

The class frequencies are scaled along the vertical axis (Y-axis) and either the class limits or the class midpoints along the horizontal axis. To illustrate the construction of the histogram, the first three classes are shown in Chart 2–3.

CHART 2–3 Construction of a Histogram (horizontal axis: Profit $, 200 to 1,400; vertical axis: Number of Vehicles (class frequency), 8 to 32; bars of height 8, 11, and 23 for the first three classes)

From Chart 2–3 we note the profit on eight vehicles was $200 up to $600. Therefore, the height of the column for that class is 8. There are 11 vehicle sales where the profit was $600 up to $1,000. So, logically, the height of that column is 11. The height of the bar represents the number of observations in the class.

This procedure is continued for all classes. The complete histogram is shown in Chart 2–4. Note that there is no space between the bars. This is a feature of the histogram. Why is this so? Because the variable profit, plotted on the horizontal axis, is a continuous variable. In a bar chart, the scale of measurement is usually nominal and the vertical bars are separated. This is an important distinction between the histogram and the bar chart.

We can make the following statements using Chart 2–4. They are the same as the observations based on Table 2–5.

1. The profits from vehicle sales range between $200 and $3,400.

2. The vehicle profits are classified using a class interval of $400. The class interval is determined by subtracting consecutive lower or upper class limits. For example, the lower limit of the first class is $200, and the lower limit of the second class is $600. The difference is the class interval or $400.

3. The profits are concentrated between $1,000 and $3,000. The profit on 157 vehicles, or 87%, was within this range.

4. For each class, we can determine the typical profit or class midpoint. It is halfway between the lower or upper limits of two consecutive classes. It is computed by adding the lower or upper limits of consecutive classes and dividing by 2. Referring to Chart 2–4, the lower class limit of the first class is $200, and the next class limit is $600. The class midpoint is $400, found by ($600 + $200)/2. The midpoint best represents, or is typical of, the profits of the vehicles in that class. Applewood sold 8 vehicles with a typical profit of $400.

5. The largest concentration, or highest frequency of vehicles sold, is in the $1,800 up to $2,200 class. There are 45 vehicles in this class. The class midpoint is $2,000. So we say that the typical profit in the class with the highest frequency is $2,000.

Thus, the histogram provides an easily interpreted visual representation of a frequency distribution. We should also point out that we would have made the same observations and the shape of the histogram would have been the same had we used a relative frequency distribution instead of the actual frequencies. That is, if we use the relative frequencies of Table 2–7, the result is a histogram of the same shape as Chart 2–4. The only difference is that the vertical axis would have been reported in percentage of vehicles instead of the number of vehicles. The Excel commands to create Chart 2–4 are given in Appendix C.
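The numbers behind these observations can be checked with a short Python sketch. It is an illustration only (the text builds the chart in Excel); the variable names are ours, the class limits and frequencies are taken from the Applewood frequency distribution, and the row of `#` marks is just a rough text stand-in for the bars.

```python
# Class limits and frequencies from the Applewood profit distribution.
classes = [(200, 600), (600, 1000), (1000, 1400), (1400, 1800),
           (1800, 2200), (2200, 2600), (2600, 3000), (3000, 3400)]
frequencies = [8, 11, 23, 38, 45, 32, 19, 4]
total = sum(frequencies)

# Class interval: difference between consecutive lower class limits.
interval = classes[1][0] - classes[0][0]

# Class midpoint: halfway between the limits of consecutive classes.
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Class with the highest frequency.
modal_index = frequencies.index(max(frequencies))

# Vehicles with a profit of $1,000 up to $3,000.
in_range = sum(f for (lower, upper), f in zip(classes, frequencies)
               if lower >= 1000 and upper <= 3000)
share = in_range / total

# Rough text histogram: one '#' per vehicle, no gaps between classes.
for (lower, upper), f in zip(classes, frequencies):
    print(f"{lower:>5,} up to {upper:>5,} | {'#' * f} {f}")
```

Here `interval` is 400, the first midpoint is 400, the highest-frequency class is $1,800 up to $2,200, and `in_range` is 157 of 180 vehicles, about 87%.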

CHART 2–4 Histogram of the Profit on 180 Vehicles Sold at the Applewood Auto Group (horizontal axis: Profit, classes 200–600 through 3,000–3,400; vertical axis: Frequency, 0 to 40; bar heights 8, 11, 23, 38, 45, 32, 19, 4)

Frequency Polygon

A frequency polygon also shows the shape of a distribution and is similar to a histogram. It consists of line segments connecting the points formed by the intersections of the class midpoints and the class frequencies. The construction of a frequency polygon is illustrated in Chart 2–5. We use the profits from the cars sold last month at the Applewood Auto Group. The midpoint of each class is scaled on the X-axis and the class frequencies on the Y-axis. Recall that the class midpoint is the value at the center of a class and represents the typical values in that class. The class frequency is the number of observations in a particular class. The profit earned on the vehicles sold last month by the Applewood Auto Group is repeated below.

STATISTICS IN ACTION

Florence Nightingale is known as the founder of the nursing profession. However, she also saved many lives by using statistical analysis. When she encountered an unsanitary condition or an undersupplied hospital, she improved the conditions and then used statistical data to document the improvement. Thus, she was able to convince others of the need for medical reform, particularly in the area of sanitation. She developed original graphs to demonstrate that, during the Crimean War, more soldiers died from unsanitary conditions than were killed in combat.

CHART 2–5 Frequency Polygon of Profit on 180 Vehicles Sold at Applewood Auto Group (horizontal axis: Profit $, midpoints 0 to 3,600; vertical axis: Frequency, 8 to 48)

As noted previously, the $200 up to $600 class is represented by the midpoint $400. To construct a frequency polygon, move horizontally on the graph to the midpoint, $400, and then vertically to 8, the class frequency, and place a dot. The x and the y values of this point are called the coordinates. The coordinates of the next point are x = 800 and y = 11. The process is continued for all classes. Then the points are connected in order. That is, the point representing the lowest class is joined to the one representing the second class and so on. Note in Chart 2–5 that, to complete the frequency polygon, midpoints of $0 and $3,600 are added to the X-axis to “anchor” the polygon at zero frequencies. These two values, $0 and $3,600, were derived by subtracting the class interval of $400 from the lowest midpoint ($400) and by adding $400 to the highest midpoint ($3,200) in the frequency distribution.
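The coordinates of the polygon, including the two anchor points, can be generated with a small Python sketch. It is illustrative only: the function name `polygon_points` is ours, and it assumes equal class intervals, as in the Applewood distribution.

```python
# Class limits and frequencies from the Applewood profit distribution.
classes = [(200, 600), (600, 1000), (1000, 1400), (1400, 1800),
           (1800, 2200), (2200, 2600), (2600, 3000), (3000, 3400)]
frequencies = [8, 11, 23, 38, 45, 32, 19, 4]


def polygon_points(classes, frequencies):
    """Return (midpoint, frequency) coordinates, anchored at zero on both ends.

    Assumes every class has the same class interval.
    """
    interval = classes[0][1] - classes[0][0]
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    points = list(zip(midpoints, frequencies))
    # Anchor points: one class interval below the lowest midpoint and
    # one class interval above the highest midpoint, both at frequency 0.
    first_anchor = (midpoints[0] - interval, 0)
    last_anchor = (midpoints[-1] + interval, 0)
    return [first_anchor] + points + [last_anchor]


points = polygon_points(classes, frequencies)
for x, y in points:
    print(f"x = {x:,.0f}, y = {y}")
```

The first and last points are (0, 0) and (3,600, 0), and the points in between match the midpoints and frequencies described in the text, starting with (400, 8) and (800, 11).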

Profit                    Midpoint   Frequency
$  200 up to $  600        $  400         8
   600 up to  1,000           800        11
 1,000 up to  1,400         1,200        23
 1,400 up to  1,800         1,600        38
 1,800 up to  2,200         2,000        45
 2,200 up to  2,600         2,400        32
 2,600 up to  3,000         2,800        19
 3,000 up to  3,400         3,200         4
Total                                   180

Both the histogram and the frequency polygon allow us to get a quick picture of the main characteristics of the data (highs, lows, points of concentration, etc.). Although the two representations are similar in purpose, the histogram has the advantage of depicting each class as a rectangle, with the height of the rectangular bar representing the number in each class. The frequency polygon, in turn, has an advantage over the histogram. It allows us to compare directly two or more frequency distributions. Suppose Ms. Ball wants to compare the profit per vehicle sold at Applewood Auto Group with a similar auto group, Fowler Auto in Grayling, Michigan. To do this, two frequency polygons are constructed, one on top of the other, as in Chart 2–6.

CHART 2–6 Distribution of Profit at Applewood Auto Group and Fowler Motors (horizontal axis: Profit $, midpoints 0 to 3,600; vertical axis: Frequency, 8 to 56; one polygon each for Fowler Motors and Applewood)

Two things are clear from the chart:

• The typical vehicle profit is larger at Fowler Motors—about $2,000 for Applewood and about $2,400 for Fowler.

• There is less variation or dispersion in the profits at Fowler Motors than at Applewood. The lower limit of the first class for Applewood is $0 and the upper limit is $3,600. For Fowler Motors, the lower limit is $800 and the upper limit is the same: $3,600.

The total number of cars sold at the two dealerships is about the same, so a direct comparison is possible. If the difference in the total number of cars sold is large, then converting the frequencies to relative frequencies and then plotting the two distributions would allow a clearer comparison.

SELF-REVIEW 2–4

The annual imports of a selected group of electronic suppliers are shown in the following frequency distribution.

Imports ($ millions)   Number of Suppliers
 2 up to  5                     6
 5 up to  8                    13
 8 up to 11                    20
11 up to 14                    10
14 up to 17                     1

(a) Portray the imports as a histogram.
(b) Portray the imports as a relative frequency polygon.
(c) Summarize the important facets of the distribution (such as classes with the highest and lowest frequencies).

EXERCISES

15. Molly’s Candle Shop has several retail stores in the coastal areas of North and South Carolina. Many of Molly’s customers ask her to ship their purchases. The following chart shows the number of packages shipped per day for the last 100 days. For example, the first class shows that there were 5 days when the number of packages shipped was 0 up to 5.

(Chart: horizontal axis Number of Packages, 0 to 35 in steps of 5; vertical axis Frequency, 0 to 30; bar labels 5, 13, 28, 23, 18, 10, 3)

a. What is this chart called?
b. What is the total number of packages shipped?
c. What is the class interval?
d. What is the number of packages shipped in the 10 up to 15 class?
e. What is the relative frequency of packages shipped in the 10 up to 15 class?
f. What is the midpoint of the 10 up to 15 class?
g. On how many days were there 25 or more packages shipped?

16. The following chart shows the number of patients admitted daily to Memorial Hospital through the emergency room.

(Chart: horizontal axis Number of Patients, 2 to 12 in steps of 2; vertical axis Frequency, 0 to 30)

a. What is the midpoint of the 2 up to 4 class?
b. How many days were 2 up to 4 patients admitted?
c. What is the class interval?
d. What is this chart called?

17. The following frequency distribution reports the number of frequent flier miles, reported in thousands, for employees of Brumley Statistical Consulting Inc. during the most recent quarter.

Frequent Flier Miles (000)   Number of Employees
 0 up to  3                           5
 3 up to  6                          12
 6 up to  9                          23
 9 up to 12                           8
12 up to 15                           2
Total                                50

a. How many employees were studied?
b. What is the midpoint of the first class?
c. Construct a histogram.
d. A frequency polygon is to be drawn. What are the coordinates of the plot for the first class?
e. Construct a frequency polygon.
f. Interpret the frequent flier miles accumulated using the two charts.

18. A large Internet retailer is studying the lead time (elapsed time between when an order is placed and when it is filled) for a sample of recent orders. The lead times are reported in days.

Lead Time (days)   Frequency
 0 up to  5             6
 5 up to 10             7
10 up to 15            12
15 up to 20             8
20 up to 25             7
Total                  40

a. How many orders were studied?
b. What is the midpoint of the first class?
c. What are the coordinates of the first class for a frequency polygon?
d. Draw a histogram.
e. Draw a frequency polygon.
f. Interpret the lead times using the two charts.

Cumulative Distributions

Consider once again the distribution of the profits on vehicles sold by the Applewood Auto Group. Suppose we were interested in the number of vehicles that sold for a profit of less than $1,400. These values can be approximated by developing a cumulative frequency distribution and portraying it graphically in a cumulative frequency polygon. Or, suppose we were interested in the profit earned on the lowest-selling 40% of the vehicles. These values can be approximated by developing a cumulative relative frequency distribution and portraying it graphically in a cumulative relative frequency polygon.

EXAMPLE

The frequency distribution of the profits earned at Applewood Auto Group is repeated from Table 2–5.

Profit                    Frequency
$  200 up to $  600            8
   600 up to  1,000           11
 1,000 up to  1,400           23
 1,400 up to  1,800           38
 1,800 up to  2,200           45
 2,200 up to  2,600           32
 2,600 up to  3,000           19
 3,000 up to  3,400            4
Total                        180

Construct a cumulative frequency polygon to answer the following question: sixty of the vehicles earned a profit of less than what amount? Construct a cumulative relative frequency polygon to answer this question: seventy-five percent of the vehicles sold earned a profit of less than what amount?

SOLUTION

As the names imply, a cumulative frequency distribution and a cumulative frequency polygon require cumulative frequencies. To construct a cumulative frequency distribution, refer to the preceding table and note that there were eight vehicles in which the profit earned was less than $600. Those 8 vehicles, plus the 11 in the next higher class, for a total of 19, earned a profit of less than $1,000. The cumulative frequency for the next higher class is 42, found by 8 + 11 + 23. This process is continued for all the classes. All the vehicles earned a profit of less than $3,400. (See Table 2–8.)

TABLE 2–8 Cumulative Frequency Distribution for Profit on Vehicles Sold Last Month at Applewood Auto Group

Profit               Cumulative Frequency   Found by
Less than $  600              8             8
Less than  1,000             19             8 + 11
Less than  1,400             42             8 + 11 + 23
Less than  1,800             80             8 + 11 + 23 + 38
Less than  2,200            125             8 + 11 + 23 + 38 + 45
Less than  2,600            157             8 + 11 + 23 + 38 + 45 + 32
Less than  3,000            176             8 + 11 + 23 + 38 + 45 + 32 + 19
Less than  3,400            180             8 + 11 + 23 + 38 + 45 + 32 + 19 + 4

TABLE 2–9 Cumulative Relative Frequency Distribution for Profit on Vehicles Sold Last Month at Applewood Auto Group

Profit               Cumulative Frequency   Cumulative Relative Frequency
Less than $  600              8               8/180 = 0.044 =  4.4%
Less than $1,000             19              19/180 = 0.106 = 10.6%
Less than $1,400             42              42/180 = 0.233 = 23.3%
Less than $1,800             80              80/180 = 0.444 = 44.4%
Less than $2,200            125             125/180 = 0.694 = 69.4%
Less than $2,600            157             157/180 = 0.872 = 87.2%
Less than $3,000            176             176/180 = 0.978 = 97.8%
Less than $3,400            180             180/180 = 1.000 = 100%

To construct a cumulative relative frequency distribution, we divide the cumulative frequencies by the total number of observations, 180. As shown in Table 2–9, the cumulative relative frequency of the fourth class is 80/180 = 0.444, or about 44%. This means that about 44% of the vehicles sold for a profit of less than $1,800.
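The running totals in Table 2–8 and the ratios in Table 2–9 can be reproduced with a brief Python sketch. It is a minimal illustration using the standard-library function `itertools.accumulate`; the variable names are ours, and the upper class limits and frequencies are copied from the tables.

```python
from itertools import accumulate

# Upper class limits and class frequencies from Table 2-5.
upper_limits = [600, 1000, 1400, 1800, 2200, 2600, 3000, 3400]
frequencies = [8, 11, 23, 38, 45, 32, 19, 4]

# Cumulative frequency: running total of the class frequencies.
cumulative = list(accumulate(frequencies))

# Cumulative relative frequency: cumulative frequency / total observations.
total = cumulative[-1]
cumulative_relative = [c / total for c in cumulative]

for limit, c, r in zip(upper_limits, cumulative, cumulative_relative):
    print(f"Less than ${limit:>5,}  {c:>3}  {r:.3f} = {r:.1%}")
```

The last cumulative frequency always equals the total number of observations, so the last cumulative relative frequency is 1.000, or 100%.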

To plot a cumulative frequency distribution, scale the upper limit of each class along the X-axis and the corresponding cumulative frequencies along the Y-axis. To provide additional information, you can label the vertical axis on the right in terms of cumulative relative frequencies. In the Applewood Auto Group, the vertical axis on the left is labeled from 0 to 180 and on the right from 0 to 100%. Note, as an example, that 50% on the right axis should be opposite 90 vehicles on the left axis.

To begin, the first plot is at x = 200 and y = 0. None of the vehicles sold for a profit of less than $200. The profit on 8 vehicles was less than $600, so the next plot is at x = 600 and y = 8. Continuing, the next plot is x = 1,000 and y = 19. There were 19 vehicles that sold for a profit of less than $1,000. The rest of the points are plotted and then the dots connected to form Chart 2–7.

We should point out that the shape of the distribution is the same if we use cumulative relative frequencies instead of the cumulative frequencies. The only difference is that the vertical axis is scaled in percentages. In the following charts, a percentage scale is added to the right side of the graphs to help answer questions about cumulative relative frequencies.

CHART 2–7 Cumulative Frequency Polygon for Profit on Vehicles Sold Last Month at Applewood Auto Group (horizontal axis: Profit $, 200 to 3,400; left vertical axis: Number of Vehicles Sold, 0 to 180; right vertical axis: Percent of Vehicles Sold, 0 to 100)

Using Chart 2–7 to find the amount of profit on 75% of the cars sold, draw a horizontal line from the 75% mark on the right-hand vertical axis over to the polygon, then drop down to the X-axis and read the amount of profit. The value on the X-axis is about $2,300, so we estimate that 75% of the vehicles sold earned a profit of $2,300 or less for the Applewood group.

To find the highest profit earned on 60 of the 180 vehicles, we use Chart 2–7 to locate the value of 60 on the left-hand vertical axis. Next, we draw a horizontal line from the value of 60 to the polygon and then drop down to the X-axis and read the profit. It is about $1,600, so we estimate that 60 of the vehicles sold for a profit of less than $1,600. We can also make estimates of the percentage of vehicles that sold for less than a particular amount. To explain, suppose we want to estimate the percentage of vehicles that sold for a profit of less than $2,000. We begin by locating the value of $2,000 on the X-axis, move vertically to the polygon, and then horizontally to the vertical axis on the right. The value is about 56%, so we conclude 56% of the vehicles sold for a profit of less than $2,000.
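Reading a value off the polygon amounts to linear interpolation along the line segment between two plotted points. The following Python sketch is an illustration of that idea rather than a procedure given in the text; the function names `profit_at` and `count_below` are ours, and the points are the upper class limits and cumulative frequencies from Table 2–8, plus the starting point x = 200, y = 0.

```python
# Points of the cumulative frequency polygon: (profit, cumulative frequency).
points = [(200, 0), (600, 8), (1000, 19), (1400, 42), (1800, 80),
          (2200, 125), (2600, 157), (3000, 176), (3400, 180)]
TOTAL = points[-1][1]  # 180 vehicles


def profit_at(count):
    """Profit below which `count` vehicles fall (read across, then down)."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= count <= y1:
            return x0 + (count - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("count is outside the range of the polygon")


def count_below(profit):
    """Number of vehicles with a profit below `profit` (read up, then across)."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= profit <= x1:
            return y0 + (profit - x0) / (x1 - x0) * (y1 - y0)
    raise ValueError("profit is outside the range of the polygon")


print(profit_at(0.75 * TOTAL))        # profit for 75% of the vehicles
print(profit_at(60))                  # profit for 60 vehicles
print(count_below(2000) / TOTAL)      # share of vehicles below $2,000
```

Interpolation gives $2,325 for the 75% mark, about $1,589 for 60 vehicles, and about 56.9% of vehicles below $2,000. These are close to the values of about $2,300, $1,600, and 56% read from Chart 2–7; small differences are expected when reading a chart by eye.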

SELF-REVIEW 2–5

A sample of the hourly wages of 15 employees at Home Depot in Brunswick, Georgia, was organized into the following table.

Hourly Wages     Number of Employees
$ 8 up to $10             3
 10 up to  12             7
 12 up to  14             4
 14 up to  16             1

(a) What is the table called?
(b) Develop a cumulative frequency distribution and portray the distribution in a cumulative frequency polygon.
(c) On the basis of the cumulative frequency polygon, how many employees earn less than $11 per hour?

EXERCISES

19. The following cumulative frequency and the cumulative relative frequency polygon for the distribution of hourly wages of a sample of certified welders in the Atlanta, Georgia, area is shown in the graph.

(Chart: horizontal axis Hourly Wage, 0 to 30 in steps of 5; left vertical axis Frequency, 10 to 40; right vertical axis Percent, 25 to 100)

a. How many welders were studied?
b. What is the class interval?
c. About how many welders earn less than $10.00 per hour?
d. About 75% of the welders make less than what amount?
e. Ten of the welders studied made less than what amount?
f. What percent of the welders make less than $20.00 per hour?

20. The cumulative frequency and the cumulative relative frequency polygon for a distribution of selling prices ($000) of houses sold in the Billings, Montana, area is shown in the graph.

(Chart: horizontal axis Selling Price ($000), 0 to 350 in steps of 50; left vertical axis Frequency, 50 to 200; right vertical axis Percent, 25 to 100)

a. How many homes were studied?
b. What is the class interval?
c. One hundred homes sold for less than what amount?
d. About 75% of the homes sold for less than what amount?
e. Estimate the number of homes in the $150,000 up to $200,000 class.
f. About how many homes sold for less than $225,000?

21. The frequency distribution representing the number of frequent flier miles accumulated by employees at Brumley Statistical Consulting Inc. is repeated from Exercise 17.

Frequent Flier Miles (000)   Frequency
 0 up to  3                       5
 3 up to  6                      12
 6 up to  9                      23
 9 up to 12                       8
12 up to 15                       2
Total                            50

a. How many employees accumulated less than 3,000 miles?
b. Convert the frequency distribution to a cumulative frequency distribution.
c. Portray the cumulative distribution in the form of a cumulative frequency polygon.
d. Based on the cumulative relative frequencies, about 75% of the employees accumulated how many miles or less?

22. The frequency distribution of order lead time of the retailer from Exercise 18 is repeated below.

Lead Time (days)   Frequency
 0 up to  5             6
 5 up to 10             7
10 up to 15            12
15 up to 20             8
20 up to 25             7
Total                  40

a. How many orders were filled in less than 10 days? In less than 15 days?
b. Convert the frequency distribution to cumulative frequency and cumulative relative frequency distributions.
c. Develop a cumulative frequency polygon.
d. About 60% of the orders were filled in less than how many days?

CHAPTER SUMMARY

I. A frequency table is a grouping of qualitative data into mutually exclusive and collectively exhaustive classes showing the number of observations in each class.
II. A relative frequency table shows the fraction of the number of frequencies in each class.
III. A bar chart is a graphic representation of a frequency table.
IV. A pie chart shows the proportion each distinct class represents of the total number of observations.
V. A frequency distribution is a grouping of data into mutually exclusive and collectively exhaustive classes showing the number of observations in each class.
   A. The steps in constructing a frequency distribution are
      1. Decide on the number of classes.
      2. Determine the class interval.
      3. Set the individual class limits.
      4. Tally the raw data into classes and determine the frequency in each class.
   B. The class frequency is the number of observations in each class.
   C. The class interval is the difference between the limits of two consecutive classes.
   D. The class midpoint is halfway between the limits of consecutive classes.
VI. A relative frequency distribution shows the percent of observations in each class.
VII. There are several methods for graphically portraying a frequency distribution.
   A. A histogram portrays the frequencies in the form of a rectangle or bar for each class. The height of the rectangles is proportional to the class frequencies.
   B. A frequency polygon consists of line segments connecting the points formed by the intersection of the class midpoint and the class frequency.
   C. A graph of a cumulative frequency distribution shows the number of observations less than a given value.
   D. A graph of a cumulative relative frequency distribution shows the percent of observations less than a given value.
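The four steps in item V.A can be sketched in Python using the home run distances from Self-Review 2–3. This is an illustration under stated assumptions, not the book's software procedure. It reads the 2^k rule as "the smallest k such that 2^k is greater than n", and it takes the class interval of 30 and the lower limit of 300 directly from Self-Review 2–3, since rounding the interval up to a convenient value is a judgment call rather than a computation. The function names are ours.

```python
# Sorted distances of the 73 home runs from Self-Review 2-3.
distances = [
    320, 320, 347, 350, 360, 360, 360, 361, 365, 370, 370,
    375, 375, 375, 375, 380, 380, 380, 380, 380, 380,
    390, 390, 391, 394, 396, 400, 400, 400, 400, 405,
    410, 410, 410, 410, 410, 410, 410, 410, 410, 410, 410,
    411, 415, 415, 416, 417, 417,
    420, 420, 420, 420, 420, 420, 420, 420, 429,
    430, 430, 430, 430, 430, 435, 435, 436,
    440, 440, 440, 440, 440, 450, 480, 488,
]


def number_of_classes(n):
    """Step 1: smallest k such that 2**k is greater than n."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k


def tally(data, lower_limit, interval, k):
    """Step 4: count the observations in each 'lower up to upper' class."""
    freqs = [0] * k
    for x in data:
        freqs[(x - lower_limit) // interval] += 1
    return freqs


n = len(distances)
k = number_of_classes(n)

# Step 2: the class interval should be at least (maximum - minimum) / k.
min_interval = (max(distances) - min(distances)) / k

# Step 3: interval and first lower limit as given in Self-Review 2-3.
interval, lower_limit = 30, 300

freqs = tally(distances, lower_limit, interval, k)
for i, f in enumerate(freqs):
    lower = lower_limit + i * interval
    print(f"{lower} up to {lower + interval}: {f}")
```

With 73 observations the rule gives seven classes, and the minimum interval of 24 is rounded up to 30. Because each class runs from its lower limit "up to" the next one, a distance of exactly 360 is counted in the 360 up to 390 class, which the integer division in `tally` handles.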

CHAPTER EXERCISES

23. Describe the similarities and differences of qualitative and quantitative variables. Be sure to include the following:
a. What level of measurement is required for each variable type?
b. Can both types be used to describe both samples and populations?

24. Describe the similarities and differences between a frequency table and a frequency distribution. Be sure to include which requires qualitative data and which requires quantitative data.

25. Alexandra Damonte will be building a new resort in Myrtle Beach, South Carolina. She must decide how to design the resort based on the type of activities that the resort will offer to its customers. A recent poll of 300 potential customers showed the following results about customers’ preferences for planned resort activities:

Like planned activities           63
Do not like planned activities   135
Not sure                          78
No answer                         24

a. What is the table called?
b. Draw a bar chart to portray the survey results.
c. Draw a pie chart for the survey results.
d. If you are preparing to present the results to Ms. Damonte as part of a report, which graph would you prefer to show? Why?

26. Speedy Swift is a package delivery service that serves the greater Atlanta, Georgia, metropolitan area. To maintain customer loyalty, one of Speedy Swift’s performance objectives is on-time delivery. To monitor its performance, each delivery is measured on the following scale: early (package delivered before the promised time), on-time (package delivered within 5 minutes of the promised time), late (package delivered more than 5 minutes past the promised time), or lost (package never delivered). Speedy Swift’s objective is to deliver 99% of all packages either early or on-time. Speedy collected the following data for last month’s performance:

On-time On-time Early Late On-time On-time On-time On-time Late On-time Early On-time On-time Early On-time On-time On-time On-time On-time On-time Early On-time Early On-time On-time On-time Early On-time On-time On-time Early On-time On-time Late Early Early On-time On-time On-time Early On-time Late Late On-time On-time On-time On-time On-time On-time On-time On-time Late Early On-time Early On-time Lost On-time On-time On-time Early Early On-time On-time Late Early Lost On-time On-time On-time On-time On-time Early On-time Early On-time Early On-time Late On-time On-time Early On-time On-time On-time Late On-time Early On-time On-time On-time On-time On-time On-time On-time Early Early On-time On-time On-time

a. What kind of variable is delivery performance? What scale is used to measure delivery performance?
b. Construct a frequency table for delivery performance for last month.
c. Construct a relative frequency table for delivery performance last month.
d. Construct a bar chart of the frequency table for delivery performance for last month.
e. Construct a pie chart of on-time delivery performance for last month.
f. Write a memo reporting the results of the analyses. Include your tables and graphs with written descriptions of what they show. Conclude with a general statement of last month’s delivery performance as it relates to Speedy Swift’s performance objectives.

27. A data set consists of 83 observations. How many classes would you recommend for a frequency distribution?

28. A data set consists of 145 observations that range from 56 to 490. What size class interval would you recommend?
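One way to approach Exercises 27 and 28 is the common "2 to the k" guideline: choose the smallest number of classes k for which 2^k exceeds the number of observations, then take a class interval of at least (maximum − minimum)/k. The sketch below assumes that guideline applies here; the function names are illustrative and not part of the text.

```python
def suggest_class_count(n):
    """Smallest k such that 2**k is greater than n (the "2 to the k" guideline)."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k


def min_class_interval(low, high, k):
    """Lower bound on the class interval: the range divided by the number of classes."""
    return (high - low) / k


# Exercise 27: 83 observations
print(suggest_class_count(83))           # 7, because 2**6 = 64 and 2**7 = 128

# Exercise 28: 145 observations ranging from 56 to 490
k = suggest_class_count(145)             # 8, because 2**7 = 128 and 2**8 = 256
print(k, min_class_interval(56, 490, k)) # 8 54.25
```

The computed interval is only a minimum. Rounding it up to a convenient value, such as 55 or 60, is a judgment call.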

29. The following is the number of minutes to commute from home to work for a group of 25 automobile executives.

28 25 48 37 41 19 32 26 16 23 23 29 36 31 26 21 32 25 31 43 35 42 38 33 28

a. How many classes would you recommend? b. What class interval would you suggest? c. What would you recommend as the lower limit of the first class? d. Organize the data into a frequency distribution. e. Comment on the shape of the frequency distribution.
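Part (d) of Exercise 29 can be checked with a short tally. The sketch below assumes one reasonable set of choices, not the only one: 5 classes, a class interval of 7, and a lower limit of 15 for the first class. Each class runs from its lower limit "up to" (but not including) its upper limit. The function name is hypothetical.

```python
def frequency_distribution(data, lower_limit, interval, num_classes):
    """Count observations in classes of the form 'lower up to upper' (upper limit excluded)."""
    counts = [0] * num_classes
    for x in data:
        index = int((x - lower_limit) // interval)
        if not 0 <= index < num_classes:
            raise ValueError(f"{x} falls outside the classes")
        counts[index] += 1
    classes = [
        (lower_limit + i * interval, lower_limit + (i + 1) * interval)
        for i in range(num_classes)
    ]
    return list(zip(classes, counts))


# Commute times (minutes) from Exercise 29
commute_minutes = [
    28, 25, 48, 37, 41, 19, 32, 26, 16, 23, 23, 29, 36,
    31, 26, 21, 32, 25, 31, 43, 35, 42, 38, 33, 28,
]

# Assumed choices: 5 classes, interval of 7, first lower limit of 15
table = frequency_distribution(commute_minutes, lower_limit=15, interval=7, num_classes=5)
for (low, high), count in table:
    print(f"{low} up to {high}: {count}")
```

With these choices the class frequencies are 3, 8, 7, 5, and 2, which total 25.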

30. The following data give the weekly amounts spent on groceries for a sample of 45 households.

$271 $363 $159 $ 76 $227 $337 $295 $319 $250 279 205 279 266 199 177 162 232 303 192 181 321 309 246 278 50 41 335 116 100 151 240 474 297 170 188 320 429 294 570 342 279 235 434 123 325

a. How many classes would you recommend? b. What class interval would you suggest? c. What would you recommend as the lower limit of the first class? d. Organize the data into a frequency distribution.

31. A social scientist is studying the use of iPods by college students. A sample of 45 students revealed they played the following number of songs yesterday.

4 6 8 7 9 6 3 7 7 6 7 1 4 7 7 4 6 4 10 2 4 6 3 4 6 8 4 3 3 6 8 8 4 6 4 6 5 5 9 6 8 8 6 5 10

Organize the information into a frequency distribution. a. How many classes would you suggest? b. What is the most suitable class interval? c. What is the lower limit of the initial class? d. Create the frequency distribution. e. Describe the shape of the distribution.

32. David Wise handles his own investment portfolio, and has done so for many years. Listed below is the holding time (recorded to the nearest whole year) between purchase and sale for his collection of 36 stocks.

8 8 6 11 11 9 8 5 11 4 8 5 14 7 12 8 6 11 9 7 9 15 8 8 12 5 9 8 5 9 10 11 3 9 8 6

a. How many classes would you propose? b. What class interval would you suggest? c. What quantity would you use for the lower limit of the initial class? d. Using your responses to parts (a), (b), and (c), create a frequency distribution. e. Describe the shape of the frequency distribution.

33. You are exploring the music in your iTunes library. The total play counts over the past year for the 27 songs on your “smart playlist” are shown below. Make a frequency distribution of the counts and describe its shape. It is often claimed that a small fraction of a person’s songs will account for most of their total plays. Does this seem to be the case here?

128 56 54 91 190 23 160 298 445 50 578 494 37 677 18 74 70 868 108 71 466 23 84 38 26 814 17

34. The monthly issues of the Journal of Finance are available on the Internet. The table below shows the number of times an issue was downloaded over the last 33 months. Suppose that you wish to summarize the number of downloads with a frequency distribution.

312 2,753 2,595 6,057 7,624 6,624 6,362 6,575 7,760 7,085 7,272 5,967 5,256 6,160 6,238 6,709 7,193 5,631 6,490 6,682 7,829 7,091 6,871 6,230 7,253 5,507 5,676 6,974 6,915 4,999 5,689 6,143 7,086

a. How many classes would you propose? b. What class interval would you suggest? c. What quantity would you use for the lower limit of the initial class? d. Using your responses to parts (a), (b), and (c), create a frequency distribution. e. Describe the shape of the frequency distribution.

35. The following histogram shows the scores on the first exam for a statistics class.

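[Histogram: horizontal axis Score, with class limits 50, 60, 70, 80, 90, 100; vertical axis Frequency, 0 to 25; class frequencies from left to right: 3, 14, 21, 12, 6.]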

a. How many students took the exam? b. What is the class interval? c. What is the class midpoint for the first class? d. How many students earned a score of less than 70?

36. The following chart summarizes the selling price of homes sold last month in the Sarasota, Florida, area.

[Chart: horizontal axis Selling Price ($000), 0 to 350 in steps of 50; vertical axes Frequency (ticks at 50, 100, 150, 200, 250) and Percent (ticks at 25, 50, 75, 100).]

a. What is the chart called? b. How many homes were sold during the last month? c. What is the class interval? d. About 75% of the houses sold for less than what amount? e. One hundred seventy-five of the homes sold for less than what amount?


37. A chain of sport shops catering to beginning skiers, headquartered in Aspen, Colorado, plans to conduct a study of how much a beginning skier spends on his or her initial purchase of equipment and supplies. Based on these figures, it wants to explore the possibility of offering combinations, such as a pair of boots and a pair of skis, to induce customers to buy more. A sample of 44 cash register receipts revealed these initial purchases:

$140 $ 82 $265 $168 $ 90 $114 $172 $230 $142 86 125 235 212 171 149 156 162 118 139 149 132 105 162 126 216 195 127 161 135 172 220 229 129 87 128 126 175 127 149 126 121 118 172 126

a. Arrive at a suggested class interval. b. Organize the data into a frequency distribution using a lower limit of $70. c. Interpret your findings.

38. The numbers of outstanding shares for 24 publicly traded companies are listed in the following table.

Company    Number of Outstanding Shares (millions)
Southwest Airlines    738
FirstEnergy    418
Harley Davidson    226
Entergy    178
Chevron    1,957
Pacific Gas and Electric    430
DuPont    932
Westinghouse    22
Eversource    314
Facebook    1,067
Google, Inc.    64
Apple    941
Costco    436
Home Depot    1,495
DTE Energy    172
Dow Chemical    1,199
Eastman Kodak    272
American Electric Power    485
ITT Corporation    93
Ameren    243
Virginia Electric and Power    575
Public Service Electric & Gas    506
Consumers Energy    265
Starbucks    744

a. Using the number of outstanding shares, summarize the companies with a frequency distribution. b. Display the frequency distribution with a frequency polygon. c. Create a cumulative frequency distribution of the outstanding shares. d. Display the cumulative frequency distribution with a cumulative frequency polygon. e. Based on the cumulative relative frequency distribution, 75% of the companies have less than “what number” of outstanding shares? f. Write a brief analysis of this group of companies based on your statistical summaries of “number of outstanding shares.”

39. A recent survey showed that the typical American car owner spends $2,950 per year on operating expenses. Below is a breakdown of the various expenditure items. Draw an appropriate chart to portray the data and summarize your findings in a brief report.

Expenditure Item    Amount
Fuel    $603
Interest on car loan    279
Repairs    930
Insurance and license    646
Depreciation    492
Total    $2,950


40. Midland National Bank selected a sample of 40 student checking accounts. Below are their end-of-the-month balances.

$404 $ 74 $234 $149 $279 $215 $123 $ 55 $ 43 $321 87 234 68 489 57 185 141 758 72 863

703 125 350 440 37 252 27 521 302 127 968 712 503 489 327 608 358 425 303 203

a. Tally the data into a frequency distribution using $100 as a class interval and $0 as the starting point. b. Draw a cumulative frequency polygon. c. The bank considers any student with an ending balance of $400 or more a “preferred customer.” Estimate the percentage of preferred customers. d. The bank is also considering a service charge to the lowest 10% of the ending balances. What would you recommend as the cutoff point between those who have to pay a service charge and those who do not?

41. Residents of the state of South Carolina earned a total of $69.5 billion in adjusted gross income. Seventy-three percent of the total was in wages and salaries; 11% in dividends, interest, and capital gains; 8% in IRAs and taxable pensions; 3% in business income pensions; 2% in Social Security; and the remaining 3% from other sources. Develop a pie chart depicting the breakdown of adjusted gross income. Write a paragraph summarizing the information.

42. A recent study of home technologies reported the number of hours of personal computer usage per week for a sample of 60 persons. Excluded from the study were people who worked out of their home and used the computer as a part of their work.

9.3 5.3 6.3 8.8 6.5 0.6 5.2 6.6 9.3 4.3 6.3 2.1 2.7 0.4 3.7 3.3 1.1 2.7 6.7 6.5 4.3 9.7 7.7 5.2 1.7 8.5 4.2 5.5 5.1 5.6 5.4 4.8 2.1 10.1 1.3 5.6 2.4 2.4 4.7 1.7 2.0 6.7 1.1 6.7 2.2 2.6 9.8 6.4 4.9 5.2 4.5 9.3 7.9 4.6 4.3 4.5 9.2 8.5 6.0 8.1

a. Organize the data into a frequency distribution. How many classes would you suggest? What value would you suggest for a class interval? b. Draw a histogram. Describe your result.

43. Merrill Lynch recently completed a study regarding the size of online investment portfolios (stocks, bonds, mutual funds, and certificates of deposit) for a sample of clients in the 40 up to 50 years old age group. Listed following is the value of all the investments in thousands of dollars for the 70 participants in the study.

$669.9 $ 7.5 $ 77.2 $ 7.5 $125.7 $516.9 $ 219.9 $645.2 301.9 235.4 716.4 145.3 26.6 187.2 315.5 89.2 136.4 616.9 440.6 408.2 34.4 296.1 185.4 526.3 380.7 3.3 363.2 51.9 52.2 107.5 82.9 63.0 228.6 308.7 126.7 430.3 82.0 227.0 321.1 403.4 39.5 124.3 118.1 23.9 352.8 156.7 276.3 23.5 31.3 301.2 35.7 154.9 174.3 100.6 236.7 171.9 221.1 43.4 212.3 243.3 315.4 5.9 1,002.2 171.7 295.7 437.0 87.8 302.1  268.1  899.5

a. Organize the data into a frequency distribution. How many classes would you suggest? What value would you suggest for a class interval?

b. Draw a histogram. Financial experts suggest that this age group of people have at least five times their salary saved. As a benchmark, assume an investment portfolio of $500,000 would support retirement in 10–15 years. In writing, summarize your results.


44. A total of 5.9% of the prime-time viewing audience watched shows on ABC, 7.6% watched shows on CBS, 5.5% on Fox, 6.0% on NBC, 2.0% on Warner Brothers, and 2.2% on UPN. A total of 70.8% of the audience watched shows on other cable networks, such as CNN and ESPN. You can find the latest information on TV viewing from the following website: http://www.nielsen.com/us/en/top10s.html/. Develop a pie chart or a bar chart to depict this information. Write a paragraph summarizing your findings.

45. Refer to the following chart:

[Chart: Contact for Job Placement at Wake Forest University. Networking and Connections 70%; Job Posting Websites 20%; On-Campus Recruiting 10%.]

a. What is the name given to this type of chart? b. Suppose that 1,000 graduates will start a new job shortly after graduation. Estimate the number of graduates whose first contact for employment occurred through networking and other connections. c. Would it be reasonable to conclude that about 90% of job placements were made through networking, connections, and job posting websites? Cite evidence.

46. The following chart depicts the annual revenues, by type of tax, for the state of Georgia. 

[Chart: Annual Revenue, State of Georgia. Sales 44.54%; Income 43.34%; Corporate 8.31%; License 2.9%; Other 0.9%.]

a. What percentage of the state revenue is accounted for by sales tax and individual income tax? b. Which category will generate more revenue: corporate taxes or license fees? c. The total annual revenue for the state of Georgia is $6.3 billion. Estimate the amount of revenue in billions of dollars for sales taxes and for individual taxes.


47. In 2014, the United States exported a total of $376 billion worth of products to Canada. The five largest categories were:

Product    Amount
Vehicles    $63.3
Machinery    59.7
Electrical machinery    36.6
Mineral fuel and oil    24.8
Plastic    17.0

a. Use a software package to develop a bar chart. b. What percentage of the United States’ total exports to Canada is represented by the two categories “Machinery” and “Electrical Machinery”? c. What percentage of the top five exported products do “Machinery” and “Electrical Machinery” represent?

48. In the United States, the industrial revolution of the early 20th century changed farming by making it more efficient. For example, in 1910 U.S. farms used 24.2 million horses and mules and only about 1,000 tractors. By 1960, 4.6 million tractors were used and only 3.2 million horses and mules. An outcome of making farming more efficient is the reduction of the number of farms from over 6 million in 1920 to about 2.2 million farms today. Listed below is the number of farms, in thousands, for each of the 50 states. Summarize the data and write a paragraph that describes your findings.

50 12 5 28 59 19 35 22 80 5 8 48 3 75 25 77 46 68 10 69 77 25 13 20 35 6 52 61 36 38 88 1 75 246 59 50 44 98 74 2 32 42 7 31 28 9 8 44 25 37

49. One of the most popular candies in the United States is M&M’s produced by the Mars Company. In the beginning M&M’s were all brown. Now they are produced in red, green, blue, orange, brown, and yellow. Recently, the purchase of a 14-ounce bag of M&M’s Plain had 444 candies with the following breakdown by color: 130 brown, 98 yellow, 96 red, 35 orange, 52 blue, and 33 green. Develop a chart depicting this information and write a paragraph summarizing the results.
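Before charting the color breakdown in Exercise 49, it can help to convert the counts into relative frequencies. The sketch below is one way to do that using the counts given in the exercise; the function and variable names are illustrative.

```python
# Color counts for the 14-ounce bag described in Exercise 49
color_counts = {
    "brown": 130, "yellow": 98, "red": 96,
    "orange": 35, "blue": 52, "green": 33,
}


def relative_frequencies(counts):
    """Divide each class frequency by the total number of observations."""
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


rel = relative_frequencies(color_counts)

# Print the frequency table, largest share first
for color, share in sorted(rel.items(), key=lambda item: item[1], reverse=True):
    print(f"{color:<7}{color_counts[color]:>5}{share:>8.1%}")
```

The counts sum to the 444 candies stated in the exercise, so the relative frequencies sum to 1 and can be used directly as bar heights or pie slices.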

50. The number of families who used the Minneapolis YWCA day care service was recorded during a 30-day period. The results are as follows:

31 49 19 62 24 45 23 51 55 60 40 35 54 26 57 37 43 65 18 41 50 56 4 54 39 52 35 51 63 42

a. Construct a cumulative frequency distribution. b. Sketch a graph of the cumulative frequency polygon. c. How many days saw fewer than 30 families utilize the day care center? d. Based on cumulative relative frequencies, how busy were the highest 80% of the days?
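Parts (a) and (c) of Exercise 50 can be sketched as a running total of class frequencies. The example below assumes a class interval of 10 starting at 0, which is one convenient choice and not the only valid one. The variable names are illustrative.

```python
from itertools import accumulate

# Families using the day care service on each of the 30 days (Exercise 50)
families_per_day = [
    31, 49, 19, 62, 24, 45, 23, 51, 55, 60, 40, 35, 54, 26, 57,
    37, 43, 65, 18, 41, 50, 56, 4, 54, 39, 52, 35, 51, 63, 42,
]

interval = 10      # assumed class interval
num_classes = 7    # classes 0 up to 10, 10 up to 20, ..., 60 up to 70

# Class frequencies: each value falls in class index value // interval
counts = [0] * num_classes
for value in families_per_day:
    counts[value // interval] += 1

# Cumulative frequencies: running total of the class frequencies
cumulative = list(accumulate(counts))

n = len(families_per_day)
for i, cum in enumerate(cumulative):
    upper = (i + 1) * interval
    print(f"less than {upper}: {cum} days ({cum / n:.0%})")
```

Under these assumptions the cumulative frequency for "less than 30" is 6, so fewer than 30 families used the center on 6 of the 30 days.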

DATA ANALYTICS

51. Refer to the North Valley Real Estate data that reports information on homes sold during the last year. For the variable price, select an appropriate class interval and organize the selling prices into a frequency distribution. Write a brief report summarizing your findings. Be sure to answer the following questions in your report.

a. Around what values of price do the data tend to cluster? b. Based on the frequency distribution, what is the typical selling price in the first class? What is the typical selling price in the last class? c. Draw a cumulative relative frequency distribution. Using this distribution, fifty percent of the homes sold for what price or less? Estimate the lower price of the top ten percent of homes sold. About what percent of the homes sold for less than $300,000?

d. Refer to the variable bedrooms. Draw a bar chart showing the number of homes sold with 2, 3, 4 or more bedrooms. Write a description of the distribution.

52. Refer to the Baseball 2016 data that report information on the 30 Major League Baseball teams for the 2016 season. Create a frequency distribution for the Team Salary variable and answer the following questions.

a. What is the typical salary for a team? What is the range of the salaries? b. Comment on the shape of the distribution. Does it appear that any of the teams have a salary that is out of line with the others? c. Draw a cumulative relative frequency distribution of team salary. Using this distribution, forty percent of the teams have a salary of less than what amount? About how many teams have a total salary of more than $220 million?

53. Refer to the Lincolnville School District bus data. Select the variable referring to the number of miles traveled since the last maintenance, and then organize these data into a frequency distribution.

a. What is a typical amount of miles traveled? What is the range? b. Comment on the shape of the distribution. Are there any outliers in terms of miles driven? c. Draw a cumulative relative frequency distribution. Forty percent of the buses were driven fewer than how many miles? How many buses were driven less than 10,500 miles?

d. Refer to the variables regarding the bus manufacturer and the bus capacity. Draw a pie chart of each variable and write a description of your results.