BADM Graphs

profileBabyTazz21
ExcelManualChartsandGraphs.docx

2-2 Charts and Graphs

Chapter 2 2-21

Charts and Graphs

Graphic displays are essential to understand and interpret complex sets of data in order to be able to make business decisions easier. A first step in exploring and analyzing data is to reduce data to a graphic picture that is clear, concise and consistent with the message of the original data. In this chapter, guidelines are provided for selecting appropriate graphical representations for data sets. Charts and graphs discussed in Chapter 2 include the histogram, frequency polygon, ogive, and stem-and-leaf plot as well as the scatter plot for two variable numerical data. In addition, displays for quantitative data such as bar charts, pie charts, and Pareto charts are created and described.

Data Analysis in Excel

Many of the statistical techniques presented in this text can be performed in Excel using a tool called Data Analysis. To access this feature, select the Data tab along the top of an Excel worksheet. If the Data Analysis feature has been uploaded into your Excel package, it will be found in the Analysis section at the top right of the Data tab page. If Data Analysis does not appear in the Analysis section, it must be added.

To add in Data Analysis:

1. Click on the File tab (leftmost tab at top of page) and Options located in the drop down menu.

2. In the Excel options dialog box, click on Add-Ins next to the bottom of the left menu.

3. Click on Analysis ToolPak near the top of the dialog box and then click on Go… at the bottom.

4. In the dialog box Add Ins, check the box to the left of Analysis ToolPak and click OK. Your Data Analysis feature is now uploaded onto your computer, and you won't have to add it again.

5. Now you can bring up the Analysis ToolPak feature at any time by going to the Data in the ribbon at the top of the Excel worksheet and clicking on Data Analysis. You may also be interested in adding the Solver Add-in for particular decision making analysis.

2.1 Frequency Distributions

The process of exploratory data analysis usually starts with graphic displays accompanied by numerical analysis. It is helpful to examine each variable individually and then study the relationship between variables. Counts and percentages of variables in a category are useful for comparison. Raw data that has not been summarized in any way can be referred to as ungrouped data. Data that has been organized into a frequency distribution are called grouped data. A frequency distribution is a summary of data presented in the form of class intervals and frequencies. Frequency distributions are relatively easy to construct. The textbook describes the manual method of construction by determining the midpoint and intervals of the data. Excel can be helpful in constructing frequency distributions once you understand the method of construction.

Demonstration Problem 2.1

The following data are the average monthly 30-year fixed mortgage interest rates for a 40-month period. Construct a frequency distribution for these data. Calculate and display the class midpoints, relative frequencies, and cumulative frequencies for this frequency distribution.

5.06

4.89

4.75

4.11

3.81

4.95

4.74

4.95

4.07

3.69

4.88

4.56

4.83

3.99

3.55

4.88

4.43

4.84

3.95

3.60

5.05

4.35

4.64

3.92

3.50

4.99

4.24

4.51

3.89

3.38

4.97

4.30

4.54

3.95

3.35

5.10

4.71

4.27

3.91

3.34

1. Input the data into a column in Excel or open the Excel file Demo 2-1 from the student website.

2. Determine how many classes this frequency distribution should contain: First find the Range: the range is defined as the difference between the largest and smallest numbers. It would be helpful to sort the data to make it easier to locate those values. You can sort the data by clicking on the first cell of data in a column and selecting Data Sort from the ribbon. The data cells in that column will automatically be selected and the following dialog box will appear. Click OK and the data will be sorted smallest to largest.

3. You can use a calculator or input a simple formula in Excel to subtract the smallest value from the largest value to determine the range. To use Excel, click on a cell to the right of the data set, type =, click on the cell with the largest value (5.10), type a minus sign (-) and then click on the cell with the smallest value (3.34) and Enter. The result is 1.76. If you click back on the cell where you input the formula, you should see a formula like in the formula bar, showing you the cells that you clicked on.

4. Find the class width: determine what size interval or how many classes or intervals that you want to use for this particular data set. In this case, if 8 classes are used, the class width is calculated by dividing the range (in this case, 1.76) by the desired number of classes − 8. The answer is 0.22 and give us an idea of what we could use for an interval. A rounded off number close to this value could be 0.20 or 0.25 and rounded values work better in a table or graphic. Let’s use 0.25 for this class width.

5. In Excel, data are input in an interval up to and including each interval value. The first class must include 3.34 to include the smallest value. In addition, the last endpoint must include 5.10 to include the largest value. We can start with the interval 3.25 to 3.50 (which includes the smallest value of 3.34) and end with 5.00 to 5.25 (which includes the largest value of 5.10). If we do not want to include 3.50 in an interval as the endpoint, we should say that the endpoint is just less than 3.50, say 3.49. The reason why this is important is to prevent 3.50 from being included in the previous interval when it was meant to be the starting point of the next interval. So, in this case, that would mean that the first interval would actually be 3.25 to 3.49, the second interval would be 3.50 to 3.74, etc. Input the bin endpoints in a separate column in your Excel worksheet. The output we generate in Excel could differ slightly from output generated in another statistical program that may be shown in your text that interprets the intervals differently. The resulting interval beginning and endpoints for Excel would be:

This essentially means that the end points are not quite the same number as the beginning of the next bin. This is important in Excel because you shouldn’t have the same number as both the beginning and endpoints of an interval and numbers are place in bins if equal to or less than the endpoint.

6. Determine how many data points go into each interval by using the Data Analysis capability in the Analysis Toolpak in Excel.

Note on Analysis Toolpak:

Check to see if the Analysis Toolpak has been installed by selecting the Data tab to see if there is a Data Analysis button on the right side of the ribbon. If you do not see it, select File Options Add-ins. At the bottom of the dialog box, you will see Manage Excel Add-ins. Click on the Go... button. Check the boxes for the Analysis Toolpak and the Solver Add-ins and OK. You should now see the Data Analysis and Solver options.

7. Select the Data tab and then Data Analysis on the right side of the ribbon. Select Histogram and OK. Select the data for the Input Range and the interval sizes for the Bin Range which should be the endpoints of the bins. Select an output cell that is to the right of the data. Check Cumulative Percentage and OK. The result should display the frequency or count of data points in each interval and then the cumulative percentage. Remember that this display could be slightly different than your text because of the way Excel interprets the bin intervals by using the rule – “up to and including.” It often does not work to select the first number as the first bin. Instead, use the next larger rounded number.

Histo frequency

The resulting frequency distribution in Excel is shown below:

9. The class midpoints can be calculated by using a simple formula in the column to the right of the interval (can insert a column if necessary by rt-clicking on the column heading and selecting Insert). See the formula as shown below. The midpoint formula represents the bin beginning number + half of the difference between the bin end and the bin beginning. This formula can be copied down to the remainder of the rows of data. This is done by clicking and dragging from the lower right corner of the cell with the formula. Another method used to copy a formula is to select the cells that will have the calculation, type the first formula, and then type Ctrl-Shift-Enter to see the results.

10. To add a Cumulative Frequency, insert a column to the right of the Frequency column (by rt-clicking on the column heading and selecting Insert). The first cell in that column is the first frequency number from the column to the left. A formula can be inserted into the second cell by starting with an = and then clicking on the first cumulative frequency number from the cell above, then inserting a + and then clicking on the second frequency number as shown below. This formula can be copied down to the remainder of the rows of data.

11. For a Relative Frequency calculation, select the cells where the relative frequency values will be input and type = and then click on the cell with the first frequency number (3), type a / for ÷ and then click on the cell with the total cumulative frequency (40). Select the function key F4 to make the total number absolute (you will see $ before both the row number and column letter). Type Ctrl-Shift-Enter to see the resulting numbers. You can change the format of the numbers by selecting the cells to change and clicking on the decrease decimal button in the ribbon until the numbers have 3 decimal places. The resulting data can be compared with the text.

The frequencies and relative frequencies of these data reveal the mortgage interest rate classes that are likely to occur during this period of time. Overall, the mortgage rates are distributed relatively evenly with the 4.75-under 4.00 class interval containing the greatest frequency (10) followed by the 3.75-under 4.00 class interval (7), and the 4.50-under 4.75 interval (6).

2.2 Quantitative Data Graphs

Converting data to graphics can be creative and artful. One of the most important uses of graphical depiction in statistics is to help the researcher determine the shape of a distribution. Data graphs can generally be classified as quantitative or qualitative. Quantitative data graphs are plotted along a numerical scale, and qualitative graphs are plotted using non-numerical categories. In this section, we will examine five types of quantitative data graphs: (1) histogram, (2) frequency polygon, (3) ogive, (4) dot plot, and (5) stem-and-leaf plot.

Histograms

One of the more widely used types of graphs for quantitative data is the histogram. The distribution of a single quantitative variable can be uniquely illustrated by a histogram, a customized bar graph with values grouped together by equal intervals. A histogram has intervals charted on the x axis that are of equal width. Data values are placed into appropriate bins or groupings much like sorting coins. The count of how many data points are in each bin is graphed on the y axis.

When constructing a histogram with Excel, you need to define your own bin sizes. Otherwise, if you let Excel do this step for you, you could end up with some odd looking bin sizes with lots of decimal places. The recommendation is to create the bin intervals the same way that is outlined in the text and in our first example on frequency distributions. Key elements: use the range of data divided by the number of intervals desired. This gives you an idea of the bin sizes and you can round the values from there.

Let's create a histogram in Excel by using the mortgage rate data set from Demo Problem 2.1.

Demonstration Problem 2.1

1. Input the data into a column in Excel or open the Excel file Demo 2-1 from the student website.

2. Create bin sizes if you didn't do this for the previous exercise (the range for this data set is 1.76. Divide this by the desired number of intervals which is 8. The result is 0.22. We round up to using 0.25 as a bin size). Our first bin in the previous exercise started with 3.25 and ended with 3.499, slightly less than 3.50. 3.50 was the beginning of the second bin. If we use 3.50 as the beginning of the first bin, it will include the lowest value of 3.34 but it will also include 3.50 because Excel includes all numbers up to and including the bin value. In order to make the histogram axis values look better, it is more desirable to use the rounded values, such as 3.50, than 3.49. Put the bin values in their own column.

In Excel, data values are placed into bins by checking if the value is less than or equal to the bin value. This may be different in other software programs and, therefore, the numbers in specific bins may be different as well.

3. Select the Data tab and then Data Analysis on the right side of the ribbon. If you don't see that option, you will need to install the Analysis Toolpak as described in the previous exercise (select File Options Add-ins. At the bottom of the dialog box, you will see Manage Excel Add-ins. Click on the Go... button. Check the boxes for the Analysis Toolpak and OK).

4. Select Data Analysis Histogram and click OK.

5. The Histogram dialog box requires several inputs. Click inside the Input Range box and select the data set without the label.

How to Select Data: Click on the first cell of the data set, hold down the left mouse button and drag the cursor down to the last cell and release the mouse button. If the dialog box is obstructing the data, there are two methods for accessing the data. You can move the dialog box by clicking and dragging the title bar, and moving the dialog box to another location. You can also click on the button to the right of the box for Input Range or other input boxes like this. The dialog box will shrink to the size of the input box. When you are done selecting the data range, click on the box to the right of the input box to return to the dialog box and complete the selections.

histo box

Click on button to select data on worksheet

Click on button to return to dialog box

6. Excel Select the values for the Bin Range in the same way, not including the label.

7. Click inside the Output Range box under the Output Options. Verify that the cursor is inside the box. Otherwise, a previously selected range could be changed. Select a cell to the right of the data set such as cell B1. This cell will be the upper left cell of the output placed on the worksheet.

8. Select Chart Output to create the histogram. Click OK.

9. The resulting histogram still requires several changes. First, eliminate the More category by selecting the histogram chart by clicking on it once, then drag the lower-right blue box (at the bottom of the Frequency column) up one line.

10. Delete the legend by clicking on it once and Delete using the keyboard.

11. Eliminate the gaps between the bars by double-clicking on any bar or rt-clicking and selecting Format Data Series. On the right side of the worksheet, a menu option is displayed for Format Data Series. Change the Gap width to 0% and Close using the ×.

12. You will notice that the bars are now together but there is no distinction between bars. We can fix this by selecting a dark color for a border. Either rt-click or double click to activate the Format Data Series menu again on the right. Select Series Option Fill & Line. Under Border, select Solid line and Black for the Color.

13. Resize the chart and change the title and labels by clicking inside the text boxes and typing the new text.

Remarks

The distribution of mortgage rates does not show a clear shape. The primary peak is at the 5.00% bin and decreases to the lower rates with another peak at 4.00%. The distribution is somewhat left-skewed. The spread is 3.34% to 5.10% (the spread is identified using exact values, not the bin designations). The shape of the distribution can change somewhat but using fewer or more bins.

For example if the first bin ends at 3.40 and the bin sizes are 0.40, there are fewer bins and the shape of the distribution is more distinct. The left skew is more pronounced.

Using the Kaddstat addin to create a histogram:

1. Using the data set input in Excel (or open the Excel file Demo 2-1 from the student website), select Addins Kiddstat. If the Addin isn’t there, use the procedures in Chapter 1 to install it.

2. Select Histograms and select the data range and the size of the class width as 0.4.

3. The output chart can be formatted as in the previous steps using Data Analysis in Excel.

Frequency Polygons

A frequency polygon, like the histogram, is a graphical display of class frequencies. In a frequency polygon, each class frequency is plotted as a dot at the class midpoint, and the dots are connected by a series of line segments. Connecting these midpoint dots completes the graph. The information obtained from frequency polygons and histograms is similar.

Let's create a frequency polygon in Excel by using the mortgage rate data set from Demo Problem 2.1.

1. Open the Excel file Demo_2-1 or open the Excel file Demo 2-1 from the student website if not opened already.

2. The required data to create the frequency polygon is the frequency counts and the class midpoints (the middle value in the interval). The mortgage data looks like the following from the first exercise:

3. Select the Midpoint and Frequency data together (if columns are separated, select the first column of data, hold the Ctrl key down and select the second column of data). Excel assumes that the first selected column is the x-axis series and the second selected column is the y-axis series. If this is not what you want you can switch the data manually in Excel or Format Data Series and switch the axes once the chart is created.

4. Then click on the Insert tab, then click on the down arrow to the right of Line (graph) and then More Line Charts....

Click on the down arrow to access the types of line graphs available

5. You will see the choices of line graphs available. Select the one that shows only one line with Midpoints on the x-axis and Frequency for the y-axis. Click OK.

The resulting graph will need to be resized and labels and title added.

6. The frequency polygon requires that the data points are marked by dots of some type. You can change the layout of the line by either double-clicking on it or by rt-clicking on the line and selecting Format Data Series. The Format Data Series menu appears on the right side of the worksheet.

7. Select Marker Marker Options. Select the type and size of marker you want. Close out the menu screen on the right side.

8. To add axis labels, click on the chart once, and click on the + sign displayed and select Axis Titles. Axis Title boxes appear and you can click inside the box to delete and add text.

9. Change the x-axis label to Midpoints and the y-axis label to Frequency. Click inside the title box and insert the title Frequency Polygon for Mortgage Rates.

10. Delete the chart lines by selecting the lines and clicking the delete key. Your resulting frequency polygon should look like the following:

Ogives

An ogive is a cumulative frequency polygon. The x-axis is labeled with the bin class endpoints and the y-axis with the frequencies. Ogives are most useful when the decision maker wants to see running totals.

Steep slopes in an ogive can be used to identify sharp increases in frequencies.

Let's create an ogive in Excel by using the mortgage rate data set from Demo Problem 2.1.

1. Open the Excel file you created with the data from Demo problem 2.1 or input the data into a column in Excel or open the Excel file Demo 2-1 from the student website.

2. The required data to create the ogive is the cumulative frequency counts and the class endpoints (the end values in the interval). The mortgage data from the first exercise looks like the following:

3. Create a line graph in the same way as the previous exercise graphing the frequency polygon. Select the Bin and Cumulative Frequency data together (if columns are separated, select the first column of data, hold the Ctrl key down and select the second column of data). Excel assumes that the first selected column is the x-axis series and the second selected column is the y-axis series. If this is not what you want you can switch the data manually in Excel or Format Data Series and switch the axes once the chart is created.

4. Then click on the Insert tab, then click on the down arrow to the right of Line (graph) and then More Line Charts....

Click on the down arrow to access the types of line graphs available

5. You will see the choices of line graphs available. Select the one that shows only one line with Midpoints on the x-axis and Frequency for the y-axis. Click OK.

The resulting graph will need to be resized and labels and title added.

6. The ogive graph requires that the data points are marked by dots of some type. You can change the layout of the line by either double-clicking on it or by rt-clicking on the line and selecting Format Data Series. The Format Data Series menu appears on the right side of the worksheet.

7. Select Marker Marker Options. Select the type and size of marker you want. Close out the menu screen on the right side.

8. To add axis labels, click on the chart once, and click on the + sign displayed and select Axis Titles. Axis Title boxes appear and you can click inside the box to delete and add text.

9. Change the x-axis label to Class Endpoints and the y-axis label to Frequency. Click inside the title box and insert the title Ogive for Mortgage Rates.

10. Delete the chart lines by selecting the lines and clicking the delete key. Your resulting frequency polygon should look like the following:

Remarks

A particularly steep slope occurs in the 4.75–under 5 class, signifying a large jump in class frequency totals.

Dot Plots

A simple statistical chart that is generally used to display continuous, quantitative data is the dot plot. In a dot plot, each data value is plotted along the horizontal axis and is represented on the chart by a dot. If multiple data points have the same values, the dots will stack up vertically. There is no direct way to create a dot plot in Excel. However, a simple representation can be created using a function.

1. Open the Excel file Demo_2-1 or input the data from that problem into a column in Excel.

2. The required data for a dot plot are frequencies for each value. The data in the required format is shown as follows:

3. In the cell to the right of the first frequency, input the repeat function which repeats the given symbol the number of times specified in the count. =REPT("•",A43) or whichever cell represents the first frequency 1. Copy the formula down.

4. The dot plot shows a simple sketch of the distribution.

Stem-and-Leaf Plots

Another way to organize raw data into groups besides using a frequency distribution is a stem-and-leaf plot. There is no direct way to create this plot using Excel.

2.3 Qualitative Data Graphs

Qualitative graphs are plotted using non-numerical categories. In this section, we will examine three

types of qualitative data graphs: (1) pie charts, (2) bar charts, and (3) Pareto charts.

Pie Charts

A pie chart is a circular depiction of data where the area of the whole pie represents 100% of the data and slices of the pie represent a percentage breakdown of the sublevels. Pie charts show the relative magnitudes of the parts to the whole. They are widely used in business, particularly to depict such things as budget categories, market share, and time/resource allocations. Construction of the pie chart begins by determining the proportion of the subunit to the whole.

Pie Chart Example

The annual sales for the top petroleum refining companies in the United States in a recent year is given as:

Exxon Mobil $453,123

Chevron 253,706

Conoco Phillips 251,226

Valero Energy 125,987

Total $1,084,042

Construct a pie chart and a bar chart to depict these data.

1. Input the data into an Excel worksheet.

2. Select all of the data and labels, not including the Total.

3. Select the Insert tab and click on the down arrow for the Pie chart layout.

Click on the down arrow to access the types of line graphs available

4. Select the 3-D Pie chart.

5. To change the format of the pie graph, we can show the value for each pie slice by clicking once on the pie graph and then the + to the right and check Data Labels. This adds the actual value of each piece of pie. If we want to see what percentage of each pie slice, double-click on the pie graph or rt-click and select Format Data Labels and the menu appears to the right of your worksheet. You have a variety of options to choose for your pie graph.

6. You can click on the chart and go back to the chart options in the ribbon menu to select other formats, such as the 2-D pie chart as shown. In addition, you double-click on the numbers and change the formatting.

7. You can also change the labels to be next to or inside the pie slices by experimenting with the Format Data Labels options.

Demonstration Problem 2.3

According to the National Retail Federation and Center for Retailing Education at the University of Florida, the four main sources of inventory shrinkage are employee theft, shoplifting, administrative error, and vendor fraud. The estimated annual dollar amount in shrinkage ($ millions) associated with each of these sources are:

Employee theft $17,918.6

Chevron 15,191.9

Conoco Phillips 7,617.6

Valero Energy 2,553.6

Total $43,281.7

We can construct a pie chart and a bar chart to depict these data.

1. Input the data into an Excel worksheet.

2. Select all of the data and labels, not including the Total.

3. Select the Insert tab and click on the down arrow for the Pie chart layout.

Click on the down arrow to access the types of pie graphs available

Chart Enhancements

Once a chart has been constructed, a set of three new tabs appear at the top of the worksheet under the general area of Chart Tools (see top upper right corner of worksheet). The three new tabs are Design, Layout, and Format. There are many options available for changing the design of the pie chart that can

be accessed by clicking on the up and down arrow on the right end of the Design tab in the section called Chart Styles.

Bar Chart

Another widely used qualitative data graphing technique is the bar graph or bar chart. A bar graph or chart contains two or more categories along one axis and a series of bars, one for each category, along the other axis. Typically, the length of the bar represents the magnitude of the measure (amount, frequency, money, percentage, etc.) for each category. The bar graph is qualitative because the categories are non-numerical, and it may be either horizontal or vertical. In Excel, horizontal bar graphs are referred to as bar charts, and vertical bar graphs are referred to as column charts. A bar graph generally is constructed from the same type of data that is used to produce a pie chart. However, an advantage of using a bar graph over a pie chart for a given set of data is that for categories that are close in value, it is considered easier to see the difference in the bars of bar graph than discriminating between pie slices.

Let's create a bar chart from the data previously used in Demonstration Problem 2.3.

Demonstration Problem 2.3, continued

According to the National Retail Federation and Center for Retailing Education at the University of Florida, the four main sources of inventory shrinkage are employee theft, shoplifting, administrative error, and vendor fraud. The estimated annual dollar amount in shrinkage ($ millions) associated with each of these sources are:

Employee theft $17,918.6

Shoplifting 15,191.9

Administrative error 7,617.6

Vendor fraud 2,553.6

Total $43,281.7

1. Input the data into an Excel worksheet.

pie data

2. Select all of the data and labels, not including the Total.

3. Use the previous method to generate a 2-D pie chart with labels outside of the pie slices. Select the Insert tab and click on the down arrow for the Pie chart layout.

Click on the down arrow to access the types of pie graphs available

4. Select the 2-D Pie chart.

5. To change the format of the pie graph, we can show the value for each pie slice by clicking once on the pie graph and then the + to the right and check Data Labels. This adds the actual value of each piece of pie. If we click on the arrow to the right of the Data Labels option, we can see all of the options of the manipulation of the labels.

6. Select Category Name, Percentage, and Show Leader Lines as shown below in the options. You can click on each value label and drag it away from the pie showing the leader lines. You have a variety of options to choose for your pie graph.

7. Click on the legend in the chart and then hit the Delete key. The Chart Title can also be deleted.

8. To display one decimal place for the pie percentage, go back to Format Data Labels and Number. Select Percentage with one decimal place.

9. Select the Insert tab and select Bar and the clustered Bar layout.

10. Right-click on the legend and select Delete. Right-click on the gridlines and select Delete.

bar graph

11. The bar chart is pictured with low to high values. If the opposite is desired, that is high to low, the data can be sorted and regraphed.

bar graph 2

11. To change the format of the bar graph, right-click on the chart and select Change Chart type... Select the first column graph, the clustered column.

pareto

The resulting bar chart is called a Pareto chart. Pareto analysis is a quantitative tallying of the number and types of defects that occur with a product or service. Analysts use this tally to produce a vertical bar chart that displays the most common types of defects, ranked in order of occurrence from left to right. If your graph goes low to high, re-sort the data and re-graph.

You can change the format of all graphs using the options provided for color, labels, titles, and other format changes.

2.4 Charts and Graphs for Two Variables

It is very common in business statistics to want to analyze two variables simultaneously in an effort to gain insight into a possible relationship between them. Business statistics has many techniques for exploring such relationships. Two of the more elementary tools for observing the relationships

between two variables are cross tabulation and scatter plot.

Cross Tabulation

Cross tabulation is a process for producing a two-dimensional table that displays the frequency counts for two variables simultaneously. There is no option to create a cross tab table in Excel.

Scatter Plot

A scatter plot is a two-dimensional graph plot of pairs of points from two numerical variables. The scatter plot is a graphical tool that is often used to examine possible relationships between two variables.

We are going to create a scatter plot using the data in Table 2.10.

Table 2.10 Scatterplot

Displayed are the values of new residential and new nonresidential buildings in the United States for various years over a 35-year period.

1. Open the Excel data file Table_2-10 from the student website.

2. Select both columns of data and select Insert tab Scatter (graph) Scatter with only markers.

3. Right-click on the legend and select Delete. Right-click on the gridlines and select Delete.

scatterplot

4. It is better to change the minimum values for both axes so there is not as much blank space and you can focus on the actual data points. To do this, right-click anywhere on the x-axis and select Format Axis. Select the first Fixed option and input 120000 for the Minimum value. Do the same thing for the y-axis and 80000 for the Minimum value.

format axis

5. Edit the chart title by clicking in the title box and typing "Scatter Plot of New Residential and Nonresidential Construction".

6. Click once on the chart and select the Chart Tools tab that appears. Select Layout 1 for a chart layout. Right-click and delete the legend and gridlines. Input "Residential" for the x-axis label and "Nonresidential" for the y-axis label.

Scatterplot finish

Remarks

Do these two numerical variables exhibit any relationship? It might seem logical when new construction booms that it would boom in both residential building and in nonresidential building at the same time. However, the scatter plot of these data shows somewhat mixed results. The apparent tendency is that more new residential building construction occurs when more new nonresidential building construction is also taking place and less new residential building construction when new nonresidential building

construction is also at lower levels. The scatter plot also shows that in some years more new residential building and less new nonresidential building happened at the same time, and vice versa.

SUMMARY OF EXCEL COMMANDS USED IN CHAPTER 2

Creating Charts & Graphs (General)

· Click on the Insert tab found along the top of an Excel worksheet. You can construct many different types of charts, including column charts, line charts, pie charts, bar charts, area charts, and XY (scatter) charts.

· Excel can generate frequency distributions and histograms using the Data Analysis feature.

Data Analysis Tool

Select the Data tab along the top of an Excel worksheet. If the Data Analysis feature has been uploaded into your Excel package, it will be found in the Analysis section at the top right of the Data tab page. If Data Analysis does not appear in the Analysis section, it must be added in.

To add in Data Analysis:

1.) Click on the File tab.

2.) Click on options in the menu.

3.) In the Excel options dialog box, click on Add-Ins next to the bottom of the left menu. A screen of add-ins will appear.

4.) Click on Analysis ToolPak and then click on Go… at the bottom of the page.

5.) In the dialog box Add Ins, check the box to the left of Analysis ToolPak and click OK. Your Data Analysis feature is now uploaded onto your computer, and you won't need to add it in again. You can bring up the Analysis ToolPak feature at any time by going to the Data tab at the top of the Excel worksheet and clicking on Data Analysis.

Constructing Frequency Distributions (Histograms)

In Excel, frequency distributions are referred to as histograms, and the classes of a frequency distribution are referred to as bins. If you do not specify bins (classes), Excel will automatically determine the number of bins and assign class endpoints based on a formula. If you want to specify bins, load the class endpoints that you want to use into a column.

· Select the Data tab in the Excel worksheet and then select the Data Analysis feature (upper right). If this feature does not appear, you may need to add it (see above).

· Click on Data Analysis, the dialog box features a pulldown menu of many of the statistical analysis tools presented and used in this text. From this list, select Histogram.

· In the Histogram dialog box, click in the space beside Input Range and select the raw data values.

· Place the location place the location of the raw data values of the class endpoints (optional) in the space beside Bin Range. Leave this blank if you want Excel to determine the bins (classes).

· If you have labels, check Labels. If you want a histogram graph, check Chart Output. If you want an ogive, select Cumulative Percentage along with Chart Output. If you opt for this, Excel will yield a histogram graph with an ogive overlaid on it.

Creating Charts

· Select the Insert tab from the top of the Excel worksheet.

· In the Charts section, which is the middle section shown at the top of the Insert worksheet, there are icons for column, line, pie, bar, area, scatter, and other charts. Click on the icon representing the desired chart to begin construction. Each of these types of charts allow for several versions of the chart shown in the dropdown menu. For example, the pie chart menu contains four types of two-dimensional pie charts and two types of three-dimensional pie charts. To select a particular version of a type of chart, click on the type of chart and then the version of that chart that is desired.

Pie Chart

· To construct a pie chart, enter the categories in one column and the data values of each category in another column in the Excel worksheet. Categories and data values could also be entered in rows instead of columns.

· Click and drag over the data for which the pie chart is to be constructed.

· From the Insert tab, select Pie from the Charts section and the select the type of pie chart to be constructed. The result is a pie chart from the data.

· Once the chart has been constructed, a set of three new tabs appear at the top of the worksheet under the general area of Chart Tools (see top upper right corner of worksheet). The three new tabs are Design, Layout, and Format. There are many options available for changing the design of the pie chart that can be accessed by clicking on the up and down arrow on the right end of the Design tab in the section called Chart Styles.

· On the far right end of the Design menu bar is a feature called Move Chart Location, which can be used to move the chart to another location or to a new sheet.

· On the far left end of the Design menu bar, there is a Change Chart Type feature that allows for changing the type of chart that has been constructed.

· The second group of features from the left at the top of the Design menu bar makes it possible to switch to another set of data (Select Data) or switch rows and columns (Switch Row/Column).

· In the middle of the Design menu bar, Chart Layouts offers several different layouts for the given chart type. For example, for pie charts, there are seven different possible layouts using titles, labels, and legends in different ways.

· Right-clicking on the pie chart brings up a menu that includes Add Data Labels and Format Data Series… Clicking on Add Data Labels adds data labels to your pie chart. Another option has the feature Format Data Labels… Clicking on this feature brings up another menu that allows you to modify or edit various features of your graph, including Label Options, Number, Fill, Border Color, Shadow, Glow and Soft Edges, 3-D Format, and Alignment. Under Label Options, there are several different types of labels for pie charts and there are other various chart options available, such as Series Name, Category Name, Value, Percentage, and Show Leader Lines. In addition, it offers various options for the label location, such as Center, Inside End, Outside End, and Best Fit. It also offers the opportunity to include the legend key in the label. The Number option under Format Data Labels… allows for the usual Excel options in using numbers. The Fill option allows you to determine what type of fill you want to have for the chart. Options include No fill, Solid fill, Gradient fill, Picture or texture fill, and Automatic. Other options under Format Data Labels… allow you to manipulate the border colors and styles, shadow, Soft and Glow Edges, and 3-D format, and text alignment or layout. The Layout tab at the top of the worksheet page has a Labels panel located at the top of the worksheet page just to the left of the center. In this section, you can further specify the location of the chart title by selecting Chart Title, the location of the legend by selecting Legend, or the location of the labels by selecting Data Labels.

· The Format tab at the top of the worksheet page contains a Shape Styles panel just to the left of center at the top of the worksheet. This panel contains options for visual styles of the graph (for more options, use the up and down arrow) and options for Shape Fill, Shape Outline, and Shape Effects. Other formatting options are available through the use of the Format Selection option on the far upper left of the Current Selection panel on the Format tab page.

Frequency Polygons

· Frequency polygons can be constructed by using the Histogram feature. Follow the directions shown above to construct a histogram.

· Once the histogram is constructed, right-click on one of the “bars” of the histogram. From the dropdown menu, select Change Series Chart Type. Next select a line chart type. The result will be a frequency polygon.

Ogive Chart

An ogive can be constructed at least two ways.

· One way is to cumulate the data manually. Enter the cumulated data in one column and the class endpoints in another column. Click and drag over both columns. Go to the Insert tab at the top of the Excel worksheet. Select Scatter as the type of chart. Under the Scatter options, select the option with the solid lines. The result is an ogive.

· A second way is to construct a frequency distribution first using the Histogram feature in the Data Analysis tool. In the Histogram dialog box, enter the location of the data and enter the location of the class endpoints as bin numbers. Check Cumulative Percentage and Chart Output in the Histogram dialog box. Once the chart is constructed, right-click on one of the bars and select the Delete option. The result will be an ogive chart with just the ogive line graph (and bars eliminated).

Bar Charts & Column Charts

Bar charts and column charts are constructed in a manner similar to that of a pie chart. Begin by entering the categories in one column and the data values of each category in another column in the Excel worksheet. Categories and data values could also be entered in rows instead of columns. Click and drag over the data and categories for which the chart is to be constructed.

· Go to the Insert tab at the top of the worksheet.

· Select Column or Bar from the Charts section and the select the version of the chart to be constructed. The result is a chart from the data.

· Once the bar chart or column chart has been constructed, there are many options available. By right-clicking on the bars or columns, a menu appears that allows you, among other things, to label the columns or bars. This command is Add Data Labels. Once data labels are added, clicking on the bars or columns will allow you to modify the labels and the characteristics of the bars or columns by selecting Format Data Labels… or Format Data Series….

· Usage of these commands is the same as when constructing or modifying pie charts (see above).Various options are also available under Chart Tools (see pie charts above).

Pareto Charts

· Pareto charts, as presented in the text, have categories and numbers of defects. As such, Pareto charts can be constructed as Column charts in Excel using the same commands (see above). However, the user will first need to order the categories and their associated frequencies in descending order.

· In addition, in constructing a histogram in Excel (see above), there is an option in the Histogram dialog box called Pareto (sorted histogram) in which Excel takes histogram data and presents the data with categories organized from highest frequency to lowest. To do this, you must also check Chart Output in the Histogram dialog box.

Scatter Diagrams

· Enter the data for the two variables to be graphed in two separate rows or columns.

· You can use a label for each variable or not. Click and drag over the data (and labels). Go to the Insert tab. From the Charts panel (upper middle), select Scatter.

· From the pulldown menu of scatter plot options, select one of the versions from the five presented. The result is the scatter chart.

· By right-clicking on the chart, various other chart options are available including, Format Plot Area…. The resulting menu associated with this command offers the usual chart options regarding fill, border color, border styles, shadow, Glow and Soft Edges, and 3-D format (see pie charts above).

· In addition, if you want to fit a line or curve to the data, right-click on one of the chart points. A menu pops up containing, among other options, Add Trendline…. From the Trendline Options, select the type of line or curve that you want to fit to the data. The result is a line or curve shown on the scatter plot attempting to fit to the points. Various other options are available regarding the line color, style, and shadow. You can also display the trend fitting equation.

Frequency 3.3745000000000003 3.6245000000000003 3.8745000000000003 4.1244999999999994 4.3744999999999994 4.6 244999999999994 4.8744999999999994 5.1244999999999994 3 4 7 3 4 6 10 3

2-1