Copyright ©2015 Pearson Education, Inc. 2-*
Chapter
2
Displaying Descriptive Statistics
CHAPTER 2 MAP
2.1 The Role Technology Plays in Statistics
2.2 Displaying Quantitative Data
2.3 Displaying Qualitative Data
2.4 Contingency Tables
2.5 Stem and Leaf Display
2.6 Scatter Plots
2.1 The Role Technology Plays in Statistics
Microsoft Excel has built-in options for data presentation and statistical analysis
You may need to activate Excel’s Analysis ToolPak add-in to see these options
Statistical Analysis Using Excel 2013
- Open Excel 2013, then click on the File tab
- Click Options shown in the drop down menu. This will open the Excel Options dialog box
- Select Add-Ins in the left margin…
Statistical Analysis Using Excel 2013
Click on Go at the bottom of the screen
Select the check boxes for Analysis ToolPak and Analysis ToolPak - VBA in the popup menu and click OK
Statistical Analysis Using Excel 2013
Select the Data tab. Click on Data Analysis on the right side of the application bar
The Data Analysis pop-up menu should appear in the spreadsheet
Installing PHStat
PHStat is an Excel Add-in developed by Prentice Hall to provide students with additional features for statistical analysis
- The software will be referred to throughout the book and is available from the book’s website: www.pearsonhighered.com/donnelly
- To install PHStat on your Windows PC, follow the instructions on the book’s website
- Mac users can also find instructions for PHStat on the book’s website
2.2 Displaying Quantitative Data
Recall the types of data from Chapter 1:
- Quantitative: displaying quantitative data is discussed in Section 2.2
- Qualitative: displaying qualitative data is discussed in Section 2.3
Constructing a Frequency Distribution
A frequency distribution shows the number of data observations that fall into specific intervals
- Graphically summarize information not readily observable by merely looking at data in a table
Constructing a Frequency Distribution
Example: Number of iPads sold per day
Discrete vs. Continuous Data
Discrete data are values based on observations that can be counted and are typically represented by whole numbers
- represent something that has been counted
- take on whole numbers such as 0, 1, 2, 3
Continuous data are values that can take on any real number, including numbers that contain decimal points
- usually measured rather than counted
- Examples are weight, time, and distance
Discrete vs. Continuous Data
Examples of Discrete data
- Number of children per family
- Number of cars listed per insurance policy
- Vacation days per month
Examples of Continuous data
- Time required to read chapter 2
- Thickness of paint applied to a car body
- Voltage of batteries produced in August
Relative Frequency Distributions
Relative frequency distributions display the proportion of observations of each class relative to the total number of observations
- shows the fraction of observations in each class
- found by dividing each frequency by the total number of observations
- the fractions in a relative frequency distribution add up to 1.00
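The calculation described above can be sketched in a few lines of Python. This is only an illustration: the `daily_sales` list is made-up data (not the data set from the slides), and the helper name `relative_frequencies` is hypothetical.

```python
from collections import Counter

# Hypothetical daily sales counts, invented for illustration only.
daily_sales = [0, 1, 1, 2, 2, 2, 3, 3, 4, 5]


def relative_frequencies(values):
    """Return {class value: proportion of all observations}, sorted by class."""
    counts = Counter(values)
    total = len(values)
    # Divide each class frequency by the total number of observations.
    return {value: counts[value] / total for value in sorted(counts)}


rel_freq = relative_frequencies(daily_sales)
for value, proportion in rel_freq.items():
    print(f"{value}: {proportion:.2f}")
```

Summing the proportions in `rel_freq` is a quick check that they add up to 1.00.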
Relative Frequency Distributions
Example:
Two iPads were sold on 28% of the days
Cumulative Relative Frequency Distributions
A cumulative relative frequency distribution totals the proportion of observations that are less than or equal to the class at which you are looking
- Shows the accumulated proportion as values vary from low to high
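A minimal sketch of the accumulation, assuming the classes are single whole-number values. The `daily_sales` data and the function name `cumulative_relative_frequencies` are hypothetical and are not taken from the slides.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical daily sales counts, invented for illustration only.
daily_sales = [0, 1, 1, 2, 2, 2, 3, 3, 4, 5]


def cumulative_relative_frequencies(values):
    """Return {class value: proportion of observations <= that class}."""
    counts = Counter(values)
    classes = sorted(counts)          # accumulate from low to high
    total = len(values)
    running = accumulate(counts[c] for c in classes)
    return {c: r / total for c, r in zip(classes, running)}


cum_rel_freq = cumulative_relative_frequencies(daily_sales)
for value, proportion in cum_rel_freq.items():
    print(f"{value} or fewer: {proportion:.2f}")
```

The last class always accumulates to 1.00, because every observation is less than or equal to the largest class.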
Cumulative Relative Frequency Distributions
Example:
Three or fewer iPads were sold on 80% of the business days
Using a Histogram to Graph a Frequency Distribution
A histogram is a graph showing the number of observations in each class of a frequency distribution
- Excel uses the term “bins” for the classes in the distribution
Constructing a Histogram in Excel
1. Select the Data tab, and click on Data Analysis in the upper right corner
2. In the pop-up menu, select Histogram and click OK…
Constructing a Histogram in Excel
3. In the Input Range text box, highlight the desired data
4. In the Bin Range text box, highlight the bin values (create bins if not already created before step 1)
5. For Output options, select New Worksheet Ply and Chart Output
6. Click OK
Histograms in Excel
7. Customize the Excel graph to make it more attractive
8. Stretch the chart to better proportions
9. Eliminate the “More” bin
10. Modify the graph and axis labels
11. Remove the redundant “Frequency” legend
The Shape of Histograms
Symmetric
- the right side is the mirror image of the left side of the distribution
Still symmetric, but wider spread
Not symmetric
Constructing a Frequency Distribution Using Grouped Quantitative Data
Ideally, the number of classes in a frequency distribution should be between 4 and 20
- Some data sets, particularly those with continuous data, require several values to be grouped together in a single class
- This grouping prevents having too many classes in the frequency distribution, which can make it difficult to detect patterns
Number of Classes
One method to determine the number of classes in a frequency distribution is the rule
$2^k \ge n$
where k = Number of classes
n = Number of data points
- Find the lowest value of k that satisfies the rule
Suppose n = 50
$2^5 = 32 < 50$ (k = 5 is too small)
$2^6 = 64 > 50$ (k = 6 is a good choice)
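The search for the lowest k with 2 to the power k at least n can be written as a short loop. This is a sketch only; the function name `number_of_classes` is hypothetical, and starting the search at k = 1 is an assumption made so that the result is never zero classes.

```python
def number_of_classes(n):
    """Return the smallest k (starting from 1) such that 2**k >= n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    k = 1  # assumption: a distribution needs at least one class
    while 2 ** k < n:
        k += 1
    return k


# n = 50 is the example from the slide: 2**5 = 32 < 50, 2**6 = 64 >= 50.
print(number_of_classes(50))
```

The result is a starting point; the slides also suggest keeping the number of classes between 4 and 20.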
Class Width
Once k is known, the width of each class can be found
- The width is the range of numbers to put into each class
- Round this estimate to a useful whole number that makes the frequency distribution more readable
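As a rough illustration, the estimated class width is the range of the data (maximum value minus minimum value) divided by k, and the estimate is then rounded to a convenient whole number. In the sketch below the `call_minutes` data and the function name are hypothetical, and rounding up with `math.ceil` is just one possible choice, not a rule from the slides.

```python
import math

# Hypothetical call lengths in minutes, invented for illustration only.
call_minutes = [3.2, 4.1, 5.6, 6.0, 7.4, 8.8, 9.5, 10.3, 11.9, 14.6]


def estimated_class_width(values, k):
    """Estimated class width = (maximum value - minimum value) / k."""
    return (max(values) - min(values)) / k


k = 4                                   # assumed number of classes
estimate = estimated_class_width(call_minutes, k)
width = math.ceil(estimate)             # one way to reach a readable whole number
print(f"estimate = {estimate:.2f}, rounded width = {width}")
```

Rounding up guarantees that k classes of the rounded width span the whole range of the data; a different readable width may work equally well.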
Class Width
There is no one correct answer for the class width
- The goal is to create a histogram to clearly and usefully show the pattern in the data
- Often there is more than one acceptable way to accomplish this
Class Boundaries
Class boundaries represent the minimum and maximum values for each class
- Choose class boundaries that are easy to read
3 to less than 6 minutes     vs.   3.21 to less than 6.21 minutes
6 to less than 9 minutes     vs.   6.21 to less than 9.21 minutes
9 to less than 12 minutes    vs.   9.21 to less than 12.21 minutes
Class Frequencies
Find class frequencies by counting and recording the number of observations in each class
- this is easier when the data are sorted
Example:
Rules for Classes for Grouped Data
Equal-size classes. All classes in the frequency distribution must be of equal width
Mutually exclusive classes. Class boundaries cannot overlap
Include all data values. Make sure all data values are accounted for in the total row of the frequency distribution
Avoid empty classes. It is undesirable for a histogram to display a class so narrow that there are no observations in it
Avoid open-ended classes (if possible). These violate the first rule of equal class sizes
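One way to see these rules in action is a small grouping routine. The sketch below assumes classes of the form "lower to less than upper" with equal widths; the data, the function name `grouped_frequencies`, and its parameters are all hypothetical.

```python
# Hypothetical call lengths in minutes, invented for illustration only.
call_minutes = [3.2, 4.1, 5.6, 6.0, 7.4, 8.8, 9.5, 10.3, 11.9, 14.6]


def grouped_frequencies(values, start, width, k):
    """Count observations in k equal-width classes [lower, upper)."""
    counts = [0] * k
    for v in values:
        # Each value maps to exactly one class, so classes cannot overlap.
        index = int((v - start) // width)
        if not 0 <= index < k:
            # Refuse to silently drop data that no class covers.
            raise ValueError(f"{v} falls outside the classes")
        counts[index] += 1
    return [
        (start + i * width, start + (i + 1) * width, counts[i])
        for i in range(k)
    ]


table = grouped_frequencies(call_minutes, start=3, width=3, k=4)
for lower, upper, count in table:
    print(f"{lower} to less than {upper}: {count}")
```

Comparing the sum of the class counts with the number of observations confirms that all data values are accounted for, and any class with a count of zero is a candidate for wider classes.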
Constructing a Histogram with Grouped Quantitative Data
For grouped data, the bins in Excel are the upper boundary for each class
For continuous data, remove the gaps between the bars in the histogram:
Right-click on any histogram bar to get a pop-up menu
Left-click on Format Data Series
In the dialog box, move the Gap Width slider all the way to the left
Close the Format Data Series dialog box
Constructing a Histogram with Grouped Quantitative Data
Additional formatting issues:
- Use a descriptive title for the graph
- Use descriptive labels for the axes
- Remove the redundant “Frequency” legend
- Remove gaps between bars
The Consequences of Too Few or Too Many Classes
Wide classes result in few class intervals
- Can obscure important patterns
- Gives a “blocky” distribution graph
- Summarizes the data too much
- Tells us little about the true distribution shape
Too many narrow classes in a histogram also have consequences
- Results in a “jagged” histogram
- Some classes may be empty
- Does not summarize the data enough
Are They Discrete or Continuous Data?
Some data are technically discrete (counted, not measured) but are displayed in a continuous format
Examples
- Age
- Income
- Other discrete data sets containing a wide range of values
The Polygon
A percentage polygon graphs the midpoint of each class as a line rather than a column
- The height of each midpoint represents the relative frequency of the corresponding class
- Used to compare the shape of two or more distributions on one graph
The cumulative percentage polygon, or ogive, is a line graph that plots the cumulative relative frequency distribution
The Polygon
Percentage polygons and cumulative percentage polygons can be created using PHStat
2.3 Displaying Qualitative Data
Qualitative data are values that are categorical
- Can be nominal or ordinal measurement level
- Describe a characteristic, such as gender or level of education
Frequency distributions help display qualitative data by indicating the number of occurrences of various categories
- Can use Excel’s COUNTIF function to count the number of values matching a category label
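Outside Excel, the same kind of category count can be sketched in Python. This is an analogy, not a description of how COUNTIF works internally; the `education` responses below are made up for illustration.

```python
from collections import Counter

# Hypothetical survey responses, invented for illustration only.
education = [
    "High school", "Bachelor's", "Bachelor's",
    "Master's", "High school", "Bachelor's",
]

# Count one category at a time, similar in spirit to a single COUNTIF call.
bachelors = education.count("Bachelor's")

# Or build the whole frequency distribution in one pass.
frequency = Counter(education)

print(bachelors)
for category, count in frequency.items():
    print(f"{category}: {count}")
```

Note that Python string comparison is exact, so labels must match in spelling and capitalization.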
Displaying Qualitative Data
Figure 2.15 A-B | Excel’s COUNTIF Function and Excel’s COUNTIF Function Results
Bar Charts
Bar charts are a good tool for displaying qualitative data that have been organized in categories
Can be arranged in a vertical or horizontal orientation
Bar Charts
Horizontal bar chart Vertical bar chart
Can display multiple series with clustered bar charts or stacked bar charts:
Displaying Qualitative Data: Example
Pareto Charts
Pareto charts are bar charts that show the frequency of the categories that cause quality control problems
Show quality problem categories in decreasing order
- The most problematic categories are shown first
Pareto charts also plot the cumulative relative frequency as a line, known as an ogive, on the chart
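The numbers behind a Pareto chart can be sketched as follows: categories sorted from most to least frequent, with a running cumulative relative frequency for the ogive. The `complaints` data and the function name `pareto_table` are hypothetical.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical quality complaints, invented for illustration only.
complaints = (
    ["Late delivery"] * 5 + ["Damaged item"] * 3 + ["Wrong item"] * 2
)


def pareto_table(observations):
    """Return (category, frequency, cumulative relative frequency) rows,
    ordered from the most frequent category to the least frequent."""
    ordered = Counter(observations).most_common()
    total = len(observations)
    running = accumulate(count for _, count in ordered)
    return [
        (category, count, cumulative / total)
        for (category, count), cumulative in zip(ordered, running)
    ]


for category, count, cumulative in pareto_table(complaints):
    print(f"{category:<15} {count:>2}  {cumulative:.2f}")
```

The frequency column gives the bar heights and the last column gives the points of the ogive.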
Pareto Charts
Note: The categories are arranged from most frequent to least frequent
Follow the steps shown in the text, pages 49-50, to create a Pareto chart and ogive using Excel
Pie Charts
Pie charts are another excellent tool for comparing proportions for categorical data
Each segment of the pie represents the relative frequency of one category
- All categories in the data set must be included in the pie
- Use a pie chart to compare the relative sizes of all possible categories
- Bar charts are more useful when you want to highlight the actual data values and when the classes combined don’t form a whole
Pie Charts
Figure 2.19A | Constructing a Pie Chart in Excel
Pie Charts
Figure 2.19B | Constructing a Pie Chart in Excel (continued)
Pie Charts
Example:
2.4 Contingency Tables
Contingency tables provide a format to display observations that have more than one value associated with them
- Use rows and columns for separate variables to summarize the data efficiently
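A contingency table and its relative version can be sketched by counting paired observations. The `purchases` data below are invented for illustration (they are not the 20-customer example from the slides), and the function name `contingency_table` is hypothetical.

```python
from collections import Counter

# Hypothetical (gender, payment method) pairs, invented for illustration only.
purchases = [
    ("Female", "Credit"), ("Female", "Cash"), ("Male", "Credit"),
    ("Male", "Cash"), ("Female", "Credit"), ("Male", "Cash"),
    ("Male", "Cash"), ("Female", "Credit"),
]


def contingency_table(pairs):
    """Return {row label: {column label: count}} for paired observations."""
    counts = Counter(pairs)
    row_labels = sorted({row for row, _ in pairs})
    col_labels = sorted({col for _, col in pairs})
    return {
        row: {col: counts[(row, col)] for col in col_labels}
        for row in row_labels
    }


table = contingency_table(purchases)
total = len(purchases)
# Relative contingency table: each cell divided by the total observations.
relative = {
    row: {col: n / total for col, n in cells.items()}
    for row, cells in table.items()
}
print(table)
print(relative)
```

Each cell of `relative` is the cell count divided by the total number of observations, the same calculation as 7/20 = 0.35 in the slide example.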
Contingency Tables
7 females out of 20 customers paid using credit, 7/20 = 0.35
Contingency Table
Relative Contingency Table
Constructing a Contingency Table in Excel
Click on any cell within your data
Choose the Insert tab
Click on the Pivot Table icon
Click on Pivot Table in the drop-down menu
A Create Pivot Table dialog box will appear. Click OK…
Constructing a Contingency Table in Excel
A new worksheet will be created for your pivot table
From the Pivot Table Field List:
a & b. Drag the desired variable names down into the Column or Row Labels boxes
c. Drag the variable name to be summarized down into the Values box
Constructing a Contingency Table in Excel
Resulting pivot table:
Figure 2.21C | Creating a Pivot Table in Excel (Final Result)
2.5 Stem and Leaf Display
A stem and leaf display splits the data values into stems (the larger place values) and leaves (the smaller place values)
By listing all of the leaves to the right of each stem, we can graphically describe how the data are distributed
- All the original data points are visible on the display
- Easy to construct by hand
- Provides a histogram-like view of the distribution
Stem and Leaf Display
For this example, use the 10’s digit as the stem
Use the 1’s digit as the leaf
7 | 8
8 | 0
Stem and Leaf Display
Sort the data from lowest to highest
Determine the unique stem values
7, 8, 9 are the different stem values in this example
List the stems in a vertical column and then add the leaf values to the right of the appropriate stem, in ascending order
7 | 8 8 9 9 9
8 | 0 0 0 0 1 1 2 3 3 4 4 4 5 6 7 8
9 | 0 2 5
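The steps above can be sketched in Python. The `scores` list is read back from the stems and leaves of the display above; the function name `stem_and_leaf` is hypothetical, and the sketch assumes two-digit whole numbers with the 10's digit as the stem.

```python
from collections import defaultdict

# Values reconstructed from the stems and leaves shown in the display above.
scores = [
    78, 78, 79, 79, 79,
    80, 80, 80, 80, 81, 81, 82, 83, 83, 84, 84, 84, 85, 86, 87, 88,
    90, 92, 95,
]


def stem_and_leaf(values):
    """Return one 'stem | leaves' line per stem, leaves in ascending order."""
    stems = defaultdict(list)
    for value in sorted(values):            # sort from lowest to highest
        stems[value // 10].append(value % 10)  # 10's digit = stem, 1's = leaf
    return [
        f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in sorted(stems.items())
    ]


lines = stem_and_leaf(scores)
print("\n".join(lines))
```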
Stem and Leaf Display
To get more detail, the stems can be split in half
7(5) | 8 8 9 9 9
8(0) | 0 0 0 0 1 1 2 3 3 4 4 4
8(5) | 5 6 7 8
9(0) | 0 2
9(5) | 5
- The stem labeled 7(5) stores all the scores between 75 and 79
- The stem 8(0) stores all the scores between 80 and 84
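Splitting each stem in half can be sketched the same way: leaves 0 to 4 go under the (0) half and leaves 5 to 9 under the (5) half. The `scores` list is read back from the display above; the function name `split_stem_and_leaf` is hypothetical.

```python
# Values reconstructed from the stems and leaves shown in the display above.
scores = [
    78, 78, 79, 79, 79,
    80, 80, 80, 80, 81, 81, 82, 83, 83, 84, 84, 84, 85, 86, 87, 88,
    90, 92, 95,
]


def split_stem_and_leaf(values):
    """Return 'stem(half) | leaves' lines, with each stem split at leaf 5."""
    groups = {}
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        half = 0 if leaf < 5 else 5       # 0-4 in the lower half, 5-9 in the upper
        groups.setdefault((stem, half), []).append(leaf)
    return [
        f"{stem}({half}) | {' '.join(map(str, leaves))}"
        for (stem, half), leaves in sorted(groups.items())
    ]


split_lines = split_stem_and_leaf(scores)
print("\n".join(split_lines))
```

Half-stems with no observations (7(0) in this data set) are simply omitted, as in the display above.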
2.6 Scatter Plots
Scatter plots provide a picture of the relationship between two variables whose values are paired together
The dependent variable, which is placed on the vertical axis of the scatter plot, is influenced by changes in the independent variable, which is placed on the horizontal axis
Scatter Plots
Dependent variable (y-axis)
Independent variable (x-axis)
Scatter Plots
Figure 2.25A | Constructing a Scatter Plot in Excel
Line Charts
A line chart is a scatter plot in which the data points are connected with line segments
- Often used with time series data
When graphing a time series, the convention is to place the time data on the horizontal axis
Line Charts
Figure 2.26A | Constructing a Line Chart in Excel
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher.
Printed in the United States of America.
$$\text{Estimated class width} = \frac{\text{Maximum data value} - \text{Minimum data value}}{k}$$