week 2 stats homework
Regression Models
4
To accompany Quantitative Analysis for Management, Twelfth Edition,
by Render, Stair, Hanna and Hale
Power Point slides created by Jeff Heyl
Copyright ©2015 Pearson Education, Inc.
After completing this chapter, students will be able to:
LEARNING OBJECTIVES
Copyright ©2015 Pearson Education, Inc.
4 – 2
Identify variables and use them in a regression model
Develop simple linear regression equations from sample data and interpret the slope and intercept
Compute the coefficient of determination and the coefficient of correlation and interpret their meanings
Interpret the F test in a linear regression model
List the assumptions used in regression and use residual plots to identify problems
After completing this chapter, students will be able to:
LEARNING OBJECTIVES
Copyright ©2015 Pearson Education, Inc.
4 – 3
Develop a multiple regression model and use it for prediction purposes
Use dummy variables to model categorical data
Determine which variables should be included in a multiple regression model
Transform a nonlinear function into a linear one for use in regression
Understand and avoid common mistakes made in the use of regression analysis
Copyright ©2015 Pearson Education, Inc.
4 – 4
4.1 Introduction
4.2 Scatter Diagrams
4.3 Simple Linear Regression
4.4 Measuring the Fit of the Regression Model
4.5 Assumptions of the Regression Model
4.6 Testing the Model for Significance
CHAPTER OUTLINE
Copyright ©2015 Pearson Education, Inc.
4 – 5
4.7 Using Computer Software for Regression
4.8 Multiple Regression Analysis
4.9 Binary or Dummy Variables
4.10 Model Building
4.11 Nonlinear Regression
4.12 Cautions and Pitfalls in Regression Analysis
CHAPTER OUTLINE
Introduction
Regression analysis – very valuable tool for a manager
Understand the relationship between variables
Predict the value of one variable based on another variable
Simple linear regression models have only two variables
Multiple regression models have more than one independent variable
Copyright ©2015 Pearson Education, Inc.
4 – 6
Introduction
Variable to be predicted is called the dependent variable or response variable
Value depends on the value of the independent variable(s)
Explanatory or predictor variable
Copyright ©2015 Pearson Education, Inc.
4 – 7
Independent variable
Dependent variable
Independent variable
= +
Scatter Diagram
Scatter diagram or scatter plot often used to investigate the relationship between variables
Independent variable normally plotted on X axis
Dependent variable normally plotted on Y axis
Copyright ©2015 Pearson Education, Inc.
4 – 8
Triple A Construction
Triple A Construction renovates old homes
The dollar volume of renovation work is dependent on the area payroll
Copyright ©2015 Pearson Education, Inc.
4 – 9
| TRIPLE A’S SALES ($100,000s) | LOCAL PAYROLL ($100,000,000s) |
| 6 | 3 |
| 8 | 4 |
| 9 | 6 |
| 5 | 4 |
| 4.5 | 2 |
| 9.5 | 5 |
TABLE 4.1
Triple A Construction
Figure 4.1 – Scatter Diagram
Copyright ©2015 Pearson Education, Inc.
4 – 10
| | | | | | | | |
0 1 2 3 4 5 6 7 8
12 –
10 –
8 –
6 –
4 –
2 –
0 –
Sales ($100,000)
Payroll ($100 million)
Simple Linear Regression
Regression models used to test relationships between variables
Random error
Copyright ©2015 Pearson Education, Inc.
4 – 11
where
Y = dependent variable (response)
X = independent variable (predictor or explanatory)
0 = intercept (value of Y when X = 0)
1 = slope of the regression line
e = random error
Simple Linear Regression
True values for the slope and intercept are not known
Estimated using sample data
Copyright ©2015 Pearson Education, Inc.
4 – 12
where
Y = predicted value of Y
b0 = estimate of β0, based on sample results
b1 = estimate of β1, based on sample results
^
Triple A Construction
Predict sales based on area payroll
Y = Sales
X = Area payroll
The line Figure 4.1 minimizes the errors
Copyright ©2015 Pearson Education, Inc.
4 – 13
Error = (Actual value) – (Predicted value)
Regression analysis minimizes the sum of squared errors
Least-squares regression
Triple A Construction
Formulas for simple linear regression, intercept and slope
Copyright ©2015 Pearson Education, Inc.
4 – 14
Triple A Construction
Copyright ©2015 Pearson Education, Inc.
4 – 15
| Y | X | (X – X)2 | (X – X)(Y – Y) |
| 6 | 3 | (3 – 4)2 = 1 | (3 – 4)(6 – 7) = 1 |
| 8 | 4 | (4 – 4)2 = 0 | (4 – 4)(8 – 7) = 0 |
| 9 | 6 | (6 – 4)2 = 4 | (6 – 4)(9 – 7) = 4 |
| 5 | 4 | (4 – 4)2 = 0 | (4 – 4)(5 – 7) = 0 |
| 4.5 | 2 | (2 – 4)2 = 4 | (2 – 4)(4.5 – 7) = 5 |
| 9.5 | 5 | (5 – 4)2 = 1 | (5 – 4)(9.5 – 7) = 2.5 |
| ΣY = 42 Y = 42/6 = 7 | ΣX = 24 X = 24/6 = 4 | Σ(X – X)2 = 10 | Σ(X – X)(Y – Y) = 12.5 |
TABLE 4.2 – Regression calculations
Triple A Construction
Regression calculations
Copyright ©2015 Pearson Education, Inc.
4 – 16
Therefore
Regression calculations
Triple A Construction
Copyright ©2015 Pearson Education, Inc.
4 – 17
Therefore
sales = 2 + 1.25(payroll)
If the payroll next year is $600 million
Measuring the Fit of the Regression Model
Regression models can be developed for any variables X and Y
How helpful is the model in predicting Y?
With average error positive and negative errors cancel each other out
Three measures of variability
SST – Total variability about the mean
SSE – Variability about the regression line
SSR – Total variability that is explained by the model
Copyright ©2015 Pearson Education, Inc.
4 – 18
Measuring the Fit of the Regression Model
Sum of squares total
Sum of squares error
Sum of squares regression
An important relationship
Copyright ©2015 Pearson Education, Inc.
4 – 19
Measuring the Fit of the Regression Model
Copyright ©2015 Pearson Education, Inc.
4 – 20
| Y | X | (Y – Y)2 | Y | (Y – Y)2 | (Y – Y)2 |
| 6 | 3 | (6 – 7)2 = 1 | 2 + 1.25(3) = 5.75 | 0.0625 | 1.563 |
| 8 | 4 | (8 – 7)2 = 1 | 2 + 1.25(4) = 7.00 | 1 | 0 |
| 9 | 6 | (9 – 7)2 = 4 | 2 + 1.25(6) = 9.50 | 0.25 | 6.25 |
| 5 | 4 | (5 – 7)2 = 4 | 2 + 1.25(4) = 7.00 | 4 | 0 |
| 4.5 | 2 | (4.5 – 7)2 = 6.25 | 2 + 1.25(2) = 4.50 | 0 | 6.25 |
| 9.5 | 5 | (9.5 – 7)2 = 6.25 | 2 + 1.25(5) = 8.25 | 1.5625 | 1.563 |
| ∑(Y – Y)2 = 22.5 | ∑(Y – Y)2 | = 6.875 | ∑(Y – Y)2 = 15.625 | ||
| Y = 7 | SST = 22.5 | SSE | = 6.875 | SSR = 15.625 |
^
^
^
^
^
TABLE 4.3 – Sum of Squares for Triple A Construction
Sum of squares total
Sum of squares error
Sum of squares regression
An important relationship
Measuring the Fit of the Regression Model
Copyright ©2015 Pearson Education, Inc.
4 – 21
For Triple A Construction
SST = 22.5
SSE = 6.875
SSR = 15.625
Measuring the Fit of the Regression Model
Copyright ©2015 Pearson Education, Inc.
4 – 22
FIGURE 4.2 – Deviations from the Regression Line and from the Mean
| | | | | | | | |
0 1 2 3 4 5 6 7 8
12 –
10 –
8 –
6 –
4 –
2 –
0 –
Sales ($100,000)
Payroll ($100 million)
Coefficient of Determination
The proportion of the variability in Y explained by the regression equation
The coefficient of determination is r2.
Copyright ©2015 Pearson Education, Inc.
4 – 23
For Triple A Construction
The proportion of the variability in Y explained by the regression equation
The coefficient of determination is r2.
Coefficient of Determination
Copyright ©2015 Pearson Education, Inc.
4 – 24
For Triple A Construction
About 69% of the variability in Y is explained by the equation based on payroll (X)
Correlation Coefficient
An expression of the strength of the linear relationship
Always between +1 and –1
The correlation coefficient is r
Copyright ©2015 Pearson Education, Inc.
4 – 25
For Triple A Construction
Four Values of the Correlation Coefficient
(a) Perfect Positive Correlation: r = +1
X
Y
(c) No Correlation: r = 0
X
Y
(d) Perfect Negative Correlation: r = –1
X
Y
(b) Positive Correlation: 0 < r < 1
X
Y
FIGURE 4.3
Copyright ©2015 Pearson Education, Inc.
4 – 26
Assumptions of the Regression Model
With certain assumptions about the errors, statistical tests can be performed to determine the model’s usefulness
Errors are independent
Errors are normally distributed
Errors have a mean of zero
Errors have a constant variance
A plot of the residuals (errors) often highlights glaring violations of assumptions
Copyright ©2015 Pearson Education, Inc.
4 – 27
Residual Plots
FIGURE 4.4A – Pattern of Errors Indicating Randomness
Error
X
Copyright ©2015 Pearson Education, Inc.
4 – 28
Residual Plots
FIGURE 4.4B – Nonconstant error variance
Error
X
Copyright ©2015 Pearson Education, Inc.
4 – 29
Residual Plots
FIGURE 4.4C – Errors Indicate Relationship is not Linear
Error
X
Copyright ©2015 Pearson Education, Inc.
4 – 30
Estimating the Variance
Errors are assumed to have a constant variance ( 2), usually unknown
Estimated using the mean squared error (MSE), s2
Copyright ©2015 Pearson Education, Inc.
4 – 31
where
n = number of observations in the sample
k = number of independent variables
Estimating the Variance
For Triple A Construction
Copyright ©2015 Pearson Education, Inc.
4 – 32
Estimate the standard deviation, s
The standard error of the estimate or the standard deviation of the regression
Testing the Model for Significance
When the sample size is too small, you can get good values for MSE and r2 even if there is no relationship between the variables
Testing the model for significance helps determine if the values are meaningful
Performing a statistical hypothesis test
Copyright ©2015 Pearson Education, Inc.
4 – 33
Testing the Model for Significance
We start with the general linear model
Copyright ©2015 Pearson Education, Inc.
4 – 34
If 1 = 0, the null hypothesis is that there is no relationship between X and Y
The alternate hypothesis is that there is a linear relationship (1 ≠ 0)
If the null hypothesis can be rejected, we have proven there is a relationship
We use the F statistic
Testing the Model for Significance
The F statistic is based on the MSE and MSR
Copyright ©2015 Pearson Education, Inc.
4 – 35
where
k = number of independent variables in the model
Describes an F distribution with:
degrees of freedom for the numerator = df1 = k
degrees of freedom for the denominator = df2 = n – k – 1
The F statistic is
Testing the Model for Significance
If there is very little error, MSE would be small and the F statistic would be large – model is useful
If the F statistic is large, the significance level (p-value) will be low, – unlikely would have occurred by chance
When the F value is large, we can reject the null hypothesis and accept that there is a linear relationship between X and Y and the values of the MSE and r2 are meaningful
Copyright ©2015 Pearson Education, Inc.
4 – 36
Steps in a Hypothesis Test
Specify null and alternative hypotheses
Copyright ©2015 Pearson Education, Inc.
4 – 37
Select the level of significance ()
Common values are 0.01 and 0.05.
Calculate the value of the test statistic
Steps in a Hypothesis Test
Make a decision using one of the following methods
Copyright ©2015 Pearson Education, Inc.
4 – 38
Reject the null hypothesis if the test statistic is greater than the F value from the table in Appendix D. Otherwise, do not reject the null hypothesis:
Reject the null hypothesis if the observed significance level, or p-value, is less than the level of significance (). Otherwise, do not reject the null hypothesis:
Triple A Construction
Step 1
Copyright ©2015 Pearson Education, Inc.
4 – 39
H0: 1 = 0 (no linear relationship between X and Y)
H1: 1 ≠ 0 (linear relationship exists between X and Y)
Step 2
Select = 0.05
Step 3
– Calculate the value of the test statistic
Triple A Construction
Step 4
Reject the null hypothesis if the test statistic is greater than the F value in Appendix D
Copyright ©2015 Pearson Education, Inc.
4 – 40
df1 = k = 1
df2 = n – k – 1 = 6 – 1 – 1 = 4
The value of F associated with a 5% level of significance and with degrees of freedom 1 and 4 is found in Appendix D.
F0.05,1,4 = 7.71
Fcalculated = 9.09
Reject H0 because 9.09 > 7.71
F = 7.71
0.05
9.09
Triple A Construction
We can conclude there is a statistically significant relationship between X and Y
The r2 value of 0.69 means about 69% of the variability in sales (Y) is explained by local payroll (X)
Copyright ©2015 Pearson Education, Inc.
4 – 41
FIGURE 4.5
Analysis of Variance (ANOVA) Table
With software models, an ANOVA table is typically created that shows the observed significance level (p-value) for the calculated F value
This can be compared to the level of significance () to make a decision
4 – 42
| DF | SS | MS | F | SIGNIFICANCE | |
| Regression | k | SSR | MSR = SSR/k | MSR/MSE | P(F > MSR/MSE) |
| Residual | n - k - 1 | SSE | MSE = SSE/(n - k - 1) | ||
| Total | n - 1 | SST |
Copyright ©2015 Pearson Education, Inc.
TABLE 4.4
ANOVA for Triple A Construction
4 – 43
PROGRAM 4.1C – Excel Output for Triple A Construction
P(F > 9.0909) = 0.0394
Copyright ©2015 Pearson Education, Inc.
Using Software
Copyright ©2015 Pearson Education, Inc.
4 – 44
PROGRAM 4.1A – Accessing the Regression Option in Excel
Using Software
Copyright ©2015 Pearson Education, Inc.
4 – 45
PROGRAM 4.1B – Data Input
Using Software
Copyright ©2015 Pearson Education, Inc.
4 – 46
PROGRAM 4.1C – Excel Output
Using Software
Copyright ©2015 Pearson Education, Inc.
4 – 47
PROGRAM 4.2A – Using Excel QM
Using Software
Copyright ©2015 Pearson Education, Inc.
4 – 48
PROGRAM 4.2B – Initializing the Spreadsheet
Using Software
Copyright ©2015 Pearson Education, Inc.
4 – 49
PROGRAM 4.2C – Input and Results
Using Software
Copyright ©2015 Pearson Education, Inc.
4 – 50
PROGRAM 4.3A – QM for Windows Regression Option in Forecasting Module
Using Software
Copyright ©2015 Pearson Education, Inc.
4 – 51
PROGRAM 4.3B – QM for Windows Screen to Initialize the Problem
Using Software
Copyright ©2015 Pearson Education, Inc.
4 – 52
PROGRAM 4.3C – Data Input
Using Software
Copyright ©2015 Pearson Education, Inc.
4 – 53
PROGRAM 4.3D – QM for Windows Output
Multiple Regression Analysis
Extensions to the simple linear model
Models with more than one independent variable
4 – 54
Y = 0 + 1X1 + 2X2 + … + kXk +
where
Y = dependent variable (response variable)
Xi = ith independent variable (predictor or explanatory variable)
0 = intercept (value of Y when all Xi = 0)
i = coefficient of the ith independent variable
k = number of independent variables
= random error
Copyright ©2015 Pearson Education, Inc.
Multiple Regression Analysis
To estimate these values, a sample is taken the following equation developed
4 – 55
Where
= predicted value of Y
b0 = sample intercept (an estimate of 0)
bi = sample coefficient of the ith variable (an estimate of i)
Copyright ©2015 Pearson Education, Inc.
Jenny Wilson Realty
Develop a model to determine the suggested listing price for houses based on the size and age of the house
4 – 56
where
= predicted value of dependent variable (selling price)
b0 = Y intercept
X1 and X2 = value of the two independent variables (square footage and age) respectively
b1 and b2 = slopes for X1 and X2 respectively
– Selects a sample of houses that have sold recently and records the data
Copyright ©2015 Pearson Education, Inc.
Jenny Wilson Real Estate Data
4 – 57
| SELLING PRICE ($) | SQUARE FOOTAGE | AGE | CONDITION |
| 95,000 | 1,926 | 30 | Good |
| 119,000 | 2,069 | 40 | Excellent |
| 124,800 | 1,720 | 30 | Excellent |
| 135,000 | 1,396 | 15 | Good |
| 142,000 | 1,706 | 32 | Mint |
| 145,000 | 1,847 | 38 | Mint |
| 159,000 | 1,950 | 27 | Mint |
| 165,000 | 2,323 | 30 | Excellent |
| 182,000 | 2,285 | 26 | Mint |
| 183,000 | 3,752 | 35 | Good |
| 200,000 | 2,300 | 18 | Good |
| 211,000 | 2,525 | 17 | Good |
| 215,000 | 3,800 | 40 | Excellent |
| 219,000 | 1,740 | 12 | Mint |
TABLE 4.5
Copyright ©2015 Pearson Education, Inc.
Jenny Wilson Realty
PROGRAM 4.2A – Input Screen for the Jenny Wilson Realty Multiple Regression Example
Copyright ©2015 Pearson Education, Inc.
4 – 58
Jenny Wilson Realty
Program 4.2B – Excel Output
Copyright ©2015 Pearson Education, Inc.
4 – 59
Evaluating Multiple Regression Models
Similar to simple linear regression models
The p-value for the F test and r2 interpreted the same
The hypothesis is different because there is more than one independent variable
The F test is investigating whether all the coefficients are equal to 0 at the same time
4 – 60
Copyright ©2015 Pearson Education, Inc.
Evaluating Multiple Regression Models
To determine which independent variables are significant, tests are performed for each variable
The test statistic is calculated and if the p-value is lower than the level of significance (), the null hypothesis is rejected
4 – 61
Copyright ©2015 Pearson Education, Inc.
Jenny Wilson Realty
Full model is statistically significant
Useful in predicting selling price
4 – 62
Copyright ©2015 Pearson Education, Inc.
Are both variables significant?
For X1 (square footage)
p-value for F test = 0.002 r2 = 0.6719
For a = 0.05, p-value = 0.0013 null hypothesis is rejected
For X1 (age)
For a = 0.05, p-value = 0.0039 null hypothesis is rejected
Full model is statistically significant
Useful in predicting selling price
Jenny Wilson Realty
4 – 63
Copyright ©2015 Pearson Education, Inc.
Are both variables significant?
For X1 – square footage
p-value for F test = 0.002 r2 = 0.6719
For a = 0.05, p-value = 0.0013 null hypothesis is rejected
For X1 – age
For a = 0.05, p-value = 0.0039 null hypothesis is rejected
Both square footage and age are helpful in predicting the selling price
Binary or Dummy Variables
Binary (or dummy or indicator) variables are special variables created for qualitative data
A dummy variable is assigned a value of 1 if a particular condition is met and a value of 0 otherwise
The number of dummy variables must equal one less than the number of categories of the qualitative variable
4 – 64
Copyright ©2015 Pearson Education, Inc.
Jenny Wilson Realty
A better model can be developed if information about the condition of the property is included
4 – 65
X3 = 1 if house is in excellent condition
= 0 otherwise
X4 = 1 if house is in mint condition
= 0 otherwise
Copyright ©2015 Pearson Education, Inc.
Two dummy variables are used to describe the three categories of condition
No variable is needed for “good” condition since if both X3 and X4 = 0, the house must be in good condition
Jenny Wilson Realty
PROGRAM 4.5A – Excel Input Screen with Dummy Variables
Copyright ©2015 Pearson Education, Inc.
4 – 66
Jenny Wilson Realty
PROGRAM 4.5B – Excel Output with Dummy Variables
Copyright ©2015 Pearson Education, Inc.
4 – 67
Jenny Wilson Realty
PROGRAM 4.5B – Excel Output with Dummy Variables
Copyright ©2015 Pearson Education, Inc.
4 – 68
Coefficient of determination, r2 = 0.898
Model Building
The best model is a statistically significant model with a high r2 and few variables
As more variables are added to the model, the r2 value increases
For this reason, the adjusted r2 value is often used to determine the usefulness of an additional variable
The adjusted r2 takes into account the number of independent variables in the model
4 – 69
Copyright ©2015 Pearson Education, Inc.
Model Building
The formula for r2
The formula for adjusted r2
As the number of variables increases, the adjusted r2 gets smaller unless the increase due to the new variable is large enough to offset the change in k
4 – 70
Copyright ©2015 Pearson Education, Inc.
The formula for r2
The formula for adjusted r2
As the number of variables increases, the adjusted r2 gets smaller unless the increase due to the new variable is large enough to offset the change in k
Model Building
4 – 71
Copyright ©2015 Pearson Education, Inc.
In general, if a new variable increases the adjusted r2, it should probably be included in the model
Model Building
Stepwise regression systematically adds or deletes independent variables
A forward stepwise procedure puts the most significant variable in first, adds the next variable that will improve the model the most
Backward stepwise regression begins with all the independent variables and deletes the least helpful
4 – 72
Copyright ©2015 Pearson Education, Inc.
Model Building
In some cases variables contain duplicate information
When two independent variables are correlated, they are said to be collinear
When more than two independent variables are correlated, multicollinearity exists
When multicollinearity is present, hypothesis tests for the individual coefficients are not valid but the model may still be useful
4 – 73
Copyright ©2015 Pearson Education, Inc.
Nonlinear Regression
In some situations, variables are not linear
Transformations may be used to turn a nonlinear model into a linear model
*
*
*
*
*
*
*
*
*
Linear relationship
Nonlinear relationship
*
*
*
*
*
*
*
*
*
*
*
Copyright ©2015 Pearson Education, Inc.
4 – 74
Colonel Motors
Use regression analysis to improve fuel efficiency
Study the impact of weight on miles per gallon (MPG)
4 – 75
Copyright ©2015 Pearson Education, Inc.
| MPG | WEIGHT (1,000 LBS.) | MPG | WEIGHT (1,000 LBS.) | |||
| 12 | 4.58 | 20 | 3.18 | |||
| 13 | 4.66 | 23 | 2.68 | |||
| 15 | 4.02 | 24 | 2.65 | |||
| 18 | 2.53 | 33 | 1.70 | |||
| 19 | 3.09 | 36 | 1.95 | |||
| 19 | 3.11 | 42 | 1.92 |
TABLE 4.6
Colonel Motors
4 – 76
Copyright ©2015 Pearson Education, Inc.
FIGURE 4.6A – Linear Model for MPG Data
| | | | | |
1.00 2.00 3.00 4.00 5.00
45 –
40 –
35 –
30 –
25 –
20 –
15 –
10 –
5 –
0 –
Weight (1,000 lb.)
MPG
Colonel Motors
PROGRAM 4.6 – Excel Output for Linear Regression Model with MPG Data
Copyright ©2015 Pearson Education, Inc.
4 – 77
Useful model
Small F test for significance
Good r2 value
Colonel Motors
4 – 78
Copyright ©2015 Pearson Education, Inc.
FIGURE 4.6B – Nonlinear Model for MPG Data
| | | | | |
1.00 2.00 3.00 4.00 5.00
45 –
40 –
35 –
30 –
25 –
20 –
15 –
10 –
5 –
0 –
Weight (1,000 lb.)
MPG
Colonel Motors
The nonlinear model is a quadratic model
The easiest approach – develop a new variable
4 – 79
Copyright ©2015 Pearson Education, Inc.
New model
Colonel Motors
PROGRAM 4.7 – Excel Output for Nonlinear Regression Model with MPG Data
Copyright ©2015 Pearson Education, Inc.
4 – 80
Improved model
Small F test for significance
Adjusted r2 and r2 both increased
Cautions and Pitfalls
If the assumptions are not met, the statistical test may not be valid
Correlation does not necessarily mean causation
Multicollinearity makes interpreting coefficients problematic, but the model may still be good
Using a regression model beyond the range of X is questionable, as the relationship may not hold outside the sample data
4 – 81
Copyright ©2015 Pearson Education, Inc.
Cautions and Pitfalls
A t-test for the intercept (b0) may be ignored as this point is often outside the range of the model
A linear relationship may not be the best relationship, even if the F test returns an acceptable value
A nonlinear relationship can exist even if a linear relationship does not
Even though a relationship is statistically significant it may not have any practical value
4 – 82
Copyright ©2015 Pearson Education, Inc.
Copyright
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America.
1
1.6 The Role of Computers and Spreadsheet Models in the Quantitative Analysis Approach
1.7 Possible Problems in the Quantitative Analysis Approach
1.8 Implementation—Not Just the Final Step
1.1 Introduction 1.2 What Is Quantitative Analysis? 1.3 Business Analytics 1.4 The Quantitative Analysis Approach 1.5 How to Develop a Quantitative Analysis Model
CHAPTER OUTLINE
5. Use computers and spreadsheet models to perform quantitative analysis.
6. Discuss possible problems in using quantitative analysis.
7. Perform a break-even analysis.
1. Describe the quantitative analysis approach. 2. Understand the application of quantitative analysis
in a real situation. 3. Describe the three categories of business analytics. 4. Describe the use of modeling in quantitative
analysis.
After completing this chapter, students will be able to:
Introduction to Quantitative Analysis
1CHAPTER
LEARNING OBJECTIVES
M01_REND7331_12_SE_C01_pp2.indd 1 01/10/13 9:50 AM
1
1.6 The Role of Computers and Spreadsheet Models in the Quantitative Analysis Approach
1.7 Possible Problems in the Quantitative Analysis Approach
1.8 Implementation—Not Just the Final Step
1.1 Introduction 1.2 What Is Quantitative Analysis? 1.3 Business Analytics 1.4 The Quantitative Analysis Approach 1.5 How to Develop a Quantitative Analysis Model
CHAPTER OUTLINE
5. Use computers and spreadsheet models to perform quantitative analysis.
6. Discuss possible problems in using quantitative analysis.
7. Perform a break-even analysis.
1. Describe the quantitative analysis approach. 2. Understand the application of quantitative analysis
in a real situation. 3. Describe the three categories of business analytics. 4. Describe the use of modeling in quantitative
analysis.
After completing this chapter, students will be able to:
Introduction to Quantitative Analysis
1CHAPTER
LEARNING OBJECTIVES
M01_REND7331_12_SE_C01_pp2.indd 1 01/10/13 9:50 AM
Y = β0 +β1X +ε
Y=b
0
+b
1
X+e
Ŷ = b0 +b1X
ˆ
Y=b
0
+b
1
X
e =Y −Ŷ
e=Y-
ˆ
Y
Ŷ = b0 +b1X
ˆ
Y=b
0
+b
1
X
X = X∑ n
= average (mean) of X values
Y = Y∑ n
= average (mean) of Y values
b1 = (X − X∑ )(Y −Y) (X − X)2∑
b0 =Y −b1X
X=
X
å
n
=average (mean) of X values
Y=
Y
å
n
=average (mean) of Y values
b
1
=
(X-X
å
)(Y-Y)
(X-X)
2
å
b
0
=Y-b
1
X
X = X∑ 6
= 24 6 = 4
X=
X
å
6
=
24
6
=4
Y = Y∑ 6
= 42 6 =7
Y=
Y
å
6
=
42
6
=7
b1 = (X – X∑ )(Y –Y) (X – X)2∑
= 12.5 10
=1.25
b0 =Y −b1X = 7 – (1.25)(4) = 2
b
1
=
(X–X
å
)(Y–Y)
(X–X)
2
å
=
12.5
10
=1.25
b
0
=Y-b
1
X=7 – (1.25)(4)=2
Ŷ = 2 + 1.25X
ˆ
Y=2 + 1.25X
Ŷ = 2 + 1.25(6) = 9.5 or $ 950,000
ˆ
Y=2 + 1.25(6)=9.5 or $ 950,000
SST = (Y −Y∑ )2
SST=(Y-Y
å
)
2
SSE= e2 = (Y −Ŷ)2∑∑
SSE=e
2
=(Y-
ˆ
Y)
2
åå
SSR= (Ŷ −Y)2∑
SSR=(
ˆ
Y-Y)
2
å
SST = SSR + SSE
SST=SSR + SSE
Ŷ = 2 + 1.25X
ˆ
Y = 2 + 1.25X
Y
Y
Y – Ŷ
Y –
ˆ
Y
Ŷ – Y
ˆ
Y – Y
Y – Y
Y – Y
r2 = SSR SST
=1 – SSE SST
r
2
=
SSR
SST
=1 –
SSE
SST
r2 = 15.625 22.5
=0.6944
r
2
=
15.625
22.5
=0.6944
r =± r2
r=±r
2
r = 0.6944 =0.8333
r=0.6944=0.8333
s2 =MSE= SSE n−k −1
s
2
=MSE=
SSE
n-k-1
s2 =MSE= SSE n−k −1
= 6.8750 6-1-1
= 6.8750 4
=1.7188
s
2
=MSE=
SSE
n-k-1
=
6.8750
6-1-1
=
6.8750
4
=1.7188
s = MSE = 1.7188 = 1.31
s = MSE = 1.7188= 1.31
Y = β0 +β1X +ε
Y=b
0
+b
1
X+e
MSR= SSR k
MSR=
SSR
k
F = MSR MSE
F=
MSR
MSE
H0 :β1 =0 H1 :β1 ≠0
H
0
:b
1
=0
H
1
:b
1
¹0
F = MSR MSE
F=
MSR
MSE
Reject if Fcalculated > Fα,df1,df2 df1 = k df2 = n−k −1
Reject if F
calculated
>F
a,df
1
,df
2
df
1
=k
df
2
=n-k-1
p-value= P(F > calculated test statistic) Reject if p-value<α
p-value=P(F>calculated test statistic)
Reject if p-value<a
MSR= SSR k
= 15.6250
1 =15.6250
F = MSR MSE
= 15.6250 1.7188
=9.09
MSR=
SSR
k
=
15.6250
1
=15.6250
F=
MSR
MSE
=
15.6250
1.7188
=9.09
Enter the past observations of Y and X. Results appear automatically.
To forecast Y based on any value of X, simply input the value of X here.
Correlation coef!cient is given here.
The intercept and slope are shown here.
After selecting the Forecasting module, click New and select Least Squares – Simple and Multiple Regression.
Specify the number of observations.
Specify the number of variables.
Click OK.
Ŷ = b0 +b1X1 +b2X2 +...+bkXk
ˆ
Y=b
0
+b
1
X
1
+b
2
X
2
+...+b
k
X
k
Ŷ
ˆ
Y
Ŷ = b0 +b1X1 +b2X2
ˆ
Y=b
0
+b
1
X
1
+b
2
X
2
Ŷ
ˆ
Y
The columns containing data for the X variables should be adjoining.
The X range is from the start of the !rst column to the bottom of the second column.
Ŷ = b0 +b1X1 +b2X2 =146,630.89+43.82X1 −2898.69X2
ˆ
Y=b
0
+b
1
X
1
+b
2
X
2
=146,630.89+43.82X
1
-2898.69X
2
The coef!cients are here.
The coef!cient of determination is 0.67.
A low Signi!cance F value indicates that the model is statistically signi!cant.
Ŷ =121,658 + 56.43X1 – 3,962X2 + 33,162X3 + 47,369X4
ˆ
Y=121,658 + 56.43X
1
– 3,962X
2
+ 33,162X
3
+ 47,369X
4
r2 = SSR SST
=1− SSE SST
r
2
=
SSR
SST
=1-
SSE
SST
Adjusted r2 =1− SSE / (n−k −1)
SST / (n−1)
Adjusted r
2
=1-
SSE/(n-k-1)
SST/(n-1)
Ŷ = 47.6−8.2X1 or MPG = 47.6 – 8.2 (weight in 1,000 lb.)
ˆ
Y=47.6-8.2X
1
or MPG= 47.6 – 8.2 (weight in 1,000 lb.)
X2 = (weight) 2
X
2
=(weight)
2
Ŷ = b0 +b1X1 +b2X2
ˆ
Y=b
0
+b
1
X
1
+b
2
X
2
Ŷ =79.8−30.2X1+3.4X2
ˆ
Y=79.8-30.2X
1
+3.4X
2