Statistic
Regression Analysis
using Excel 2007
MTH 305 Statistics
Data Needed in Regression Analysis
- At least two variables that have information about several observations
Only one variable will be defined as the Y variable. There can be one or more X variables in regression analysis.
| Observation ID | Variable 1 | Variable 2 |
| 1 | ||
| 2 | ||
| 3 |
Data Example
- For example, we are interested in analyzing the linear relationship between amount of sugar and calories in a box of cereals. We are testing whether sugar amount causes calories amount.
- In Excel the dataset will look like…see next slide
Data Example
Ways to Check Linear Relationship
Scatter Plot between Y and X
Correlation Value
Regression Analysis
SCATTER PLOT
Scatter-plot of Two variables
- Select the data of two variables you wish to analyze.
- Under Insert tab Chart and select “scatter plot”
- Example from data above:
Looks like there is no linear relationshiip!!!
CORRELATION COEFFICIENT
Correlation Value in Excel
- In any Excel cell, type:
=CORREL(range of Y data, range of X data)
- For example, for the dataset above (cereal data) where Y data are in cells B2 through B19 and X data are in cells C2 through C19, we will type:
=CORREL(B2:B19, C2:C19)
The result of 0.2296 shows that there is a weak relationship between those variables.
REGRESSION ANALYSIS
Regression Analysis
Under Data Data Analysis Regression
Excel Output: Intercept and Slope
The regression equation is:
| Regression Statistics | ||||||
| Multiple R | 0.76211 | |||||
| R Square | 0.58082 | |||||
| Adjusted R Square | 0.52842 | |||||
| Standard Error | 41.33032 | |||||
| Observations | 10 | |||||
| ANOVA | df | SS | MS | F | Significance F | |
| Regression | 1 | 18934.9348 | 18934.9348 | 11.0848 | 0.01039 | |
| Residual | 8 | 13665.5652 | 1708.1957 | |||
| Total | 9 | 32600.5000 | ||||
| Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | |
| Intercept | 98.24833 | 58.03348 | 1.69296 | 0.12892 | -35.57720 | 232.07386 |
| Square Feet | 0.10977 | 0.03297 | 3.32938 | 0.01039 | 0.03374 | 0.18580 |
Excel Output: R-squared
58.08% of the variation in house prices is explained by variation in square feet
| Regression Statistics | ||||||
| Multiple R | 0.76211 | |||||
| R Square | 0.58082 | |||||
| Adjusted R Square | 0.52842 | |||||
| Standard Error | 41.33032 | |||||
| Observations | 10 | |||||
| ANOVA | df | SS | MS | F | Significance F | |
| Regression | 1 | 18934.9348 | 18934.9348 | 11.0848 | 0.01039 | |
| Residual | 8 | 13665.5652 | 1708.1957 | |||
| Total | 9 | 32600.5000 | ||||
| Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | |
| Intercept | 98.24833 | 58.03348 | 1.69296 | 0.12892 | -35.57720 | 232.07386 |
| Square Feet | 0.10977 | 0.03297 | 3.32938 | 0.01039 | 0.03374 | 0.18580 |
Basic Business Statistics, 10e © 2006 Prentice-Hall, Inc.
Chap 13-*
Excel Output: Standard Error
| Regression Statistics | ||||||
| Multiple R | 0.76211 | |||||
| R Square | 0.58082 | |||||
| Adjusted R Square | 0.52842 | |||||
| Standard Error | 41.33032 | |||||
| Observations | 10 | |||||
| ANOVA | df | SS | MS | F | Significance F | |
| Regression | 1 | 18934.9348 | 18934.9348 | 11.0848 | 0.01039 | |
| Residual | 8 | 13665.5652 | 1708.1957 | |||
| Total | 9 | 32600.5000 | ||||
| Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | |
| Intercept | 98.24833 | 58.03348 | 1.69296 | 0.12892 | -35.57720 | 232.07386 |
| Square Feet | 0.10977 | 0.03297 | 3.32938 | 0.01039 | 0.03374 | 0.18580 |
To Plot Regression Line
Under the Regression window, place a mark
“Line Fit Plots”
Regression Line
- House price model: scatter plot and regression line
Slope
= 0.10977
Intercept
= 98.248
Chart2
| 1400 |
| 1600 |
| 1700 |
| 1875 |
| 1100 |
| 1550 |
| 2350 |
| 2450 |
| 1425 |
| 1700 |
Sheet4
| SUMMARY OUTPUT | ||||||
| Regression Statistics | ||||||
| Multiple R | 0.76211 | |||||
| R Square | 0.58082 | |||||
| Adjusted R Square | 0.52842 | |||||
| Standard Error | 41.33032 | |||||
| Observations | 10 | |||||
| ANOVA | ||||||
| df | SS | MS | F | Significance F | ||
| Regression | 1 | 18934.9348 | 18934.9348 | 11.08476 | 0.01039 | |
| Residual | 8 | 13665.5652 | 1708.1957 | |||
| Total | 9 | 32600.5000 | ||||
| Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | |
| Intercept | 98.24833 | 58.03348 | 1.69296 | 0.12892 | -35.57720 | 232.07386 |
| Square Feet | 0.10977 | 0.03297 | 3.32938 | 0.01039 | 0.03374 | 0.18580 |
| RESIDUAL OUTPUT | ||||||
| Observation | Predicted House Price | Residuals | ||||
| 1 | 251.9231625835 | -6.9231625835 | ||||
| 2 | 273.8767101495 | 38.1232898505 | ||||
| 3 | 284.8534839325 | -5.8534839325 | ||||
| 4 | 304.0628380528 | 3.9371619472 | ||||
| 5 | 218.9928412345 | -19.9928412345 | ||||
| 6 | 268.388323258 | -49.388323258 | ||||
| 7 | 356.2025135221 | 48.7974864779 | ||||
| 8 | 367.1792873051 | -43.1792873051 | ||||
| 9 | 254.6673560293 | 64.3326439707 | ||||
| 10 | 284.8534839325 | -29.8534839325 |
Sheet4
| 1400 | 1400 |
| 1600 | 1600 |
| 1700 | 1700 |
| 1875 | 1875 |
| 1100 | 1100 |
| 1550 | 1550 |
| 2350 | 2350 |
| 2450 | 2450 |
| 1425 | 1425 |
| 1700 | 1700 |
Sheet1
| House Price | Square Feet |
| 245 | 1400 |
| 312 | 1600 |
| 279 | 1700 |
| 308 | 1875 |
| 199 | 1100 |
| 219 | 1550 |
| 405 | 2350 |
| 324 | 2450 |
| 319 | 1425 |
| 255 | 1700 |
Sheet1
| 0 |
| 0 |
| 0 |
| 0 |
| 0 |
| 0 |
| 0 |
| 0 |
| 0 |
| 0 |
Sheet2
Sheet3
MULTIVARIATE REGRESSION
Multivariate Regression
Multivariate = two or more X variables than influence Y
- Scatter-Plot: get them separately for each pair of X and Y.
- Correlation Coefficient: compute them separately for each pair of X and Y.
- Regression Analysis: If we want to analyze how two or more X variables have an impact on Y, then we will do the same as above for the case of one X but select the data in all the X variables at the same time.
feet)
(square
0.10977
98.24833
price
house
+
=
0.58082
32600.5000
18934.9348
SST
SSR
r
2
=
=
=
0.03297
S
1
b
=
0
50
100
150
200
250
300
350
400
450
050010001500200025003000
Square Feet
House Price ($1000s)
feet)
(square
0.10977
98.24833
price
house
+
=
ProductCaloriesSugar (grams)
Kellogg's20018
Sam's Choice Extra raisin (Wal-Mart)21023
Kountry Fresh (Winn-Dixie)17017
Post Premium19020
American Fare (kmart)17017
America's Choice (A&P)20018
Safeway20018
Kroger20018
General Mills Total18019
Post The Original Shredded Wheat 'N Bran2001
Post The Original Shredded Wheat, Spoon Size1700
Kellogg's Raisin Squares Mini-Wheats18012
Healthy Choice Toasted Brown Sugar Squares1909
Kountry Fresh Frosted Bite Size (Winn-Dixie)20011
Post Frosted Bite Size19012
Kroger Frosted Bite Size19011
Kellogg's Frosted Mini-Wheats20012
Safeway Frosted Bite Size19011