Finance research report based on provided data
Data Analysis – 5 Multiple Regressions
FINA305/405
Agenda
Introduction: simple vs multiple regressions
Estimating and interpreting multiple regression results
F-test
Multiple regression example
Practice in Excel
Introduction
So far…
More than one X
Similarities
Linear combination (model) of the Xs.
Ordinary least squares (OLS) estimation (still) minimizes the sum of squared residuals.
Hypothesis testing (std, t, CI, etc.) on a single X is still the same.
The explanation for R-squared is still the same.
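To make the least-squares idea concrete, here is a minimal sketch in plain Python that solves the normal equations for a regression with two Xs. The data, the function names `ols` and `ssr`, and the choice of Gauss-Jordan elimination are illustrative assumptions, not part of the course material; in practice Excel or a statistics package does this step.

```python
def ols(rows, y):
    """Return OLS estimates [b0, b1, ..., bk] for y = b0 + b1*x1 + ... + bk*xk."""
    # Design matrix with a leading column of ones for the constant.
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(xi[a] * xi[b] for xi in X) for b in range(p)]
         + [sum(xi[a] * yi for xi, yi in zip(X, y))]
         for a in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [v - factor * w for v, w in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def ssr(rows, y, b):
    """Sum of squared residuals for coefficient vector b."""
    return sum((yi - (b[0] + sum(bj * xj for bj, xj in zip(b[1:], r)))) ** 2
               for r, yi in zip(rows, y))


# Hypothetical data: six observations on (X1, X2) and Y.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
y = [4.1, 3.8, 8.3, 7.9, 12.2, 11.7]

b = ols(rows, y)
print("estimates:", b)
print("SSR at the OLS estimates:", ssr(rows, y, b))
# Moving any coefficient away from its OLS value cannot lower the SSR.
print("SSR after nudging b1:", ssr(rows, y, [b[0], b[1] + 0.05, b[2]]))
```

The last two printed numbers illustrate the minimization property: the nudged coefficients always give a sum of squared residuals at least as large as the OLS one.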
Differences
We can perform multiple t-tests, achieve higher R-squared, and build a more theoretically complete model of Y
The effect of each independent variable on Y is estimated CONDITIONAL on the other independent variables
Calculation of $\hat{\beta}$ is harder for multiple OLS regressions.
Interpreting OLS Estimates in the Multiple Regression Model
Mathematical Intuition
Total vs. partial derivative
Simple regression:
Multiple Regression:
Interpretation of Multiple OLS Estimates, cont’d
Verbal intuition
$\beta_j$ is the (marginal/partial) effect of the jth explanatory variable on the dependent variable, holding all the other explanatory variables constant.
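The difference between a total and a partial effect can be seen in a small sketch. It uses made-up data in which Y is built exactly as 1 + 2·X1 − 3·X2 and the two Xs are positively correlated, so the true partial effect of X1 is 2. The partial effect is recovered by first removing from X1 the part explained by X2 (the Frisch-Waugh-Lovell result) and then running a simple regression. All names and numbers here are illustrative assumptions.

```python
def mean(v):
    return sum(v) / len(v)


def simple_ols(x, y):
    """Intercept and slope of a one-regressor OLS fit of y on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


# Hypothetical data: X1 and X2 move together; Y = 1 + 2*X1 - 3*X2 exactly.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [1.0 + 2.0 * a - 3.0 * b for a, b in zip(x1, x2)]

# Total effect: simple regression of Y on X1 alone (X2 is ignored).
_, total_slope = simple_ols(x1, y)

# Partial effect: strip out the part of X1 explained by X2,
# then regress Y on what is left of X1.
a0, a1 = simple_ols(x2, x1)
x1_resid = [a - (a0 + a1 * b) for a, b in zip(x1, x2)]
_, partial_slope = simple_ols(x1_resid, y)

print("simple-regression slope on X1:", total_slope)
print("partial effect of X1 holding X2 constant:", partial_slope)
```

In this constructed case the simple-regression slope is negative even though the partial effect is +2, because the simple regression also picks up the effect of X2 moving together with X1.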
F test (analysis of variance)
To measure the explanatory power of the whole model (or, equivalently, the significance of the R-squared).
The typical hypotheses:
H0: there is no relationship between Y and the Xs
H1: there is a relationship between Y and the Xs
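F statistic: $F_{k,\,n-k-1} = \dfrac{R^2 / k}{(1 - R^2) / (n - k - 1)}$
Decision rule: given a significance level α (equivalently, a confidence level of 1 − α), we reject H0 in favor of H1 if the calculated F value exceeds the critical value of the F distribution with k and n − k − 1 degrees of freedom.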
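As a quick arithmetic check, the overall F statistic can be written in the standard form F = (R²/k) / ((1 − R²)/(n − k − 1)). The sketch below applies it to the R Square, observation count, and number of regressors reported in the birth-weight Excel output later in these slides; the function name `f_from_r2` is an illustrative assumption.

```python
def f_from_r2(r2, n, k):
    """Overall F statistic with k and n-k-1 degrees of freedom, computed from R-squared."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Values taken from the birth-weight regression output in these slides.
f_stat = f_from_r2(r2=0.029804837, n=1388, k=2)
print(round(f_stat, 2))
```

The result should agree with the F value in the ANOVA block of the Excel output, up to rounding.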
Distribution
F-test example
The calculated F value in a regression model is 8.5. If there are 20 observations (n = 20) and 1 independent variable (k = 1) plus a constant, determine whether the model is significant at the 5% significance level.
The critical F value from the F distribution tables is 4.41 at the 5% significance level (verify by looking this up with numerator DF = k = 1 and denominator DF = n − k − 1 = 18).
Since the calculated F value is greater than the critical value (Fmodel > Fcritical at 5%), we reject H0 and conclude that the model is statistically significant at the 5% significance level.
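The table lookup can also be checked numerically. The sketch below is limited to the one-regressor case, where an F variable with 1 and n − k − 1 degrees of freedom equals the square of a t variable with n − k − 1 degrees of freedom; it integrates the t density with Simpson's rule and finds the critical value by bisection. The function names, the step count, and the bisection bounds are illustrative assumptions, and a statistics package or Excel's built-in F functions would normally be used instead.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))


def f_tail_one_regressor(f_value, df2, steps=2000):
    """P(F > f_value) for F(1, df2), using F(1, df2) = t(df2) squared."""
    upper = math.sqrt(f_value)
    h = upper / steps
    # Composite Simpson's rule for the t density over [0, upper].
    total = t_pdf(0.0, df2) + t_pdf(upper, df2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df2)
    return 1.0 - 2.0 * (total * h / 3.0)


def f_critical_one_regressor(alpha, df2):
    """Value c with P(F > c) = alpha for F(1, df2), found by bisection."""
    lo, hi = 0.0, 1000.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f_tail_one_regressor(mid, df2) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


n, k = 20, 1
df2 = n - k - 1                       # denominator DF = 18
f_calc = 8.5
f_crit = f_critical_one_regressor(0.05, df2)
p_value = f_tail_one_regressor(f_calc, df2)

print("critical F at 5%:", round(f_crit, 2))
print("p-value of F = 8.5:", round(p_value, 4))
print("reject H0:", f_calc > f_crit)
```

Comparing the calculated F with the critical value and comparing the p-value with 0.05 are two views of the same decision rule, so they always agree.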
Comparing F and t tests
| Test statistic | H0 | H1 | Result | p-value | Significance | Decision |
| F | No relationship between Y and all Xs (R2 = 0) | R2 ≠ 0 | Big F | Small p (< 0.05) | Yes (there is a relationship) | Reject H0, support H1 |
| F | No relationship between Y and all Xs (R2 = 0) | R2 ≠ 0 | Small F | Big p (> 0.05) | No (there is no relationship) | Do not reject H0 |
| t | No relationship between Y and the X (b = 0) | b ≠ 0 | Big t (> +2.0 or < -2.0) | Small p (< 0.05) | Yes (X is an important predictor) | Reject H0, support H1 |
| t | No relationship between Y and the X (b = 0) | b ≠ 0 | Small t (< +2.0 and > -2.0) | Big p (> 0.05) | No (X is not an important predictor) | Do not reject H0 |
Example: Explaining Birth Weight
Data on N = 1388 individuals
Dependent variable:
Y = birth weight of child, in pounds
Explanatory variables:
X1 = number of cigarettes smoked per day by pregnant mum
X2 = annual family income, in 1988 US dollars
NOTE k = ?
Example: Excel Output
Explaining Birth Weight
Fitted Regression Line:
Evaluate the following:
Significance of coefficient estimates
R2 and its significance
Do the results accord with common sense and/or formal theory in the areas of economics, general human behaviour, and health?
Interpretation: Birth Weight Results
On average, mum having one extra cigarette per day is expected to reduce the baby’s birth weight by 0.029 pounds, ceteris paribus (i.e., holding income constant)
Or (more intuitively)
If we compare individuals with the exact same income, mums who smoke 10 cigarettes per day are expected to have babies that weigh 0.29 pounds less than those of mums who do not smoke.
Interpretation: Birth Weight Results
On average, a family having one extra dollar of annual income is expected to increase the baby’s birth weight by 0.000006 pounds, ceteris paribus (i.e., holding mum’s smoking behaviour constant)
or
If we compare mums with the exact same number of cigarettes smoked per day, mums who have $10,000 more in family income are expected to have babies that weigh 0.06 pounds more.
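The two comparisons above are straightforward arithmetic on the fitted line. The sketch below uses the rounded estimates quoted in these slides (intercept 7.31, cigarette coefficient −0.029, income coefficient 0.000006); the function name `predicted_weight` and the baseline income of $30,000 are illustrative assumptions.

```python
# Rounded estimates quoted in the slides.
INTERCEPT = 7.31
B_CIGS = -0.029        # pounds per extra cigarette per day
B_INCOME = 0.000006    # pounds per extra dollar of family income


def predicted_weight(cigs_per_day, family_income):
    """Predicted birth weight in pounds from the fitted regression line."""
    return INTERCEPT + B_CIGS * cigs_per_day + B_INCOME * family_income


base_income = 30_000   # hypothetical income, held constant in the first comparison

# Same income, 10 cigarettes per day versus none.
smoking_gap = predicted_weight(10, base_income) - predicted_weight(0, base_income)

# Same smoking behaviour, $10,000 more family income.
income_gap = predicted_weight(0, base_income + 10_000) - predicted_weight(0, base_income)

print("effect of 10 cigarettes per day:", round(smoking_gap, 2), "pounds")
print("effect of $10,000 more income:", round(income_gap, 2), "pounds")
```

Because the model is linear, the gaps do not depend on the baseline income chosen; any other value gives the same differences.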
Practice in Excel
http://www.rbnz.govt.nz/statistics/key-graphs/key-graph-house-price-values
Regression (OLS)
Hypothesis test
Economic significance
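Simple regression (total derivative): $\beta = \dfrac{dY}{dX}$
Multiple regression (partial derivative): $\beta_j = \dfrac{\partial Y}{\partial X_j}$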
SUMMARY OUTPUT
Regression Statistics
| Statistic | Value |
| Multiple R | 0.172640775 |
| R Square | 0.029804837 |
| Adjusted R Square | 0.028403833 |
| Standard Error | 1.253926044 |
| Observations | 1388 |
ANOVA
| Source | df | SS | MS | F | Significance F |
| Regression | 2 | 66.89925326 | 33.449627 | 21.27392 | 7.94201E-10 |
| Residual | 1385 | 2177.677777 | 1.5723305 | | |
| Total | 1387 | 2244.57703 | | | |
Coefficient estimates
| Variable | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
| Intercept | 7.310883155 | 0.065561508 | 111.51182 | 0 | 7.182272571 | 7.439493739 |
| cigs | -0.028962971 | 0.005723551 | -5.0603146 | 4.75E-07 | -0.040190738 | -0.0177352 |
| Family Income | 5.7978E-06 | 1.82424E-06 | 3.178195 | 0.001515 | 2.21922E-06 | 9.37637E-06 |
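The main numbers in the Excel output are linked by simple formulas, which makes the output easy to sanity-check. The sketch below recomputes R Square, Adjusted R Square, the F statistic, and the t statistics from the sums of squares, coefficients, and standard errors printed in the output. Variable names are illustrative assumptions, and small differences from Excel are expected because the printed inputs are rounded.

```python
n, k = 1388, 2                          # observations and number of Xs

# Sums of squares from the ANOVA block.
ss_reg, ss_res, ss_total = 66.89925326, 2177.677777, 2244.57703

r2 = ss_reg / ss_total                                  # share of variation explained
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)           # penalised for extra Xs
f_stat = (ss_reg / k) / (ss_res / (n - k - 1))          # MS regression / MS residual

# (coefficient, standard error) pairs from the coefficient block.
coefs = {
    "cigs": (-0.028962971, 0.005723551),
    "Family Income": (5.7978e-06, 1.82424e-06),
}
t_stats = {name: b / se for name, (b, se) in coefs.items()}

print("R Square:", round(r2, 6))
print("Adjusted R Square:", round(adj_r2, 6))
print("F:", round(f_stat, 3))
for name, t in t_stats.items():
    print(name, "t Stat:", round(t, 3))
```

Both t statistics are larger than 2 in absolute value and the F statistic is large, so each coefficient and the model as a whole are statistically significant, even though R Square shows the two Xs explain only about 3% of the variation in birth weight.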
Fitted regression line: $\hat{Y} = 7.31 - 0.029\,X_1 + 0.000006\,X_2$
Cigarette coefficient: $\hat{\beta}_1 = -0.029$
Family income coefficient: $\hat{\beta}_2 = 0.000006$