Financial Engineering 5
References: Villalobos, Luenberger, Faerber
Lecture 12
Models and Parameter Estimation
Lecture Topics
• Introduction
• Use of Factor Models in CAPM
• Quick Review of Linear Regression Models
• Examples
Introduction
• Most of the models developed in Financial Engineering rest on assumptions about distribution functions, which are in turn determined by a small number of parameters.
• For example, it is common to assume that the returns observed from financial securities follow normal or log-normal distributions.
• To completely define these distributions we need to estimate the mean and standard deviation from data on the securities' returns.
• To obtain valid estimates we must consider issues such as:
– Sample size
– Time period over which the sample is taken
Google’s Daily ROR
[Figure: histogram of Google’s daily ROR. Horizontal axis: daily ROR bins from about -0.085 to 0.146, plus a "More" bin; vertical axis: frequency, 0 to 200.]
Avg = 0.0023, StDev = 0.0214
Basic Statistics Google
• Average Return 0.002253
• Variance 0.00046
• Std Deviation 0.0215
• Coefficient of Variation 9.523
$$\bar{X} = \frac{\sum x_i}{n}$$
$$s^2 = \frac{\sum (x_i - \bar{X})^2}{n - 1}$$
$$s = \sqrt{\frac{\sum (x_i - \bar{X})^2}{n - 1}}$$
$$CV = \frac{\sigma}{\bar{x}}$$
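The four statistics above can be computed directly from a list of returns. The following is a minimal sketch; the function name `basic_stats` and the five sample returns are made up for illustration and are not the Google data.

```python
import math

def basic_stats(returns):
    """Return (mean, sample variance, sample std deviation, coefficient of variation)."""
    n = len(returns)
    mean = sum(returns) / n
    # Sample variance uses the n - 1 divisor, as in the formula for s^2.
    var = sum((x - mean) ** 2 for x in returns) / (n - 1)
    std = math.sqrt(var)
    cv = std / mean  # undefined if the mean is zero
    return mean, var, std, cv

# Hypothetical daily returns, for illustration only.
sample = [0.010, -0.020, 0.015, 0.005, -0.005]
mean, var, std, cv = basic_stats(sample)
print(f"mean={mean:.6f} var={var:.8f} std={std:.6f} CV={cv:.3f}")
```

A large CV, as in the Google figures, means the dispersion of the daily returns is many times their average.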
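Sample Distributions
• Suppose that a random and independent sample (x1, . . . , xn) is taken from an infinitely large population with mean µ and variance σ².
– The sample mean is:
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\left(x_1 + x_2 + \cdots + x_n\right)$$
– Its expected value and variance are:
$$E[\bar{x}] = \frac{1}{n}\left(\mu + \mu + \cdots + \mu\right) = \mu$$
$$\mathrm{Var}[\bar{x}] = \frac{1}{n^2}\,\mathrm{Var}\left(x_1 + x_2 + \cdots + x_n\right) = \frac{\sigma^2}{n}$$
• The distribution of the sample mean is Normal(µ, σ²/n): exactly if the population is normal, approximately for large n otherwise.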
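The result Var[x̄] = σ²/n can be checked numerically. The sketch below assumes a normal population with illustrative parameters (µ = 0.002, σ = 0.02, n = 25); none of these values come from the lecture data.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed so the run is repeatable
mu, sigma, n, trials = 0.002, 0.02, 25, 20000

# Draw many independent samples of size n and take the mean of each one.
samples = rng.normal(mu, sigma, size=(trials, n))
means = samples.mean(axis=1)

empirical_var = means.var(ddof=1)       # variance of the simulated sample means
theoretical_var = sigma**2 / n          # sigma^2 / n from the slide

print(f"empirical Var = {empirical_var:.3e}, theoretical Var = {theoretical_var:.3e}")
```

The two variances should agree to within simulation noise, and the average of the simulated means should be close to µ.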
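Sample Distributions
• Suppose that n random independent observations (x1, . . . , xn) are taken from a population.
– The expected value, variance and standard deviation of the sample variance (s²) are given by (the variance and standard deviation formulas assume a normally distributed population):
$$E[s^2] = \sigma^2, \qquad \mathrm{Var}[s^2] = \frac{2\sigma^4}{n - 1}, \qquad \mathrm{Stdev}[s^2] = \sqrt{\frac{2}{n - 1}}\,\sigma^2$$
where
$$s^2 = \frac{1}{n - 1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2$$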
Estimation Error
• What sample size do we need for the estimation error to be below 10%?
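One way to read this question is in terms of the sample variance: from the Stdev[s²] formula on the previous slide, the relative standard deviation of s² is sqrt(2/(n - 1)). The sketch below searches for the smallest n that brings this below a target. Treating "the error" as this particular ratio is an assumption, and the function names are illustrative.

```python
import math

def relative_error_of_variance(n):
    """Stdev[s^2] / sigma^2 = sqrt(2 / (n - 1)), assuming normal observations."""
    return math.sqrt(2.0 / (n - 1))

def min_sample_size(target):
    """Smallest n whose relative error of s^2 is strictly below `target`."""
    n = 2
    while relative_error_of_variance(n) >= target:
        n += 1
    return n

n_needed = min_sample_size(0.10)
print(n_needed, relative_error_of_variance(n_needed))
```

Under this reading, roughly 200 observations are enough, which is less than a year of daily data.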
Period-Length Effects
• Suppose the yearly return of a stock is:
$$1 + r_y$$
• The yearly return is considered to be the result of the 12 monthly returns:
$$1 + r_y = (1 + r_1)(1 + r_2)\cdots(1 + r_{12})$$
• Returns are measured in monthly terms, so for small values of the $r_i$'s the product can be expanded and the higher-order terms dropped:
$$1 + r_y \approx 1 + r_1 + r_2 + \cdots + r_{12}$$
• Therefore, the yearly rate of return is approximately equal to the sum of the 12 individual monthly returns:
$$r_y \approx \sum_{i=1}^{n} r_i \quad (n = 12), \qquad \bar{r}_y \approx 12\,\bar{r}$$
• The same logic is used to calculate the variance, and hence the standard deviation:
$$\sigma_y^2 = 12\,\sigma^2$$
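To see how good the sum approximation is, the exact compounded yearly return can be compared with the simple sum of the monthly returns. The twelve monthly returns below are invented for illustration.

```python
import math

# Hypothetical monthly returns (small in magnitude, as the approximation requires).
monthly = [0.010, -0.005, 0.012, 0.008, -0.002, 0.004,
           0.007, -0.010, 0.015, 0.003, 0.006, -0.001]

exact = math.prod(1 + r for r in monthly) - 1   # (1+r1)(1+r2)...(1+r12) - 1
approx = sum(monthly)                           # r1 + r2 + ... + r12
error = exact - approx                          # the dropped cross-product terms

print(f"exact={exact:.6f} approx={approx:.6f} error={error:.6f}")
```

For returns of this size the dropped cross terms are well under a tenth of a percentage point; the approximation gets worse as the individual returns grow.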
Period-Length Effects
• Assume the monthly returns of a given stock over a period of one year have the same statistical properties and are mutually uncorrelated.
• Each monthly return $r_i$ has the same expected value $\bar{r}$ and the same variance $\sigma^2$, so:
$$\bar{r} = \frac{1}{12}\,\bar{r}_y, \qquad \sigma = \frac{1}{\sqrt{12}}\,\sigma_y$$
• This can be generalized to a period of any length, where the period p is expressed as a fraction of a year.
• The expected return and standard deviation of the period rate of return are then:
$$\bar{r}_p = p\,\bar{r}_y, \qquad \sigma_p = \sqrt{p}\,\sigma_y$$
Period-Length Effects
• The effect of the period length on the expected rate of return and the standard deviation of the period return is shown below:
The values for a 1 year period are normalized to unity for both r and σ.
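The scaling rules (expected return proportional to p, standard deviation proportional to the square root of p) can be tabulated with a few lines of code. This is a sketch: the helper name `period_stats` and the choice of periods (4 quarters, 52 weeks, 250 trading days per year) are assumptions made for illustration.

```python
import math

def period_stats(r_year, sigma_year, p):
    """Scale yearly mean and std deviation to a period of length p (fraction of a year)."""
    return p * r_year, math.sqrt(p) * sigma_year

# Yearly values normalized to 1 for both r and sigma.
periods = {"1 year": 1.0, "1 quarter": 1 / 4, "1 month": 1 / 12,
           "1 week": 1 / 52, "1 day": 1 / 250}

for label, p in periods.items():
    r_p, s_p = period_stats(1.0, 1.0, p)
    print(f"{label:10s} r = {r_p:.4f}  sigma = {s_p:.4f}  sigma/r = {s_p / r_p:.2f}")
```

The last column grows as the period shrinks, because the mean shrinks faster than the standard deviation.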
Example
• The yearly expected rate of return and standard deviation of a stock are 12% and 15%.
• Applying the formulas previously discussed with p = 1/12 gives:
$$p = \tfrac{1}{12}: \qquad \bar{r}_{1/12} = 1\%, \qquad \sigma_{1/12} = 4.33\%, \qquad \frac{\sigma_{1/12}}{\bar{r}_{1/12}} = 4.33$$
so the standard deviation of the monthly return is about 4.3 times its expected rate of return.
• The relative error increases as the period is shortened.
• Now let's use daily returns instead of monthly, assuming 250 trading days per year:
$$p = \tfrac{1}{250}: \qquad \bar{r}_{1/250} = 0.048\%, \qquad \sigma_{1/250} = 0.95\%, \qquad \frac{\sigma_{1/250}}{\bar{r}_{1/250}} = 19.8$$
• The relative error has increased, and the ratio is now 19.8.
• This ratio goes to infinity as the period length approaches zero.
• Therefore, rates of return for small periods have high standard deviations when compared to their expected values.
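The two ratios follow from one expression. Dividing the period standard deviation by the period mean, using the scaling rules for a period p, gives the worked form below; it also shows why the ratio diverges as p approaches zero.

```latex
\frac{\sigma_p}{\bar{r}_p}
  = \frac{\sqrt{p}\,\sigma_y}{p\,\bar{r}_y}
  = \frac{1}{\sqrt{p}}\cdot\frac{\sigma_y}{\bar{r}_y}
\qquad\Longrightarrow\qquad
\begin{aligned}
p = \tfrac{1}{12}:  &\quad \sqrt{12}\cdot\frac{0.15}{0.12} \approx 3.46 \times 1.25 \approx 4.33 \\
p = \tfrac{1}{250}: &\quad \sqrt{250}\cdot\frac{0.15}{0.12} \approx 15.81 \times 1.25 \approx 19.8
\end{aligned}
```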
Mean Blur
• The amplification effect makes the estimation of expected rates of return nearly impossible.
• It is impossible to measure $\bar{r}$ with 100% accuracy using historical data.
• If we used a different set of n data points, we would obtain a different value of $\hat{r}$ even if the true mean of the stock remained constant:
$$\hat{r} = \frac{1}{n}\sum_{i=1}^{n} r_i, \qquad E(\hat{r}) = E\left[\frac{1}{n}\sum_{i=1}^{n} r_i\right] = \bar{r}, \qquad \sigma_{\hat{r}} = \frac{\sigma}{\sqrt{n}}$$
• Fewer than 10 years of data does not give a good estimate.
• A good estimate would have a standard deviation of about one-tenth of the mean value itself.
• For the example previously discussed, we would need 156 years of historical data to achieve a good estimate.
• However, to get decent estimates of the StDev, we do not need such a big sample!
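The 156-year figure can be reproduced from the standard error of the mean, σ/√n. Requiring it to equal one-tenth of the monthly mean and solving for n gives the number of months. The sketch below uses the monthly values from the earlier example (mean 1%, standard deviation 4.33%); the function name is illustrative.

```python
def months_needed(r_month, sigma_month, rel_error=0.10):
    """Solve sigma / sqrt(n) = rel_error * r_month for n (number of monthly observations)."""
    return (sigma_month / (rel_error * r_month)) ** 2

n_months = months_needed(0.01, 0.0433)
years = n_months / 12
print(f"{n_months:.0f} months, about {years:.0f} years")
```

Because n grows with the square of σ/r̄, halving the tolerated error would quadruple the amount of history required.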
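Example
• A common practice is to estimate annual returns and StDev for stocks based on monthly rates of return as follows:
$$\mu_{annual} = 12\,\mu_{monthly}, \qquad \sigma_{annual} = \sqrt{12}\,\sigma_{monthly}$$
$$\mu_{annual} = 12\,(0.004592) = 0.0551$$
$$\sigma_{annual} = \sqrt{12}\,(0.024592) = 0.084976$$
• Another common practice is to use the last 36 monthly data points to estimate the mean and the variance.
– The variance of the estimated mean and the standard deviation of the estimated variance are given by:
$$\mathrm{Var}[\bar{X}] = \frac{\sigma^2}{n} = \frac{\sigma^2}{36}$$
$$\mathrm{Stdev}[s^2] = \sqrt{\frac{2}{n - 1}}\,\sigma^2 = \sqrt{\frac{2}{35}}\,\sigma^2$$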
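Both practices can be sketched in a few lines. The inputs below (monthly mean 0.5%, monthly standard deviation 2.5%) are hypothetical round numbers, not the values from the slide, and the function names are illustrative.

```python
import math

def annualize(mu_monthly, sigma_monthly):
    """Annual mean = 12 * monthly mean; annual std dev = sqrt(12) * monthly std dev."""
    return 12 * mu_monthly, math.sqrt(12) * sigma_monthly

def estimate_precision(sigma_monthly, n=36):
    """Standard error of the sample mean, and Stdev[s^2] relative to sigma^2."""
    se_mean = sigma_monthly / math.sqrt(n)
    rel_sd_var = math.sqrt(2 / (n - 1))
    return se_mean, rel_sd_var

mu_a, sigma_a = annualize(0.005, 0.025)          # hypothetical monthly inputs
se_mean, rel_sd_var = estimate_precision(0.025)  # 36 monthly observations

print(f"annual mean={mu_a:.4f}, annual std={sigma_a:.4f}")
print(f"std error of monthly mean={se_mean:.5f}, relative std of s^2={rel_sd_var:.3f}")
```

With these inputs the standard error of the monthly mean is almost as large as the mean itself, while the variance estimate is good to roughly a quarter of its value.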
Applying Regression Models to Stock Data
• One application of regression models is factor models.
• In the simplest case, the independent variable (x) is the monthly rate of return of the S&P500, and the dependent variable (y) is the monthly return of the stock being analyzed.
• This model is represented by the equation:
$$y_i = \alpha + \beta x_i + \varepsilon$$
• The β in this equation is the same β that we use in the CAPM to price stocks and to establish the relationship between a stock and the market.
– In this case the market is represented by the S&P500 index.
• The α represents the average return above and beyond that predicted by the capital asset pricing model (CAPM), given the portfolio's beta and the average market return.
– This is known as Jensen’s alpha.
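For this single-factor model, the regression slope coincides with the covariance-based beta, Cov(x, y) / Var(x). The sketch below illustrates this on synthetic monthly returns; the data are randomly generated and the "true" α and β used to generate them are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
# Synthetic data: 36 months of market (x) and stock (y) returns.
market = rng.normal(0.008, 0.04, size=36)
stock = 0.002 + 0.8 * market + rng.normal(0.0, 0.03, size=36)

# Least squares fit of y = alpha + beta * x (polyfit returns slope first).
beta_reg, alpha_reg = np.polyfit(market, stock, deg=1)

# Covariance approach: beta = Cov(x, y) / Var(x), alpha = mean(y) - beta * mean(x).
beta_cov = np.cov(market, stock, ddof=1)[0, 1] / np.var(market, ddof=1)
alpha_cov = stock.mean() - beta_cov * market.mean()

print(f"regression: alpha={alpha_reg:.6f}, beta={beta_reg:.6f}")
print(f"covariance: alpha={alpha_cov:.6f}, beta={beta_cov:.6f}")
```

The two approaches give the same numbers up to floating-point rounding, so either can be used to estimate beta from return data.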
Linear Regression Review
• The model:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$$
is a multiple linear regression model with k independent variables.
• The term linear is used because y is a linear function of the unknown parameters $\beta_0, \beta_1, \beta_2, \ldots, \beta_k$.
• Models that may seem non-linear, such as:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \varepsilon$$
can be transformed to a linear model by defining $x_3 = x_1 x_2$ and $\beta_3 = \beta_{12}$:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$$
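The substitution x3 = x1·x2 amounts to adding one more column to the data before fitting. The sketch below builds that column and recovers the coefficients with ordinary least squares; the data are synthetic and noise-free, and the coefficient values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.uniform(-1, 1, size=50)
x2 = rng.uniform(-1, 1, size=50)

# Coefficients beta0, beta1, beta2, beta12 chosen arbitrarily for the illustration.
true_beta = np.array([1.0, 2.0, -3.0, 0.5])

# Define x3 = x1 * x2 so the interaction model becomes linear in (x1, x2, x3).
x3 = x1 * x2
X = np.column_stack([np.ones_like(x1), x1, x2, x3])
y = X @ true_beta   # noise-free response, so the fit should recover true_beta

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)
```

The model stays linear in the parameters even though it is non-linear in the original variables, which is all that least squares requires.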
Regression Parameter Estimation
• The least squares method is used to estimate the regression coefficients.
• Each observed value $y_i$ is related to a set of values of the independent variables through the least squares function:
$$L = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n}\left(y_i - \beta_0 - \sum_{j=1}^{k} \beta_j x_{ij}\right)^2$$
where $x_{ij}$ denotes the ith observation or level of variable $x_j$.
• It is assumed that ε is a random variable with expected value equal to zero and a constant variance σ².
• The function L is minimized with respect to β0, β1, . . . , βk.
• Taking the derivatives of L with respect to each of these parameters and setting them equal to zero gives the Least Squares Normal Equations.
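Written out, the differentiation step looks as follows. This is a standard derivation added for reference; the summation index l is introduced here only to avoid a clash with the index j of the parameter being differentiated.

```latex
\frac{\partial L}{\partial \beta_0}
  = -2\sum_{i=1}^{n}\Bigl(y_i - \hat\beta_0 - \sum_{l=1}^{k}\hat\beta_l x_{il}\Bigr) = 0
\qquad
\frac{\partial L}{\partial \beta_j}
  = -2\sum_{i=1}^{n}\Bigl(y_i - \hat\beta_0 - \sum_{l=1}^{k}\hat\beta_l x_{il}\Bigr)x_{ij} = 0,
  \quad j = 1,\dots,k

% Rearranged, these are the p = k + 1 normal equations:
n\hat\beta_0 + \sum_{l=1}^{k}\hat\beta_l\sum_{i=1}^{n}x_{il} = \sum_{i=1}^{n}y_i
\qquad
\hat\beta_0\sum_{i=1}^{n}x_{ij} + \sum_{l=1}^{k}\hat\beta_l\sum_{i=1}^{n}x_{il}x_{ij}
  = \sum_{i=1}^{n}x_{ij}y_i, \quad j = 1,\dots,k
```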
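Matrix Representation
• The linear model can be written in matrix notation as $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$, where:
$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad
\mathbf{X} = \begin{bmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1k} \\ 1 & x_{21} & x_{22} & \cdots & x_{2k} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{nk} \end{bmatrix}, \quad
\boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix}, \quad
\boldsymbol{\varepsilon} = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}$$
• The least squares normal equations can be represented in matrix form as $\mathbf{X}'\mathbf{X}\hat{\boldsymbol{\beta}} = \mathbf{X}'\mathbf{y}$ and the solution as $\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}$.
• The fitted regression model is then $\hat{\mathbf{y}} = \mathbf{X}\hat{\boldsymbol{\beta}}$.
• The covariance matrix of the estimators of the regression coefficients $(\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_k)$ is: $\mathrm{Cov}(\hat{\boldsymbol{\beta}}) = \sigma^2 (\mathbf{X}'\mathbf{X})^{-1}$
• σ² is estimated by
$$\hat\sigma^2 = MS_E = \frac{SS_E}{n - p}$$
where p is the number of regression parameters (k + 1) and SSE = SST - SSR, with:
$$SS_E = \sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 = \sum_{i=1}^{n} e_i^2$$
$$SS_T = \sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}$$
$$SS_R = \hat{\boldsymbol{\beta}}'\mathbf{X}'\mathbf{y} - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}$$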
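The matrix formulas translate almost line for line into NumPy. The following is a sketch on synthetic data (35 made-up monthly observations, one regressor); the function name `ols` is illustrative. It forms the explicit inverse to mirror the slide, although a solver such as `np.linalg.lstsq` is usually preferred numerically.

```python
import numpy as np

def ols(X, y):
    """Least squares via the normal equations, plus the sums of squares from the slide."""
    n, p = X.shape                        # p = k + 1 parameters, including the intercept
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y          # (X'X)^-1 X'y
    correction = y.sum() ** 2 / n         # (sum of y_i)^2 / n
    ss_t = y @ y - correction             # SST
    ss_r = beta_hat @ X.T @ y - correction  # SSR
    ss_e = ss_t - ss_r                    # SSE = SST - SSR
    ms_e = ss_e / (n - p)                 # estimate of sigma^2
    cov_beta = ms_e * XtX_inv             # estimated Cov(beta_hat)
    return beta_hat, ss_t, ss_r, ss_e, ms_e, cov_beta

# Synthetic data for illustration only.
rng = np.random.default_rng(7)
x = rng.normal(0.01, 0.04, size=35)
y = 0.005 + 0.6 * x + rng.normal(0, 0.03, size=35)
X = np.column_stack([np.ones_like(x), x])   # first column of ones for beta_0

beta_hat, ss_t, ss_r, ss_e, ms_e, cov_beta = ols(X, y)
print("beta_hat:", beta_hat)
print("std errors:", np.sqrt(np.diag(cov_beta)))
```

The square roots of the diagonal of the estimated covariance matrix are the standard errors of the coefficients, the same quantities Excel reports next to each coefficient.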
Analysis of Variance (ANOVA)
• The results of the analysis of variance are usually displayed in a table called the ANOVA table:
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F0
Regression            SSR              k                    MSR           MSR / MSE
Error                 SSE              n - p                MSE
Total                 SST              n - 1
• A measure of how well the linear model fits the data is given by the coefficient of multiple determination:
$$R^2 = \frac{SS_R}{SS_T}$$
Example
• Let’s use the same three years of S&P500 and Coca Cola (KO) monthly ROR data that we used in the previous lecture to estimate beta.
– See Lecture 12 Examples Excel file.
$$\mathbf{X}'\mathbf{y} = \begin{bmatrix} 0.562200 \\ 0.039916 \end{bmatrix}, \qquad
(\mathbf{X}'\mathbf{X})^{-1} = \begin{bmatrix} 0.032023 & -0.212831 \\ -0.212831 & 13.122910 \end{bmatrix}$$
$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y} = \begin{bmatrix} 0.009508 \\ 0.404167 \end{bmatrix}$$
$$y_i = \alpha + \beta x_i + \varepsilon = 0.009508 + 0.404167\,x_i + \varepsilon$$
– Notice that this is the same result for beta that we achieved with the covariance process in the prior lecture.
– We can also perform the regression using Excel’s Analysis ToolPak regression tool, found in the Data tab, as shown in the same Excel spreadsheet example:
Regression Statistics
Multiple R           0.421533639
R Square             0.177690608
Adjusted R Square    0.152772142
Standard Error       0.041780505
Observations         35
ANOVA
             df   SS         MS         F          Significance F
Regression   1    0.012448   0.012448   7.130881   0.011668
Residual     33   0.057605   0.001746
Total        34   0.070053
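The reported figures can be cross-checked against the formulas from the earlier slides. The sketch below multiplies the reported (X'X)⁻¹ and X'y, then recomputes F, R², adjusted R² and the standard error from the reported sums of squares. The inputs are the rounded values shown above, so the results match the Excel output only to about four significant figures.

```python
import numpy as np

# Reported matrices from the example (rounded to six decimals).
XtX_inv = np.array([[0.032023, -0.212831],
                    [-0.212831, 13.122910]])
Xty = np.array([0.562200, 0.039916])
alpha, beta = XtX_inv @ Xty            # beta_hat = (X'X)^-1 X'y

# Reported sums of squares and sample size.
ss_r, ss_e, ss_t = 0.012448, 0.057605, 0.070053
n, k = 35, 1
p = k + 1

ms_r = ss_r / k
ms_e = ss_e / (n - p)
f0 = ms_r / ms_e                                # F statistic
r2 = ss_r / ss_t                                # R Square
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p)       # Adjusted R Square
std_err = ms_e ** 0.5                           # Standard Error of the regression

print(f"alpha={alpha:.6f} beta={beta:.6f}")
print(f"F={f0:.4f} R2={r2:.6f} adjR2={adj_r2:.6f} SE={std_err:.6f}")
```

An R² of about 0.18 says the market factor explains less than a fifth of the variation in KO's monthly returns over this sample, even though the F test is significant at the 5% level.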
Example
• Our calculated beta (0.404167) does not equal the beta reported in Yahoo Finance (0.440000). Why?
Assignments
• Luenberger Chapters 8 and 9: problems 8.1, 9.2 and 9.9.
• Create your own Excel spreadsheet for running the regression that generates beta.
– Compute the α and β for stocks AA (Alcoa) and HD (Home Depot).
– You can download this data, or use the data found in the Lecture 12 Excel file.
• Begin reading Luenberger Chapter 10.