Financial Engineering 5

References: Villalobos, Luenberger, Faerber

Lecture 12

Models and Parameter Estimation

Lecture Topics

• Introduction
• Use of Factor Models in CAPM
• Quick Review of Linear Regression Models
• Examples

Introduction

• Most of the models developed in Financial Engineering are based on assumptions that rely on distribution functions determined by certain parameters.

• For example, it is common to assume that the returns observed from financial securities follow normal or log-normal distributions.

• To completely define these distributions, we need to estimate the mean and standard deviation from data on the returns of the securities.

• To obtain valid estimates, we are concerned about issues such as:

  – Sample size
  – Time period we use to take the sample

Google’s Daily ROR

[Figure: histogram of Google’s daily rate of return (ROR). Horizontal axis: daily return bins from about -0.085 to 0.146, plus a "More" bin. Vertical axis: frequency, 0 to 200. Avg = 0.0023, StDev = 0.0214]

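Basic Statistics: Google

• Average Return: 0.002253

$$\bar{X} = \frac{\sum x_i}{n}$$

• Variance: 0.00046

$$s^2 = \frac{\sum (x_i - \bar{X})^2}{n - 1}$$

• Std Deviation: 0.0215

$$s = \sqrt{\frac{\sum (x_i - \bar{X})^2}{n - 1}}$$

• Coefficient of Variation: 9.523

$$CV = \frac{\sigma}{\bar{x}}$$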
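The four statistics above can be computed directly from a list of returns. The following is a minimal sketch in plain Python; the function name `basic_stats` and the five sample returns are hypothetical and are not the Google data from the slide.

```python
import math

def basic_stats(returns):
    """Return (mean, sample variance, sample std dev, coefficient of variation)."""
    n = len(returns)
    mean = sum(returns) / n
    # Sample variance uses the n - 1 divisor, matching the slide's formula.
    variance = sum((x - mean) ** 2 for x in returns) / (n - 1)
    std_dev = math.sqrt(variance)
    # Coefficient of variation: standard deviation relative to the mean.
    cv = std_dev / mean
    return mean, variance, std_dev, cv

# Hypothetical daily returns, for illustration only (not the Google data).
sample = [0.010, -0.020, 0.015, 0.005, -0.005]
mean, variance, std_dev, cv = basic_stats(sample)
print(f"mean={mean:.6f} var={variance:.7f} std={std_dev:.6f} cv={cv:.3f}")
```

A coefficient of variation well above 1, as in the Google figures, means the daily standard deviation is many times larger than the daily mean.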

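Sample Distributions

• Suppose that a random and independent sample (x1, . . . , xn) is taken from an infinite population with mean $\mu$ and variance $\sigma^2$.

  – The sample mean is:

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\left(x_1 + x_2 + \cdots + x_n\right)$$

  – Its expected value and variance are:

$$\mathrm{E}[\bar{x}] = \frac{1}{n}\left(\mu + \mu + \cdots + \mu\right) = \mu$$

$$\mathrm{Var}[\bar{x}] = \frac{1}{n^2}\,\mathrm{Var}\left[x_1 + x_2 + \cdots + x_n\right] = \frac{\sigma^2}{n}$$

• The distribution of the sample mean is Normal($\mu$, $\sigma^2/n$): exactly when the population is normal, and approximately for large n otherwise.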

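Sample Distributions

• Suppose that n random independent observations (x1, . . . , xn) are taken from a population.

  – The expected value, variance and standard deviation of the sample variance ($s^2$) are given by (the variance and standard deviation formulas assume a normal population):

$$\mathrm{E}[s^2] = \sigma^2, \qquad \mathrm{Var}[s^2] = \frac{2\sigma^4}{n-1}, \qquad \mathrm{Stdev}[s^2] = \sqrt{\frac{2}{n-1}}\,\sigma^2$$

where

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2$$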

Estimation Error

• What sample size do we need so that the estimation error is below 10%?
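One way to answer this question, assuming the 10% target refers to the standard deviation of the sample variance relative to the true variance, is to find the smallest n with $\sqrt{2/(n-1)} < 0.10$. That reading of the slide is an assumption, as is the normal-population formula it relies on. The sketch below uses a hypothetical helper `min_sample_size` and exact rational arithmetic, so the boundary case is not affected by floating-point rounding.

```python
from fractions import Fraction

def min_sample_size(max_rel_error):
    """Smallest n for which sqrt(2 / (n - 1)) is strictly below max_rel_error.

    max_rel_error is passed as a decimal string such as "0.10" so that the
    comparison is carried out exactly.
    """
    threshold_sq = Fraction(max_rel_error) ** 2
    n = 2
    # Compare squares: 2 / (n - 1) < max_rel_error^2.
    while Fraction(2, n - 1) >= threshold_sq:
        n += 1
    return n

n_required = min_sample_size("0.10")
print(n_required)
```

Under these assumptions the condition reduces to n - 1 > 200, so 202 observations are needed.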

Period-Length Effects

• Suppose the yearly return of a stock is:

$$1 + r_y$$

• The yearly return is considered to be the result of the 12 monthly returns:

$$1 + r_y = (1 + r_1)(1 + r_2) \cdots (1 + r_{12})$$

• Returns are measured in monthly terms, so for small values of the $r_i$'s the product can be expanded and the higher-order terms dropped:

$$1 + r_y \approx 1 + r_1 + r_2 + \cdots + r_{12}$$

• Therefore, the yearly rate of return is approximately equal to the sum of the 12 individual monthly returns:

$$r_y \approx \sum_{i=1}^{12} r_i, \qquad \bar{r}_y \approx 12\,\bar{r}$$

• The same logic is used to calculate the standard deviation:

$$\sigma_y^2 = 12\,\sigma^2$$

Period-Length Effects

• Assume the monthly returns of a given stock over a period of one year have the same statistical properties and are mutually uncorrelated.

• Each monthly return $r_i$ has the same expected value $\bar{r}$ and the same variance $\sigma^2$:

$$\bar{r} = \frac{1}{12}\,\bar{r}_y, \qquad \sigma = \frac{1}{\sqrt{12}}\,\sigma_y$$

• This can be generalized to a period of any length, where the period p is expressed as a fraction of a year.

• The expected return and standard deviation of the period rate of return are then:

$$\bar{r}_p = p\,\bar{r}_y, \qquad \sigma_p = \sqrt{p}\,\sigma_y$$

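Period-Length Effects

• The effect of the period length on the expected rate of return and the standard deviation of the period return is shown below:

[Figure: expected rate of return and standard deviation of the period return as functions of the period length. The values for a 1 year period are normalized to unity for both r and σ.]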

Example

• The yearly expected rate of return and standard deviation of a stock are 12% and 15%.

• Applying the formulas previously discussed with monthly periods, the standard deviation of the monthly return is 4.33 times the expected monthly rate of return:

$$p = \frac{1}{12}, \qquad \bar{r}_{1/12} = 1\%, \qquad \sigma_{1/12} = 4.33\%, \qquad \frac{\sigma_{1/12}}{\bar{r}_{1/12}} = 4.33$$

• The relative error increases as the period is shortened.

• Now let's use daily returns instead of monthly, assuming 250 trading days per year. The relative error increases and the ratio is now 19.8:

$$p = \frac{1}{250}, \qquad \bar{r}_{1/250} = 0.048\%, \qquad \sigma_{1/250} = 0.95\%, \qquad \frac{\sigma_{1/250}}{\bar{r}_{1/250}} = 19.8$$

• This ratio goes to infinity as the period length approaches zero.

• Therefore, rates of return for small periods have high standard deviations when compared to their expected values.
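The scaling in this example can be reproduced with a few lines of Python. This is a sketch under the slide's assumptions (identically distributed, mutually uncorrelated period returns); the function name `period_stats` is hypothetical.

```python
import math

def period_stats(yearly_mean, yearly_std, p):
    """Scale a yearly mean and std dev to a period of length p (fraction of a year).

    Returns (period mean, period std dev, std dev / mean).
    """
    mean_p = p * yearly_mean            # the mean scales linearly with p
    std_p = math.sqrt(p) * yearly_std   # the std dev scales with sqrt(p)
    return mean_p, std_p, std_p / mean_p

for label, p in [("monthly", 1 / 12), ("daily", 1 / 250)]:
    mean_p, std_p, ratio = period_stats(0.12, 0.15, p)
    print(f"{label}: mean={mean_p:.3%} std={std_p:.2%} ratio={ratio:.2f}")
```

Because the ratio equals the yearly ratio divided by the square root of p, it grows without bound as p shrinks, which is the effect described in the last two bullets.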

Mean Blur

• The amplification effect makes the estimation of expected rates of return nearly impossible.

• It is impossible to measure $\bar{r}$ with 100% accuracy by using historical data. The estimate from n observations is:

$$\hat{r} = \frac{1}{n}\sum_{i=1}^{n} r_i$$

• If we used a different set of n data points, we would obtain a different value of $\hat{r}$ even if the true mean $\bar{r}$ of the stock remained constant. The estimate is unbiased, with standard deviation $\sigma_{\hat{r}}$:

$$\mathrm{E}[\hat{r}] = \mathrm{E}\left[\frac{1}{n}\sum_{i=1}^{n} r_i\right] = \bar{r}, \qquad \sigma_{\hat{r}} = \frac{\sigma}{\sqrt{n}}$$

• Using less than 10 years of data does not give a good estimate.

• A good estimate would have a standard deviation of about one-tenth of the mean value itself.

• For the example previously discussed, we would need 156 years of historical data to achieve a good estimate.

• However, to get decent estimates of the Stdev, we do not need such a big sample!
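The 156-year figure follows from requiring the standard deviation of the estimate, $\sigma/\sqrt{n}$, to equal one-tenth of the monthly mean. A short sketch using the earlier example's yearly values (12% and 15%) is shown below; the function name `months_needed` is hypothetical.

```python
import math

def months_needed(monthly_mean, monthly_std, target_ratio=0.10):
    """Monthly observations n such that monthly_std / sqrt(n) = target_ratio * monthly_mean."""
    return (monthly_std / (target_ratio * monthly_mean)) ** 2

# Monthly values implied by a 12% yearly mean and a 15% yearly std dev.
monthly_mean = 0.12 / 12
monthly_std = 0.15 / math.sqrt(12)

n_months = months_needed(monthly_mean, monthly_std)
years = n_months / 12
print(f"{n_months:.0f} months, about {years:.0f} years")
```

The required sample grows with the square of the ratio between the standard deviation and the mean, which is why a ratio of 4.33 at the monthly level already demands well over a century of data.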

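Example

• A common practice is to estimate annual returns and StDev for stocks based on monthly rates of return as follows:

$$\mu_{annual} = 12\,\mu_{monthly}, \qquad \sigma_{annual} = \sqrt{12}\,\sigma_{monthly}$$

$$\mu_{annual} = 12\,(0.004592) = 0.0551$$

$$\sigma_{annual} = \sqrt{12}\,(0.024592) = 0.084976$$

• Another common practice is to use the last 36 monthly data points to estimate the mean and the variance.

  – The variance of the mean estimate and the standard deviation of the variance estimate are given by:

$$\mathrm{Var}[\bar{X}] = \frac{\sigma^2}{n} = \frac{\sigma^2}{36}, \qquad \mathrm{Stdev}[s^2] = \sqrt{\frac{2}{n-1}}\,\sigma^2 = \sqrt{\frac{2}{35}}\,\sigma^2$$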
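To see what these formulas imply for a 36-month sample, the sketch below evaluates them using the monthly mean and standard deviation quoted in this example. It is an illustration only: the variable names are hypothetical, and the variance formula assumes normally distributed returns.

```python
import math

n = 36                    # monthly observations, as in the text
mu_monthly = 0.004592     # monthly mean quoted in the example
sigma_monthly = 0.024592  # monthly std dev quoted in the example

# Annualized mean: 12 times the monthly mean.
mu_annual = 12 * mu_monthly

# Standard error of the sample mean: sigma / sqrt(n).
se_mean = sigma_monthly / math.sqrt(n)

# Std dev of s^2 relative to sigma^2: sqrt(2 / (n - 1)).
rel_std_var = math.sqrt(2 / (n - 1))

print(f"annual mean          = {mu_annual:.4f}")
print(f"std error of mean    = {se_mean:.6f} ({se_mean / mu_monthly:.0%} of the mean)")
print(f"relative std of s^2  = {rel_std_var:.3f}")
```

With 36 observations the standard error of the mean is of the same order as the mean itself, while the relative error of the variance estimate is considerably smaller.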

Applying Regression Models to Stock Data

• One of the applications for regression models is factor models.

• In the simplest case, the independent variable (x) is the monthly rate of return of the S&P500, and the dependent variable (y) is the monthly return of the stock being analyzed.

• This model is represented by the equation:

$$y_i = \alpha + \beta x_i + \varepsilon_i$$

• The β in this equation is the same β that we use in the CAPM to price stocks and to establish the relationship between a stock and the market.

  – In this case the market is represented by the S&P500 index.

• The α represents the average return above and beyond that predicted by the capital asset pricing model (CAPM), given the portfolio's beta and the average market return.

  – This is Jensen’s alpha.
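For this single-factor model, the least-squares slope equals the covariance of the stock and market returns divided by the variance of the market returns, and the intercept follows from the two means. The sketch below shows that calculation in plain Python. The function name `alpha_beta` and the return series are hypothetical and do not come from the lecture's data.

```python
def alpha_beta(market, stock):
    """Least-squares intercept (alpha) and slope (beta) of stock on market returns."""
    n = len(market)
    mean_x = sum(market) / n
    mean_y = sum(stock) / n
    # Sample covariance of market and stock, and sample variance of the market.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(market, stock)) / (n - 1)
    var_x = sum((x - mean_x) ** 2 for x in market) / (n - 1)
    beta = cov_xy / var_x
    alpha = mean_y - beta * mean_x
    return alpha, beta

# Hypothetical monthly returns, for illustration only.
market = [0.02, -0.01, 0.03, 0.00, -0.02, 0.01]
stock = [0.015, -0.002, 0.020, 0.004, -0.006, 0.008]
alpha, beta = alpha_beta(market, stock)
print(f"alpha={alpha:.6f} beta={beta:.6f}")
```

This is the same quantity produced by a spreadsheet slope/intercept calculation or by a general regression routine with one regressor.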

Linear Regression Review

• The model:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$$

is a multiple linear regression model with k independent variables.

• The term linear is used because y is a linear function of the unknown parameters $\beta_0, \beta_1, \beta_2, \ldots, \beta_k$.

• Models that may seem non-linear, such as:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \varepsilon$$

can be transformed to a linear model by defining $x_3 = x_1 x_2$ and $\beta_3 = \beta_{12}$:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$$

Regression Parameter Estimation

• The least squares method is used to estimate the regression coefficients.

• Each observed value yi is related to a set of values of the independent variables through the least squares function:

$$L = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n}\left(y_i - \beta_0 - \sum_{j=1}^{k} \beta_j x_{ij}\right)^2$$

where xij denotes the ith observation or level of variable xj.

• It is assumed that ε is a random variable with expected value equal to zero and a constant variance $\sigma^2$.

• The function L is minimized with respect to β0, β1, . . . , βk.

• By taking the derivatives with respect to each of these parameters and setting them equal to zero, we get the resulting Least Squares Normal Equations.

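Matrix Representation

• The Linear Model can be written in matrix notation as $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$, where:

$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad
\mathbf{X} = \begin{bmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1k} \\ 1 & x_{21} & x_{22} & \cdots & x_{2k} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{nk} \end{bmatrix}, \quad
\boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix}, \quad
\boldsymbol{\varepsilon} = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}$$

• The least squares normal equations can be represented in matrix form as $\mathbf{X}'\mathbf{X}\hat{\boldsymbol{\beta}} = \mathbf{X}'\mathbf{y}$ and the solution as $\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}$

• The regression model is then presented as $\hat{\mathbf{y}} = \mathbf{X}\hat{\boldsymbol{\beta}}$

• The covariance of the estimators of the regression coefficients ($\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_k$) is: $\mathrm{Cov}(\hat{\boldsymbol{\beta}}) = \sigma^2 (\mathbf{X}'\mathbf{X})^{-1}$

• $\sigma^2$ is estimated by

$$\hat{\sigma}^2 = MS_E = \frac{SS_E}{n - p}$$

where p is the number of regression parameters (k + 1) and SSE = SST - SSR, with:

$$SS_E = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} e_i^2$$

$$SS_T = \sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}$$

$$SS_R = \hat{\boldsymbol{\beta}}'\mathbf{X}'\mathbf{y} - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}$$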
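The matrix formulas on this slide can be sketched in plain Python without external libraries. The code below builds X with a leading column of ones, solves the normal equations X'X β = X'y by Gauss-Jordan elimination (a choice made here for self-containment, not something the lecture prescribes), and computes the three sums of squares. All function names and the small data set are hypothetical.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def ols(x_rows, y):
    """Return (beta_hat, ss_e, ss_r, ss_t) for y = X beta + eps with an intercept."""
    X = [[1.0] + list(row) for row in x_rows]   # leading column of ones
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(xi * yi for xi, yi in zip(col, y)) for col in Xt]
    beta = solve(XtX, Xty)                      # normal equations X'X beta = X'y
    n = len(y)
    correction = sum(y) ** 2 / n                # (sum of y_i)^2 / n
    ss_t = sum(v * v for v in y) - correction
    ss_r = sum(b * v for b, v in zip(beta, Xty)) - correction
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    ss_e = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return beta, ss_e, ss_r, ss_t

# Hypothetical data with k = 2 regressors, for illustration only.
x_rows = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y = [9.1, 7.9, 19.2, 17.8, 26.0]
beta_hat, ss_e, ss_r, ss_t = ols(x_rows, y)
p = len(beta_hat)
ms_e = ss_e / (len(y) - p)   # estimate of sigma^2
print(beta_hat, ss_e, ss_r, ss_t, ms_e)
```

For a model with an intercept, the computed values satisfy SST = SSR + SSE up to rounding, which gives a quick consistency check on any implementation.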

Analysis of Variance (ANOVA)

• The results of the analysis of variance are usually displayed in a table called the ANOVA Table:

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F0
Regression            SSR              k                    MSR           MSR / MSE
Error                 SSE              n-p                  MSE
Total                 SST              n-1

• A measure of how well the linear model fits the data is given by the coefficient of multiple determination, which is:

$$R^2 = \frac{SS_R}{SS_T}$$

Example

• Let’s use the same three years of S&P500 and Coca Cola (KO) monthly ROR data that we used in the previous lecture to estimate beta.

  – See Lecture 12 Examples Excel file.

$$\mathbf{X}'\mathbf{y} = \begin{bmatrix} 0.562200 \\ 0.039916 \end{bmatrix}, \qquad (\mathbf{X}'\mathbf{X})^{-1} = \begin{bmatrix} 0.032023 & -0.212831 \\ -0.212831 & 13.122910 \end{bmatrix}$$

$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y} = \begin{bmatrix} 0.009508 \\ 0.404167 \end{bmatrix}$$

$$y_i = \alpha + \beta x_i + \varepsilon_i = 0.009508 + 0.404167\,x_i + \varepsilon_i$$

  – Notice that this is the same result for beta that we achieved with the covariance process in the prior lecture.

  – We can also perform the regression using the regression tool in Excel’s Analysis ToolPak, found in the Data tab, as shown in the same Excel spreadsheet example.

             df   SS         MS         F          Significance F
Regression   1    0.012448   0.012448   7.130881   0.011668
Residual     33   0.057605   0.001746
Total        34   0.070053

Regression Statistics
Multiple R          0.421533639
R Square            0.177690608
Adjusted R Square   0.152772142
Standard Error      0.041780505
Observations        35
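The figures in this example can be cross-checked with a few lines of arithmetic. The sketch below multiplies the quoted inverse matrix by the quoted X'y vector and recomputes F, R Square and the standard error from the quoted sums of squares. The inputs are the rounded values printed on the slide, so the results agree with the Excel output only to about four significant digits. The variable names are hypothetical.

```python
# Values quoted in the example (rounded as printed).
XtX_inv = [[0.032023, -0.212831],
           [-0.212831, 13.122910]]
Xty = [0.562200, 0.039916]

# beta_hat = (X'X)^{-1} X'y, written out for the 2x2 case.
alpha_hat = XtX_inv[0][0] * Xty[0] + XtX_inv[0][1] * Xty[1]
beta_hat = XtX_inv[1][0] * Xty[0] + XtX_inv[1][1] * Xty[1]

# ANOVA quantities from the quoted sums of squares.
ss_r, ss_e, ss_t = 0.012448, 0.057605, 0.070053
n, p = 35, 2                 # observations and regression parameters
ms_r = ss_r / (p - 1)        # regression mean square, k = p - 1 = 1
ms_e = ss_e / (n - p)        # error mean square, the estimate of sigma^2
f0 = ms_r / ms_e
r_squared = ss_r / ss_t
std_error = ms_e ** 0.5

print(f"alpha={alpha_hat:.6f} beta={beta_hat:.6f}")
print(f"F0={f0:.4f} R^2={r_squared:.6f} std error={std_error:.6f}")
```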

Example

• Our calculated beta (0.404167) does not equal the beta reported in Yahoo Finance (0.440000). Why?

Assignments

• Luenberger Chapters 8 and 9, problems 8.1, 9.2, 9.9.

• Create your own Excel spreadsheet and process for performing the regression process to generate beta.

– Compute the α and β for stocks AA (Alcoa) and HD (Home Depot).

– You can download this data, or use the data found in the Lecture 12 Excel file.

• Begin reading Luenberger Chapter 10.
