Urgent Econometrics test

InstrumentalVariablesRegression.pptx

Instrumental Variables Regression

A10.1 yi = β1 + β2xi + ei correctly describes the relationship between yi and xi in the population, where β1 and β2 are unknown (fixed) parameters and ei is an unobservable random error term.

A10.2 The data pairs (xi, yi), i = 1, …, N, are obtained by random sampling. That is, the data pairs are collected from the same population, by a process in which each pair is independent of every other pair. Such data are said to be independent and identically distributed.

A10.3 The expected value of the error term e, conditional on the value of x, is zero.

- If E(e|x) = 0, then we can show that it is also true that x and e are uncorrelated, and that cov(x, e) = 0. Explanatory variables that are not correlated with the error term are called exogenous variables.

- Conversely, if x and e are correlated, then cov(x, e) ≠ 0 and we can show that E(e|x) ≠ 0. Explanatory variables that are correlated with the error term are called endogenous variables.

Linear Regression with Random x’s

A10.4 In the sample, x must take at least two different values.

A10.5 var(e|x) = σ². The variance of the error term, conditional on any x, is a constant σ².

A10.6 The distribution of the error term is normal.

Assumption A10.2 states that both y and x are obtained by a sampling process, and thus are random

- This is the only new assumption on our list

The result that, under the classical assumptions and fixed x’s, the least squares estimator is the best linear unbiased estimator is a finite-sample, or small-sample, property of the least squares estimator

- This means that the result does not depend on the size of the sample

Under assumptions A10.1–A10.6:

- The least squares estimator is unbiased

- The least squares estimator is the best linear unbiased estimator of the regression parameters, and the usual estimator of σ² is unbiased

- The distributions of the least squares estimators, conditional upon the x’s, are normal, and their variances are estimated in the usual way

- The usual interval estimation and hypothesis testing procedures are valid

If x is random, as long as the data are obtained by random sampling and the other usual assumptions hold, no changes in our regression methods are required

For the purposes of a ‘‘large sample’’ analysis of the least squares estimator, it is convenient to replace assumption A10.3 by:

A10.3* E(e) = 0 and cov(x, e) = 0

Now we can say:

Under assumptions A10.1, A10.2, A10.3*, A10.4, and A10.5, the least squares estimators:

- Are consistent.

- They converge in probability to the true parameter values as N→∞.

- Have approximate normal distributions in large samples, whether the errors are normally distributed or not.

- Our usual interval estimators and test statistics are valid, if the sample is large.

- If assumption A10.3* is not true, and in particular if cov(x,e) ≠ 0 so that x and e are correlated, then the least squares estimators are inconsistent.

- They do not converge to the true parameter values even in very large samples.

None of our usual hypothesis testing or interval estimation procedures are valid

FIGURE 10.1 (a) Correlated x and e

FIGURE 10.1 (b) Plot of data, true and fitted regression functions

The statistical consequence of correlation between x and e is that the least squares estimator is biased, and this bias will not disappear no matter how large the sample

- Consequently the least squares estimator is inconsistent when there is correlation between x and e
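A minimal simulation sketch of this point, using only the Python standard library. Everything here is an assumption made for illustration: the function names, the parameter values β1 = β2 = 1, and the design with var(x) = var(e) = 1 and corr(x, e) = rho. Under that design the least squares slope converges to β2 + cov(x, e)/var(x) = 1 + rho, so a larger sample does not move it toward β2.

```python
import random

def ols_slope(x, y):
    """Least squares slope: sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx

def simulated_slope(n, rho, beta1=1.0, beta2=1.0, seed=0):
    """OLS slope from n draws in which corr(x, e) = rho and var(x) = var(e) = 1."""
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n):
        xi = rng.gauss(0.0, 1.0)
        # e shares the component rho * x with the regressor (hypothetical design)
        ei = rho * xi + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        x.append(xi)
        y.append(beta1 + beta2 * xi + ei)
    return ols_slope(x, y)

if __name__ == "__main__":
    # Columns: sample size, slope with exogenous x, slope with endogenous x
    for n in (100, 10_000, 100_000):
        print(n, round(simulated_slope(n, 0.0), 3), round(simulated_slope(n, 0.6), 3))
```

With rho = 0 the slope should settle near the true value 1 as N grows, while with rho = 0.6 it should settle near 1.6 instead.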

Endogeneity: When an explanatory variable and the error term are correlated, the explanatory variable is said to be endogenous

- This term comes from simultaneous equations models

- It means ‘‘determined within the system’’

- Using this terminology when an explanatory variable is correlated with the regression error, one is said to have an ‘‘endogeneity problem’’

1. Measurement Error: The errors-in-variables problem occurs when an explanatory variable is measured with error

- If we measure an explanatory variable with error, then it is correlated with the error term, and the least squares estimator is inconsistent

Cases in Which x and e are Correlated

Let y = annual savings and x* = the permanent annual income of a person

A simple regression model is:

- Current income is a measure of permanent income, but it does not measure permanent income exactly.

- It is sometimes called a proxy variable

- To capture this feature, specify that:

Substituting:

(10.3)

In order to estimate Eq. 10.3 by least squares, we must determine whether or not x is uncorrelated with the random disturbance e

- The covariance between these two random variables, using the fact that E(e) = 0, is:

(10.4)

The least squares estimator b2 is an inconsistent estimator of β2 because of the correlation between the explanatory variable and the error term

- Consequently, b2 does not converge to β2 in large samples

- In large or small samples, b2 is not approximately normal with mean β2 and variance $\mathrm{var}(b_2) = \sigma^2 / \sum (x_i - \bar{x})^2$
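The covariance in Eq. 10.4 can be checked numerically. The sketch below is illustrative only: it assumes classical measurement error (u independent of x* and v), standard normal x* and v, β1 = β2 = 1 and σ_u = 1, none of which come from the slides. Under these assumptions cov(x, e) = −β2σ_u² = −1 and the least squares slope converges to β2 + cov(x, e)/var(x) = 1 − 1/2 = 0.5.

```python
import random

def errors_in_variables(n, beta1=1.0, beta2=1.0, sigma_u=1.0, seed=1):
    """Return (sample cov(x, e), OLS slope of y on the mismeasured x)."""
    rng = random.Random(seed)
    x, y, e = [], [], []
    for _ in range(n):
        x_star = rng.gauss(0.0, 1.0)     # true regressor, unobserved
        u = rng.gauss(0.0, sigma_u)      # measurement error
        v = rng.gauss(0.0, 1.0)          # error in the true model
        x.append(x_star + u)             # observed proxy x = x* + u
        y.append(beta1 + beta2 * x_star + v)
        e.append(v - beta2 * u)          # error of the estimable model (10.3)
    xbar, ybar, ebar = sum(x) / n, sum(y) / n, sum(e) / n
    sxx = sum((a - xbar) ** 2 for a in x)
    cov_xe = sum((a - xbar) * (b - ebar) for a, b in zip(x, e)) / n
    slope = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / sxx
    return cov_xe, slope

if __name__ == "__main__":
    cov_xe, slope = errors_in_variables(50_000)
    print("cov(x, e):", round(cov_xe, 3), " OLS slope:", round(slope, 3))
```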

2. Simultaneous Equation Bias: Another situation in which an explanatory variable is correlated with the regression error term arises in simultaneous equations models

- Suppose we write:

(10.5)

There is a feedback relationship between P and Q

- Because of this feedback, which arises because price and quantity are jointly, or simultaneously, determined, we can show that cov(P, e) ≠ 0

- The resulting bias (and inconsistency) is called the simultaneous equations bias

3. Omitted Variable Bias: When an omitted variable is correlated with an included explanatory variable, then the regression error will be correlated with the explanatory variable, making it endogenous

- Consider a log-linear regression model explaining observed hourly wage:

- What else affects wages? What have we omitted?

We might expect cov(EDUC, e) ≠ 0

- If this is true, then we can expect that the least squares estimator of the returns to another year of education will be positively biased, E(b2) > β2, and inconsistent

- The bias will not disappear even in very large samples

Estimating our wage equation, we have:

- We estimate that an additional year of education increases wages by approximately 10.75%, holding everything else constant

- If ability has a positive effect on wages, then this estimate is overstated, as the contribution of ability is attributed to the education variable
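The direction of this bias can be illustrated with simulated data. The sketch below does not use the data behind the estimates above: the data-generating process, the coefficient values and the name `wage_simulation` are all invented for illustration. Ability raises both schooling and wages, and the regression omits it, so the slope on education should converge to 0.10 + 0.20 × cov(EDUC, ability)/var(EDUC) = 0.10 + 0.20 × 2/8 = 0.15 rather than the true 0.10.

```python
import random

def wage_simulation(n, beta2=0.10, beta_ability=0.20, seed=2):
    """OLS slope of ln(wage) on education when ability is omitted (simulated data)."""
    rng = random.Random(seed)
    educ, lnwage = [], []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)                      # unobserved
        ed = 12.0 + 2.0 * ability + rng.gauss(0.0, 2.0)    # ability raises schooling
        educ.append(ed)
        lnwage.append(0.5 + beta2 * ed + beta_ability * ability
                      + rng.gauss(0.0, 0.3))
    ebar, wbar = sum(educ) / n, sum(lnwage) / n
    sxy = sum((a - ebar) * (b - wbar) for a, b in zip(educ, lnwage))
    sxx = sum((a - ebar) ** 2 for a in educ)
    return sxy / sxx

if __name__ == "__main__":
    print("true return: 0.10, estimated:", round(wage_simulation(50_000), 4))
```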

When all the usual assumptions of the linear model hold, the method of moments leads to the least squares estimator

- If x is random and correlated with the error term, the method of moments leads to an alternative, called instrumental variables estimation, or two-stage least squares estimation, that will work in large samples

The kth moment of a random variable Y is the expected value of the random variable raised to the kth power:

- The kth population moment in Eq. 10.7 can be estimated consistently using the sample (of size N) analog:

The method of moments estimation procedure equates m population moments to m sample moments to estimate m unknown parameters

- Example:

Estimators Based on the Method of Moments

The first two population and sample moments of Y are:

Solve for the unknown mean and variance parameters:

And
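As a small worked sketch of this example, the function below equates the first two sample moments to μ and E(Y²) and solves for the mean and variance. The function name and the data values are made up for illustration. Note that the resulting variance estimator divides by N, not N − 1.

```python
def method_of_moments_mean_var(y):
    """Method of moments estimates of the mean and variance of Y."""
    n = len(y)
    m1 = sum(y) / n                   # first sample moment
    m2 = sum(v * v for v in y) / n    # second sample moment
    mu_hat = m1
    sigma2_tilde = m2 - m1 ** 2       # equals sum((y - ybar)^2) / N
    return mu_hat, sigma2_tilde

y = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # illustrative data
mu_hat, sigma2_tilde = method_of_moments_mean_var(y)
print(mu_hat, sigma2_tilde)   # prints 5.0 4.0
```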

In the linear regression model y = β1 + β2x + e, we usually assume:

(10.13)

- If x is fixed, or random but not correlated with e, then:

We have two equations in two unknowns:

These are equivalent to the least squares normal equations and their solution is:

- Under "nice" assumptions, the method of moments principle of estimation leads us to the same estimators for the simple linear regression model as the least squares principle

Suppose that there is another variable, z, such that:

- z does not have a direct effect on y, and thus it does not belong on the right-hand side of the model as an explanatory variable

- z is not correlated with the regression error term e

- Variables with this property are said to be exogenous

- z is strongly [or at least not weakly] correlated with x, the endogenous explanatory variable

- A variable z with these properties is called an instrumental variable

If such a variable z exists, then it can be used to form the moment condition:

(10.16)

Using Eqs. 10.13 and 10.16, the sample moment conditions are:

Solving these equations leads us to method of moments estimators, which are usually called the instrumental variable (IV) estimators:

These new estimators have the following properties:

- They are consistent, if z is exogenous, with E(ze) = 0

- In large samples the instrumental variable estimators have approximate normal distributions

- In the simple regression model:

These new estimators have the following properties (Continued):

- The error variance is estimated using the estimator:

Note that we can write the variance of the instrumental variables estimator of β2 as:

- Because $r_{zx}^2 < 1$, the variance of the instrumental variables estimator will always be larger than the variance of the least squares estimator, and thus it is said to be less efficient
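The simple-regression IV estimator and the quantities in its variance can be sketched as follows. The simulated design is an assumption for illustration (an instrument z independent of e, a shock w shared by x and e, true β1 = β2 = 1, true error variance 0.74), as are all function and key names. The code computes the slope as Σ(z − z̄)(y − ȳ)/Σ(z − z̄)(x − x̄), the error variance from the IV residuals with divisor N − 2, and the squared sample correlation between z and x.

```python
import random

def iv_vs_ols(n, beta1=1.0, beta2=1.0, seed=3):
    """Compare OLS and IV in a simple regression with an endogenous x (simulated)."""
    rng = random.Random(seed)
    x, y, z = [], [], []
    for _ in range(n):
        zi = rng.gauss(0.0, 1.0)              # instrument, independent of e
        w = rng.gauss(0.0, 1.0)               # shock shared by x and e
        xi = 0.8 * zi + w
        ei = 0.7 * w + rng.gauss(0.0, 0.5)    # var(e) = 0.49 + 0.25 = 0.74
        x.append(xi)
        z.append(zi)
        y.append(beta1 + beta2 * xi + ei)
    xbar, ybar, zbar = sum(x) / n, sum(y) / n, sum(z) / n
    sxx = sum((a - xbar) ** 2 for a in x)
    szz = sum((a - zbar) ** 2 for a in z)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    szy = sum((a - zbar) * (b - ybar) for a, b in zip(z, y))
    szx = sum((a - zbar) * (b - xbar) for a, b in zip(z, x))
    b2_iv = szy / szx
    b1_iv = ybar - b2_iv * xbar
    sigma2_iv = sum((yi - b1_iv - b2_iv * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    r2_zx = szx ** 2 / (szz * sxx)            # squared sample correlation of z and x
    return {
        "b2_ols": sxy / sxx,
        "b2_iv": b2_iv,
        "sigma2_iv": sigma2_iv,
        "r2_zx": r2_zx,
        "var_b2_iv": sigma2_iv / (r2_zx * sxx),
    }

if __name__ == "__main__":
    for name, value in iv_vs_ols(20_000).items():
        print(name, round(value, 5))
```

In this design the OLS slope should sit well above 1 while the IV slope should be close to 1, at the cost of a variance inflated by the factor 1/r²zx.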

To extend our analysis to a more general setting, consider the multiple regression model:

- Let xK be an endogenous variable correlated with the error term

- The first K − 1 variables are exogenous variables that are uncorrelated with the error term e; they are ‘‘included’’ instruments

We can estimate this equation in two steps with a least squares estimation in each step

The first stage regression has the endogenous variable xK on the left-hand side, and all exogenous and instrumental variables on the right-hand side

- The first stage regression is:

- The least squares fitted value is:

The second stage regression is based on the original specification:

- The least squares estimators from this equation are the instrumental variables (IV) estimators

- Because they can be obtained by two least squares regressions, they are also popularly known as the two-stage least squares (2SLS) estimators

- We will refer to them as IV or 2SLS or IV/2SLS estimators

The IV/2SLS estimator of the error variance is based on the residuals from the original model:
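The two steps, and the point about which residuals enter the error variance, can be sketched for the special case K = 2 with two instruments. This is a hypothetical illustration on simulated data, not the author's implementation: the helper `ols` solves the normal equations by Gauss-Jordan elimination so that the block needs only the standard library, and the design (true slope 1, true error variance 0.74) is invented.

```python
import random

def ols(cols, y):
    """Least squares coefficients of y on cols (pass a column of ones for the intercept)."""
    k = len(cols)
    # Augmented normal equations [X'X | X'y]
    a = [[sum(s * t for s, t in zip(cols[r], cols[c])) for c in range(k)]
         + [sum(s * t for s, t in zip(cols[r], y))] for r in range(k)]
    for p in range(k):  # Gauss-Jordan elimination with partial pivoting
        best = max(range(p, k), key=lambda r: abs(a[r][p]))
        a[p], a[best] = a[best], a[p]
        for r in range(k):
            if r != p:
                f = a[r][p] / a[p][p]
                a[r] = [u - f * w for u, w in zip(a[r], a[p])]
    return [a[r][k] / a[r][r] for r in range(k)]

def two_stage_least_squares(y, x, instruments):
    """2SLS for y = b1 + b2*x + e with x endogenous; returns b1, b2 and two variances."""
    n = len(y)
    ones = [1.0] * n
    zcols = [ones] + instruments
    # Stage 1: regress x on a constant and all instruments, keep the fitted values
    g = ols(zcols, x)
    x_hat = [sum(gj * col[i] for gj, col in zip(g, zcols)) for i in range(n)]
    # Stage 2: regress y on a constant and x_hat
    b1, b2 = ols([ones, x_hat], y)
    # Error variance from residuals of the ORIGINAL model (x, not x_hat)
    sigma2_iv = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    # For comparison only: second-stage residuals give a different number
    sigma2_stage2 = sum((yi - b1 - b2 * xh) ** 2 for xh, yi in zip(x_hat, y)) / (n - 2)
    return b1, b2, sigma2_iv, sigma2_stage2

def simulate(n, seed=4):
    """Simulated data: two valid instruments, x endogenous through the shock w."""
    rng = random.Random(seed)
    y, x, z1, z2 = [], [], [], []
    for _ in range(n):
        a, b, w = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xi = 0.6 * a + 0.6 * b + w
        ei = 0.7 * w + rng.gauss(0.0, 0.5)     # var(e) = 0.74
        z1.append(a)
        z2.append(b)
        x.append(xi)
        y.append(1.0 + 1.0 * xi + ei)
    return y, x, [z1, z2]

if __name__ == "__main__":
    print(two_stage_least_squares(*simulate(5_000)))
```

The variance computed from the second-stage residuals is the one a naive "run two regressions by hand" approach would report; in this design it is much larger than the correct estimate based on the original model.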

In the simple regression, if x is endogenous and we have L instruments:

The two sample moment conditions are:

Solving using the fact that $\bar{\hat{x}} = \bar{x}$, we get:

Sometimes we have more instrumental variables at our disposal than are necessary

- Suppose we have L = 2 instruments, z1 and z2

- Then we have:

We have three sample moment conditions:

The first stage regression is a key tool in assessing whether an instrument is ‘‘strong’’ or ‘‘weak’’ in the multiple regression setting

Suppose the first stage regression equation is:

The key to assessing the strength of the instrumental variable z1 is the strength of its relationship to xK after controlling for the effects of all the other exogenous variables

Suppose the first stage regression equation is:

- We require that at least one of the instruments be strong
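To illustrate "after controlling for the effects of all the other exogenous variables", the sketch below computes the first-stage coefficient and t-statistic on a single instrument z with one included exogenous variable x2. It partials x2 out of both x and z first, which gives the same coefficient and residuals as the full first-stage regression (the Frisch-Waugh-Lovell result, a technique swapped in here to keep the code short). The design and all names are assumptions: z is correlated with x2, and `theta1` is the true first-stage coefficient on z.

```python
import random

def residualize(v, c):
    """Residuals from least squares of v on a constant and the control c."""
    n = len(v)
    vbar, cbar = sum(v) / n, sum(c) / n
    slope = (sum((ci - cbar) * (vi - vbar) for ci, vi in zip(c, v))
             / sum((ci - cbar) ** 2 for ci in c))
    return [(vi - vbar) - slope * (ci - cbar) for ci, vi in zip(c, v)]

def first_stage_t(x, x2, z):
    """Coefficient and t-statistic on z in the regression of x on (1, x2, z)."""
    n = len(x)
    x_t, z_t = residualize(x, x2), residualize(z, x2)   # partial out constant and x2
    szz = sum(b * b for b in z_t)
    theta = sum(a * b for a, b in zip(x_t, z_t)) / szz
    ssr = sum((a - theta * b) ** 2 for a, b in zip(x_t, z_t))
    se = (ssr / (n - 3) / szz) ** 0.5                   # 3 first-stage coefficients
    return theta, theta / se

def correlation(a, b):
    """Raw sample correlation, ignoring the other exogenous variables."""
    n = len(a)
    abar, bbar = sum(a) / n, sum(b) / n
    sab = sum((p - abar) * (q - bbar) for p, q in zip(a, b))
    saa = sum((p - abar) ** 2 for p in a)
    sbb = sum((q - bbar) ** 2 for q in b)
    return sab / (saa * sbb) ** 0.5

def simulate(n, theta1, seed=5):
    """x depends on x2 and, through theta1, on z; z is itself correlated with x2."""
    rng = random.Random(seed)
    x, x2, z = [], [], []
    for _ in range(n):
        c = rng.gauss(0.0, 1.0)
        zi = c + rng.gauss(0.0, 1.0)
        x2.append(c)
        z.append(zi)
        x.append(c + theta1 * zi + rng.gauss(0.0, 1.0))
    return x, x2, z

if __name__ == "__main__":
    for theta1 in (0.0, 0.5):
        x, x2, z = simulate(2_000, theta1)
        print(theta1, round(correlation(x, z), 3), first_stage_t(x, x2, z))
```

With theta1 = 0 the raw correlation between x and z is still sizeable, because both depend on x2, yet the first-stage t-statistic is small: the instrument carries no information about x beyond what x2 already provides.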

Consider the model with an instrumental variable MOTHEREDUC:

To implement instrumental variables estimation using the two-stage least squares approach, we obtain the predicted values of education from the first stage equation and insert them into the log-linear wage equation to replace EDUC

- Then estimate the resulting equation by least squares

- The instrumental variables estimates of the log-linear wage equation are:

Using FATHEREDUC, the first stage equation is:

Table 10.1 First-Stage Equation

The IV/2SLS estimates are:

In a multiple regression model, the coefficients are the effect of a unit change in an explanatory, independent, variable on the expected outcome, holding all other things constant

- In calculus terminology, the coefficients are partial derivatives

The multiple regression model, including all K variables, is:

Think of G = Good explanatory variables, B = Bad explanatory variables and L = Lucky instrumental variables

- It is a necessary condition for IV estimation that L ≥ B

- If L = B then there are just enough instrumental variables to carry out IV estimation

- The model parameters are said to be just identified or exactly identified in this case

- The term identified is used to indicate that the model parameters can be consistently estimated

- If L > B then we have more instruments than are necessary for IV estimation, and the model is said to be overidentified

Consider the B first-stage equations:

The predicted values are:

In the second stage of estimation we apply least squares to:

Consider the model with B = 2:

The first-stage equations are:

When testing the null hypothesis H0: βk = c, use of the test statistic $t = (\hat{\beta}_k - c)/\mathrm{se}(\hat{\beta}_k)$ is valid in large samples

- It is common, but not universal, practice to use critical values, and p-values, based on the $t_{(N-K)}$ distribution rather than the more strictly appropriate N(0,1) distribution

- The reason is that tests based on the t-distribution tend to work better in samples of data that are not large

When testing a joint hypothesis, such as H0: β2 = c2, β3 = c3, the test may be based on the chi-square distribution with the number of degrees of freedom equal to the number of hypotheses (J) being tested

- The test itself may be called a “Wald” test, or a likelihood ratio (LR) test, or a Lagrange multiplier (LM) test

- These testing procedures are all asymptotically equivalent

Unfortunately R2 can be negative when based on IV estimates

- Therefore the use of measures like R2 outside the context of the least squares estimation should be avoided

Can we test for whether x is correlated with the error term?

- This might give us a guide of when to use least squares and when to use IV estimators

Can we test if our instrument is valid, and uncorrelated with the regression error, as required?

The null hypothesis is H0: cov(x, e) = 0 against the alternative H1: cov(x, e) ≠ 0

Specification Tests

If the null hypothesis is true, both the least squares estimator and the instrumental variables estimator are consistent

- Naturally if the null hypothesis is true, use the more efficient estimator, which is the least squares estimator

If the null hypothesis is false, the least squares estimator is not consistent, and the instrumental variables estimator is consistent

- If the null hypothesis is not true, use the instrumental variables estimator, which is consistent

There are several forms of the test, usually called the Hausman test

Consider the model:

- Let z1 and z2 be instrumental variables for x.

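1. Estimate the model $x = \gamma_1 + \theta_1 z_1 + \theta_2 z_2 + v$ by least squares, and obtain the residuals $\hat{v} = x - \hat{\gamma}_1 - \hat{\theta}_1 z_1 - \hat{\theta}_2 z_2$.

- If more than one explanatory variable is being tested for endogeneity, repeat this estimation for each one, using all available instrumental variables in each regression

Consider the model $y = \beta_1 + \beta_2 x + e$:

2. Include the residuals computed in step 1 as an explanatory variable in the original regression, $y = \beta_1 + \beta_2 x + \delta \hat{v} + e$

- Estimate this "artificial regression" by least squares, and employ the usual t-test for the hypothesis of significance: H0: δ = 0 (no correlation between x and e) against H1: δ ≠ 0 (correlation between x and e)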

Consider the model:

If more than one variable is being tested for endogeneity, the test will be an F-test of joint significance of the coefficients on the included residuals
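A sketch of the regression-based test for one endogenous candidate x and two instruments, on simulated data. The helper `ols_fit`, the function names and the design are assumptions made for illustration; the matrix inverse is computed by Gauss-Jordan elimination so that no third-party library is needed. Step 1 saves the first-stage residuals, and step 2 returns the t-statistic on those residuals in the augmented regression.

```python
import random

def ols_fit(cols, y):
    """Return (coefficients, standard errors, residuals) for y on the given columns."""
    n, k = len(y), len(cols)
    xtx = [[sum(s * t for s, t in zip(cols[r], cols[c])) for c in range(k)]
           for r in range(k)]
    xty = [sum(s * t for s, t in zip(cols[r], y)) for r in range(k)]
    # Invert X'X by Gauss-Jordan elimination on [X'X | I]
    aug = [xtx[r] + [1.0 if r == c else 0.0 for c in range(k)] for r in range(k)]
    for p in range(k):
        best = max(range(p, k), key=lambda r: abs(aug[r][p]))
        aug[p], aug[best] = aug[best], aug[p]
        piv = aug[p][p]
        aug[p] = [v / piv for v in aug[p]]
        for r in range(k):
            if r != p:
                f = aug[r][p]
                aug[r] = [u - f * w for u, w in zip(aug[r], aug[p])]
    inv = [row[k:] for row in aug]
    b = [sum(inv[r][c] * xty[c] for c in range(k)) for r in range(k)]
    resid = [yi - sum(b[j] * cols[j][i] for j in range(k)) for i, yi in enumerate(y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    se = [(sigma2 * inv[j][j]) ** 0.5 for j in range(k)]
    return b, se, resid

def hausman_t(y, x, instruments):
    """t-statistic on the first-stage residuals in the artificial regression."""
    ones = [1.0] * len(y)
    # Step 1: regress x on a constant and the instruments, keep residuals v_hat
    _, _, v_hat = ols_fit([ones] + instruments, x)
    # Step 2: add v_hat to the original regression and test delta = 0
    b, se, _ = ols_fit([ones, x, v_hat], y)
    return b[2] / se[2]

def simulate(n, endogenous, seed=6):
    """If endogenous is True, the shock w enters both x and e."""
    rng = random.Random(seed)
    share = 0.7 if endogenous else 0.0
    y, x, z1, z2 = [], [], [], []
    for _ in range(n):
        a, b, w = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xi = 0.6 * a + 0.6 * b + w
        z1.append(a)
        z2.append(b)
        x.append(xi)
        y.append(1.0 + xi + share * w + rng.gauss(0.0, 0.5))
    return y, x, [z1, z2]

if __name__ == "__main__":
    print("endogenous x:", round(hausman_t(*simulate(2_000, True)), 2))
    print("exogenous x: ", round(hausman_t(*simulate(2_000, False)), 2))
```

A large t-statistic in absolute value is evidence against H0: cov(x, e) = 0, pointing toward the IV estimator; a small one gives no reason to abandon least squares.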

A test of the validity of the surplus moment conditions is:

1. Compute the IV estimates $\hat{\beta}_k$ using all available instruments, including the G variables x1=1, x2, …, xG that are presumed to be exogenous, and the L instruments

2. Obtain the residuals $\hat{e} = y - \hat{\beta}_1 - \hat{\beta}_2 x_2 - \cdots - \hat{\beta}_K x_K$

3. Regress $\hat{e}$ on all the available instruments described in step 1

A test of the validity of the surplus moment conditions is (Continued):

- Compute NR2 from this regression, where N is the sample size and R2 is the usual goodness-of-fit measure

- If all of the surplus moment conditions are valid, then $NR^2 \sim \chi^2_{(L-B)}$

- If the value of the test statistic exceeds the 100(1−α)-percentile from the $\chi^2_{(L-B)}$ distribution, then we conclude that at least one of the surplus moment conditions is not valid
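The steps of this test can be sketched for a simple regression with B = 1 endogenous variable and L = 2 instruments, so that there is L − B = 1 surplus moment condition. The data are simulated and every name and parameter value is an assumption for illustration. When `z2_valid` is False, z2 enters the error term directly and is therefore not a valid instrument. The constant 3.841 is the 95th percentile of the chi-square distribution with 1 degree of freedom.

```python
import random

CHI2_1_95 = 3.841  # 5% critical value, chi-square with L - B = 1 degree of freedom

def ols(cols, y):
    """Least squares coefficients of y on cols (pass a column of ones for the intercept)."""
    k = len(cols)
    a = [[sum(s * t for s, t in zip(cols[r], cols[c])) for c in range(k)]
         + [sum(s * t for s, t in zip(cols[r], y))] for r in range(k)]
    for p in range(k):  # Gauss-Jordan elimination with partial pivoting
        best = max(range(p, k), key=lambda r: abs(a[r][p]))
        a[p], a[best] = a[best], a[p]
        for r in range(k):
            if r != p:
                f = a[r][p] / a[p][p]
                a[r] = [u - f * w for u, w in zip(a[r], a[p])]
    return [a[r][k] / a[r][r] for r in range(k)]

def surplus_moment_nr2(y, x, instruments):
    """N * R^2 from regressing the IV residuals on all available instruments."""
    n = len(y)
    ones = [1.0] * n
    zcols = [ones] + instruments
    # Step 1: IV/2SLS estimates using all instruments
    g = ols(zcols, x)
    x_hat = [sum(gj * col[i] for gj, col in zip(g, zcols)) for i in range(n)]
    b1, b2 = ols([ones, x_hat], y)
    # Step 2: residuals from the original model (x, not x_hat)
    e_hat = [yi - b1 - b2 * xi for xi, yi in zip(x, y)]
    # Step 3: regress e_hat on all instruments and compute R^2
    c = ols(zcols, e_hat)
    fitted = [sum(cj * col[i] for cj, col in zip(c, zcols)) for i in range(n)]
    ebar = sum(e_hat) / n
    ssr = sum((e - f) ** 2 for e, f in zip(e_hat, fitted))
    sst = sum((e - ebar) ** 2 for e in e_hat)
    return n * (1.0 - ssr / sst)

def simulate(n, z2_valid, seed=7):
    rng = random.Random(seed)
    leak = 0.0 if z2_valid else 0.5   # direct effect of z2 on the error term
    y, x, z1, z2 = [], [], [], []
    for _ in range(n):
        a, b, w = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xi = 0.6 * a + 0.6 * b + w
        ei = 0.7 * w + leak * b + rng.gauss(0.0, 0.5)
        z1.append(a)
        z2.append(b)
        x.append(xi)
        y.append(1.0 + xi + ei)
    return y, x, [z1, z2]

if __name__ == "__main__":
    for valid in (True, False):
        stat = surplus_moment_nr2(*simulate(2_000, valid))
        print("z2 valid:", valid, " NR^2:", round(stat, 2), " reject:", stat > CHI2_1_95)
```

The test can only be run when L > B; with exactly as many instruments as endogenous variables the IV residuals are orthogonal to every instrument by construction and the statistic is identically zero.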

Table 10.2 Hausman Test Auxiliary Regression

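Equations for the slides above, in order of appearance:

Savings model with permanent income $x_i^*$: $y_i = \beta_1 + \beta_2 x_i^* + v_i$

Current income as a proxy: $x_i = x_i^* + u_i$

Substituting (10.3): $y_i = \beta_1 + \beta_2 x_i^* + v_i = \beta_1 + \beta_2 (x_i - u_i) + v_i = \beta_1 + \beta_2 x_i + (v_i - \beta_2 u_i) = \beta_1 + \beta_2 x_i + e_i$

Covariance between x and e (10.4): $\mathrm{cov}(x_i, e_i) = E(x_i e_i) = E[(x_i^* + u_i)(v_i - \beta_2 u_i)] = E(-\beta_2 u_i^2) = -\beta_2 \sigma_u^2 \neq 0$

Least squares variance: $\mathrm{var}(b_2) = \sigma^2 / \sum (x_i - \bar{x})^2$

Simultaneous equations example (10.5): $Q = \beta_1 + \beta_2 P + e$

Log-linear wage equation: $\ln(\mathit{WAGE}) = \beta_1 + \beta_2 \mathit{EDUC} + \beta_3 \mathit{EXPER} + \beta_4 \mathit{EXPER}^2 + e$

Least squares estimates: $\widehat{\ln(\mathit{WAGE})} = -0.5220 + 0.1075 \times \mathit{EDUC} + 0.0416 \times \mathit{EXPER} - 0.0008 \times \mathit{EXPER}^2$, with (se) = (0.1986), (0.0141), (0.0132), (0.0004)

kth population moment (10.7): $E(Y^k) = \mu_k$ = kth moment of Y

kth sample moment: $\widehat{E(Y^k)} = \hat{\mu}_k$ = kth sample moment of Y $= \sum y_i^k / N$

Variance of Y: $\mathrm{var}(Y) = \sigma^2 = E(Y - \mu)^2 = E(Y^2) - \mu^2$

Population moments and sample moments: $E(Y) = \mu_1 = \mu$ and $\hat{\mu} = \sum y_i / N$; $E(Y^2) = \mu_2$ and $\hat{\mu}_2 = \sum y_i^2 / N$

Method of moments mean: $\hat{\mu} = \sum y_i / N = \bar{y}$

Method of moments variance: $\tilde{\sigma}^2 = \hat{\mu}_2 - \hat{\mu}^2 = \frac{\sum y_i^2}{N} - \bar{y}^2 = \frac{\sum y_i^2 - N \bar{y}^2}{N} = \frac{\sum (y_i - \bar{y})^2}{N}$

Regression moment condition (10.13): $E(e_i) = 0 \Rightarrow E(y_i - \beta_1 - \beta_2 x_i) = 0$

Moment condition when x is uncorrelated with e: $E(xe) = 0 \Rightarrow E[x(y - \beta_1 - \beta_2 x)] = 0$

Two equations in two unknowns: $\frac{1}{N} \sum (y_i - b_1 - b_2 x_i) = 0$ and $\frac{1}{N} \sum x_i (y_i - b_1 - b_2 x_i) = 0$

Least squares solution: $b_2 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$, $b_1 = \bar{y} - b_2 \bar{x}$

Instrument moment condition (10.16): $E(ze) = 0 \Rightarrow E[z(y - \beta_1 - \beta_2 x)] = 0$

Sample moment conditions for IV: $\frac{1}{N} \sum (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i) = 0$ and $\frac{1}{N} \sum z_i (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i) = 0$

IV estimators: $\hat{\beta}_2 = \frac{N \sum z_i y_i - \sum z_i \sum y_i}{N \sum z_i x_i - \sum z_i \sum x_i} = \frac{\sum (z_i - \bar{z})(y_i - \bar{y})}{\sum (z_i - \bar{z})(x_i - \bar{x})}$, $\hat{\beta}_1 = \bar{y} - \hat{\beta}_2 \bar{x}$

Large-sample distribution in the simple regression model: $\hat{\beta}_2 \sim N\left(\beta_2, \frac{\sigma^2}{r_{zx}^2 \sum (x_i - \bar{x})^2}\right)$

Error variance estimator: $\hat{\sigma}_{IV}^2 = \frac{\sum (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i)^2}{N - 2}$

Variance of the IV estimator of β2: $\mathrm{var}(\hat{\beta}_2) = \frac{\sigma^2}{r_{zx}^2 \sum (x_i - \bar{x})^2} = \frac{\mathrm{var}(b_2)}{r_{zx}^2}$, where $r_{zx}^2 < 1$

Multiple regression model: $y = \beta_1 + \beta_2 x_2 + \cdots + \beta_K x_K + e$

First stage regression: $x_K = \gamma_1 + \gamma_2 x_2 + \cdots + \gamma_{K-1} x_{K-1} + \theta_1 z_1 + \cdots + \theta_L z_L + v_K$

Least squares fitted value: $\hat{x}_K = \hat{\gamma}_1 + \hat{\gamma}_2 x_2 + \cdots + \hat{\gamma}_{K-1} x_{K-1} + \hat{\theta}_1 z_1 + \cdots + \hat{\theta}_L z_L$

Second stage regression: $y = \beta_1 + \beta_2 x_2 + \cdots + \beta_K \hat{x}_K + e^*$

IV/2SLS error variance estimator: $\hat{\sigma}_{IV}^2 = \frac{\sum (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_{i2} - \cdots - \hat{\beta}_K x_{iK})^2}{N - K}$

Simple regression with L instruments, fitted value: $\hat{x} = \hat{\gamma}_1 + \hat{\theta}_1 z_1 + \cdots + \hat{\theta}_L z_L$

The two sample moment conditions: $\frac{1}{N} \sum (y_i - \hat{\beta}_1 - \hat{\beta}_2 \hat{x}_i) = 0$ and $\frac{1}{N} \sum \hat{x}_i (y_i - \hat{\beta}_1 - \hat{\beta}_2 \hat{x}_i) = 0$

Solution, using $\bar{\hat{x}} = \bar{x}$: $\hat{\beta}_2 = \frac{\sum (\hat{x}_i - \bar{\hat{x}})(y_i - \bar{y})}{\sum (\hat{x}_i - \bar{\hat{x}})(x_i - \bar{x})} = \frac{\sum (\hat{x}_i - \bar{x})(y_i - \bar{y})}{\sum (\hat{x}_i - \bar{x})(x_i - \bar{x})}$, $\hat{\beta}_1 = \bar{y} - \hat{\beta}_2 \bar{x}$

Moment condition for the second instrument: $E(z_2 e) = E[z_2 (y - \beta_1 - \beta_2 x)] = 0$

Three sample moment conditions: $\frac{1}{N} \sum (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i) = \hat{m}_1 = 0$, $\frac{1}{N} \sum z_{i1} (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i) = \hat{m}_2 = 0$, $\frac{1}{N} \sum z_{i2} (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i) = \hat{m}_3 = 0$

First stage regression with one instrument: $x_K = \gamma_1 + \gamma_2 x_2 + \cdots + \gamma_{K-1} x_{K-1} + \theta_1 z_1 + v_K$

First stage estimates with MOTHEREDUC: $\widehat{\mathit{EDUC}} = 9.7751 + 0.0489\,\mathit{EXPER} - 0.0013\,\mathit{EXPER}^2 + 0.2677\,\mathit{MOTHEREDUC}$, with (se) = (0.4249), (0.0417), (0.0012), (0.0311)

IV estimates of the log-linear wage equation: $\widehat{\ln(\mathit{WAGE})} = 0.1982 + 0.0493\,\mathit{EDUC} + 0.0449\,\mathit{EXPER} - 0.0009\,\mathit{EXPER}^2$, with (se) = (0.4729), (0.0374), (0.0136), (0.0004)

First stage equation adding FATHEREDUC: $\mathit{EDUC} = \gamma_1 + \gamma_2 \mathit{EXPER} + \gamma_3 \mathit{EXPER}^2 + \theta_1 \mathit{MOTHEREDUC} + \theta_2 \mathit{FATHEREDUC} + v$

IV/2SLS estimates: $\widehat{\ln(\mathit{WAGE})} = 0.0481 + 0.0614\,\mathit{EDUC} + 0.0442\,\mathit{EXPER} - 0.0009\,\mathit{EXPER}^2$, with (se) = (0.4003), (0.0314), (0.0134), (0.0004)

Multiple regression model with G exogenous and B endogenous variables: $y = \underbrace{\beta_1 + \beta_2 x_2 + \cdots + \beta_G x_G}_{G \text{ exogenous variables}} + \underbrace{\beta_{G+1} x_{G+1} + \cdots + \beta_K x_K}_{B \text{ endogenous variables}} + e$

The B first-stage equations: $x_{G+j} = \gamma_{1j} + \gamma_{2j} x_2 + \cdots + \gamma_{Gj} x_G + \theta_{1j} z_1 + \cdots + \theta_{Lj} z_L + v_j$, $j = 1, \ldots, B$

Predicted values: $\hat{x}_{G+j} = \hat{\gamma}_{1j} + \hat{\gamma}_{2j} x_2 + \cdots + \hat{\gamma}_{Gj} x_G + \hat{\theta}_{1j} z_1 + \cdots + \hat{\theta}_{Lj} z_L$, $j = 1, \ldots, B$

Second stage: $y = \beta_1 + \beta_2 x_2 + \cdots + \beta_G x_G + \beta_{G+1} \hat{x}_{G+1} + \cdots + \beta_K \hat{x}_K + e^*$

Model with B = 2: $y = \beta_1 + \beta_2 x_2 + \cdots + \beta_G x_G + \beta_{G+1} x_{G+1} + \beta_{G+2} x_{G+2} + e$

First-stage equations for B = 2: $x_{G+1} = \gamma_{11} + \gamma_{21} x_2 + \cdots + \gamma_{G1} x_G + \theta_{11} z_1 + \theta_{21} z_2 + v_1$ and $x_{G+2} = \gamma_{12} + \gamma_{22} x_2 + \cdots + \gamma_{G2} x_G + \theta_{12} z_1 + \theta_{22} z_2 + v_2$

Test statistic for H0: βk = c: $t = \frac{\hat{\beta}_k - c}{\mathrm{se}(\hat{\beta}_k)}$

Hausman test, first-stage model: $x = \gamma_1 + \theta_1 z_1 + \theta_2 z_2 + v$

Hausman test, first-stage residuals: $\hat{v} = x - \hat{\gamma}_1 - \hat{\theta}_1 z_1 - \hat{\theta}_2 z_2$

Hausman test, original model: $y = \beta_1 + \beta_2 x + e$

Hausman test, artificial regression: $y = \beta_1 + \beta_2 x + \delta \hat{v} + e$

Hausman test hypotheses: $H_0: \delta = 0$ (no correlation between x and e), $H_1: \delta \neq 0$ (correlation between x and e)

Surplus moment conditions test, IV estimates: $\hat{\beta}_k$

Surplus moment conditions test, residuals: $\hat{e} = y - \hat{\beta}_1 - \hat{\beta}_2 x_2 - \cdots - \hat{\beta}_K x_K$

Surplus moment conditions test, statistic: $NR^2 \sim \chi^2_{(L-B)}$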