Urgent Econometrics test
Heteroskedasticity
Consider our basic linear function:
- To recognize that not all observations with the same x will have the same y, and in line with our general specification of the regression model, we let ei be the difference between the ith observation yi and the mean for all observations with the same xi.
The probability of getting large positive or negative values for e is higher for large values of x than it is for low values
- A random variable, in this case e, has a higher probability of taking on large values if its variance is high.
- We can capture this effect by having var(e) depend directly on x.
When the variances for all observations are not the same, we have heteroskedasticity
- The random variable y and the random error e are heteroskedastic
- Conversely, if all observations come from probability density functions with the same variance, homoskedasticity exists, and y and e are homoskedastic
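To make the idea of an error variance that depends on x concrete, the sketch below simulates errors whose standard deviation is proportional to x and compares the sample variance of the errors for small and large x. The data, the proportionality constant 0.5, and the split point are all made up for illustration; they are not taken from the food expenditure data.

```python
import random

random.seed(1)
N = 1000
x = [random.uniform(1.0, 20.0) for _ in range(N)]
# Assumed variance function: sd(e_i) = 0.5 * x_i, so var(e_i) = 0.25 * x_i**2
e = [random.gauss(0.0, 0.5 * xi) for xi in x]

def sample_variance(values):
    """Unbiased sample variance."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

# Split the errors at the midpoint of the x range
low = [ei for xi, ei in zip(x, e) if xi < 10.5]
high = [ei for xi, ei in zip(x, e) if xi >= 10.5]
var_low, var_high = sample_variance(low), sample_variance(high)
print(var_low, var_high)
```

With homoskedastic errors the two sample variances would be similar; here the errors attached to large x are visibly more dispersed.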
FIGURE 8.1 Heteroskedastic errors
If there is heteroskedasticity, one of the least squares assumptions is violated:
- Hence the variance is of the form:
where h(xi) is a function of xi that increases as xi increases
Example from food data:
- We can rewrite this as:
FIGURE 8.2 Least squares estimated food expenditure function and observed data points
Heteroskedasticity is often encountered when using cross-sectional data
- The term cross-sectional data refers to having data on a number of economic units such as firms or households, at a given point in time
- Cross-sectional data invariably involve observations on economic units of varying sizes
This means that for the linear regression model, as the size of the economic unit becomes larger, there is more uncertainty associated with the outcomes y
- This greater uncertainty is modeled by specifying an error variance that is larger, the larger the size of the economic unit
Heteroskedasticity is not a property that is necessarily restricted to cross-sectional data
- With time-series data, where we have data over time on one economic unit, such as a firm, a household, or even a whole economy, it is possible that the error variance will change
Two main implications of heteroskedasticity:
- The least squares estimator is still a linear and unbiased estimator, but it is no longer best
- There is another estimator with a smaller variance
- The standard errors usually computed for the least squares estimator are incorrect
- Confidence intervals and hypothesis tests that use these standard errors may be misleading
What happens to the standard errors?
Consider the model:
- The variance of the least squares estimator for β2 is:
Now let the variances differ:
Consider the model:
- The variance of the least squares estimator for β2 is:
(8.8)
If we proceed to use the least squares estimator and its usual standard errors when $\mathrm{var}(e_i) = \sigma_i^2$, we will be using an estimate of Eq. 8.6 to compute the standard error of b2 when we should be using an estimate of Eq. 8.8
- The least squares estimator is no longer best in the sense of being the minimum variance linear unbiased estimator
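The gap between the two variance formulas can be checked numerically. The sketch below evaluates the variance of b2 under heteroskedasticity, sum of (x_i - x̄)² σ_i² divided by the squared sum of (x_i - x̄)², and compares it with the homoskedastic formula σ² / sum of (x_i - x̄)² evaluated at the average error variance, which is roughly the quantity the usual formula ends up estimating. The x values and the variance function σ_i² = 0.5 x_i² are hypothetical.

```python
# Hypothetical design: x values and error variances are made up for illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
sigma2_i = [0.5 * xi ** 2 for xi in x]      # assumed: var(e_i) grows with x_i

n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)    # sum of squared deviations of x

# Correct variance of b2 when the error variances differ (Eq. 8.8)
var_b2_hetero = sum((xi - x_bar) ** 2 * s2 for xi, s2 in zip(x, sigma2_i)) / sxx ** 2

# Homoskedastic formula (Eq. 8.6) evaluated at the average error variance
sigma2_avg = sum(sigma2_i) / n
var_b2_usual = sigma2_avg / sxx

print(var_b2_hetero, var_b2_usual)
```

In this made-up design the usual formula understates the true variance, because the observations far from x̄ on the high side are also the ones with the largest error variances.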
There are two methods we can use to detect heteroskedasticity
- An informal way using residual charts
- A formal way using statistical tests
If the errors are homoskedastic, there should be no patterns of any sort in the residuals
- If the errors are heteroskedastic, they may tend to exhibit greater variation in some systematic way
- This method of investigating heteroskedasticity can be followed for any simple regression
- In a regression with more than one explanatory variable, we can plot the least squares residuals against each explanatory variable, or against the fitted values $\hat{y}_i$, to see if they vary in a systematic way
Detecting Heteroskedasticity
FIGURE 8.3 Least squares food expenditure residuals plotted against income
Consider the general multiple regression model:
(8.9)
A general form for the variance function related to Eq. 8.9 is:
(8.10)
Two possible functions for $h(\cdot)$ are:
(8.11)
and
(8.12)
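- In this latter case one must be careful to ensure $h(\cdot) > 0$
Note that when $\alpha_2 = \alpha_3 = \cdots = \alpha_S = 0$, then $h(\alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS}) = h(\alpha_1)$ is constant, which indicates an absence of heteroskedasticity.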
The null and alternative hypotheses are:
For the test statistic, use Eq. 8.10 and 8.12 to get:
Letting $v_i = e_i^2 - E(e_i^2)$, we get:
This is like the general regression model studied earlier:
- Substituting the squared least squares residuals $\hat{e}_i^2$ for $e_i^2$, we get:
(8.17)
Since the R2 from Eq. 8.17 measures the proportion of variation in $\hat{e}_i^2$ explained by the z’s, it is a natural candidate for a test statistic.
- It can be shown that when H0 is true, the sample size multiplied by R2 has a chi-square (χ2) distribution with S - 1 degrees of freedom:
It is a large sample test
You will often see the test referred to as a Lagrange multiplier test or a Breusch-Pagan test for heteroskedasticity
The value of the statistic computed from the linear function is valid for testing an alternative hypothesis of heteroskedasticity where the variance function can be of any form given by Eq. 8.10
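The N × R2 statistic can be sketched in a few lines for a simple regression. The code below is a minimal illustration on synthetic data, assuming a single variance-function variable z = x; the function name `ols_simple`, the data-generating process, and the seed are hypothetical. The upper-tail probability uses the fact that a chi-square variable with one degree of freedom is a squared standard normal.

```python
import math
import random

def ols_simple(x, y):
    """Least squares for y = a + b*x; returns (a, b, residuals, R^2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    sse = sum(r ** 2 for r in resid)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return a, b, resid, 1.0 - sse / sst

# Synthetic data (made up): error standard deviation proportional to x
random.seed(0)
N = 500
x = [random.uniform(1.0, 20.0) for _ in range(N)]
y = [10.0 + 2.0 * xi + random.gauss(0.0, 0.5 * xi) for xi in x]

# Step 1: estimate the mean equation and square the residuals
_, _, e_hat, _ = ols_simple(x, y)
e_hat_sq = [e ** 2 for e in e_hat]

# Step 2: auxiliary regression of squared residuals on z = x; keep its R^2
_, _, _, r2_aux = ols_simple(x, e_hat_sq)

# Step 3: the test statistic N * R^2, compared with chi-square(S - 1 = 1)
lm_stat = N * r2_aux
p_value = math.erfc(math.sqrt(lm_stat / 2.0))   # upper tail of chi-square(1)
reject = lm_stat > 3.84                         # 5% critical value, 1 df
print(lm_stat, p_value, reject)
```

With more than one z variable the auxiliary regression becomes a multiple regression and the degrees of freedom become S - 1, but the statistic is still N × R2.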
The previous test presupposes that we have knowledge of the variables appearing in the variance function if the alternative hypothesis of heteroskedasticity is true
- We may wish to test for heteroskedasticity without precise knowledge of the relevant variables
- Hal White suggested defining the z’s as equal to the x’s, the squares of the x’s, and possibly their cross-products
Let $E(y) = \beta_1 + \beta_2 x_2 + \beta_3 x_3$
- The White test without cross-product terms (interactions) specifies:
- Including interactions adds one further variable
- The White test is performed as an F-test or using:
We test H0: α2 = 0 against H1: α2 ≠ 0 in the variance function σi2 = h(α1 + α2xi)
- First estimate $\hat{e}_i^2 = \alpha_1 + \alpha_2 x_i + v_i$ by least squares
- Save the R2 which is:
Calculate:
Since there is only one parameter in the null hypothesis, the χ2-test has one degree of freedom.
- The 5% critical value is 3.84
- Because 7.38 is greater than 3.84, we reject H0 and conclude that the variance depends on income
For the White version, estimate:
- Test H0: α2 = α3 = 0 against H1: α2 ≠ 0 or α3 ≠ 0
Calculate:
- The 5% critical value is χ2(0.95, 2) = 5.99
- Again, we conclude that heteroskedasticity exists
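The arithmetic behind both tests can be reproduced directly from the reported N and R2 values. The sketch below uses the closed-form upper-tail probabilities of the chi-square distribution, which exist for one degree of freedom (erfc of the square root of half the statistic) and two degrees of freedom (exp of minus half the statistic); the variable names are illustrative.

```python
import math

N = 40                      # sample size used in the text
r2_bp = 0.1846              # R^2 from regressing squared residuals on income
r2_white = 0.18888          # R^2 after adding income squared (White version)

chi2_bp = N * r2_bp
chi2_white = N * r2_white

# Closed-form upper-tail chi-square probabilities for 1 and 2 degrees of freedom
p_bp = math.erfc(math.sqrt(chi2_bp / 2.0))    # 1 df
p_white = math.exp(-chi2_white / 2.0)         # 2 df

print(round(chi2_bp, 2), chi2_bp > 3.84)          # statistic vs 5% critical value, 1 df
print(round(chi2_white, 3), chi2_white > 5.99)    # statistic vs 5% critical value, 2 df
print(round(p_white, 3))
```

Both statistics exceed their 5% critical values, and the White p-value agrees with the 0.023 reported for the food expenditure example.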
Estimate our wage equation as:
The Goldfeld-Quandt test is designed to test for this form of heteroskedasticity, where the sample can be partitioned into two groups and we suspect the variance could be different in the two groups
Write the equations for the two groups as:
- Test the null hypothesis:
The test statistic is:
- Suppose we want to test:
- When H0 is true, we have:
For a two-tail test at the 5% significance level, the critical values are FLc = F(0.025, 805, 189) = 0.81 and FUc = F(0.975, 805, 189) = 1.26
- We reject H0 if F < FLc or F > FUc
Using least squares to estimate (8.20a) and (8.20b) separately yields variance estimates:
- We next compute:
- Since 2.09 > FUc = 1.26, we reject H0 and conclude that the wage variances for the rural and metropolitan regions are not equal
With the observations ordered according to income xi and the sample split into two equal groups of 20 observations each, least squares estimation on each group yields the variance estimates:
Calculate:
Believing that the variances could increase, but not decrease with income, we use a one-tail test with 5% critical value F(0.95, 18, 18) = 2.22
Since 3.61 > 2.22, a null hypothesis of homoskedasticity is rejected in favor of the alternative that the variance increases with income
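Both Goldfeld-Quandt decisions reduce to a ratio of two variance estimates compared with critical values. The sketch below redoes the arithmetic with the variance estimates and critical values quoted in the text; the helper name `gq_statistic` is hypothetical.

```python
def gq_statistic(var_num, var_den):
    """Goldfeld-Quandt F statistic: ratio of two estimated error variances."""
    return var_num / var_den

# Wage example: metropolitan over rural, two-tail test
f_wage = gq_statistic(31.824, 15.243)
reject_wage = f_wage < 0.81 or f_wage > 1.26    # FLc and FUc from the text

# Food expenditure example: high-income half over low-income half, one-tail test
f_food = gq_statistic(12921.9, 3574.8)
reject_food = f_food > 2.22                     # F(0.95, 18, 18) from the text

print(round(f_wage, 2), reject_wage)
print(round(f_food, 2), reject_food)
```

For the one-tail version, the variance suspected of being larger goes in the numerator, so that only the upper critical value is needed.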
Recall that there are two problems with using the least squares estimator in the presence of heteroskedasticity:
- The least squares estimator, although still unbiased, is no longer best
- The usual least squares standard errors are incorrect, which invalidates interval estimates and hypothesis tests
There is a way of correcting the standard errors so that our interval estimates and hypothesis tests are valid
Under heteroskedasticity:
Heteroskedasticity-Consistent Standard Errors
A consistent estimator for this variance has been developed and is known as:
- White’s heteroskedasticity-consistent standard errors, or
- Heteroskedasticity robust standard errors, or
- Robust standard errors
- The term “robust” is used because they are valid in large samples for both heteroskedastic and homoskedastic errors
For K = 2, the White variance estimator is:
For the food expenditure example:
The two corresponding 95% confidence intervals for β2 are:
White’s estimator for the standard errors helps avoid computing incorrect interval estimates or incorrect values for test statistics in the presence of heteroskedasticity
- It does not address the other implication of heteroskedasticity: the least squares estimator is no longer best
- Failing to address this issue may not be that serious
- With a large sample size, the variance of the least squares estimator may still be sufficiently small to get precise estimates
- To find an alternative estimator with a lower variance it is necessary to specify a suitable variance function
- Using least squares with robust standard errors avoids the need to specify a suitable variance function
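For K = 2, the White estimator replaces each unknown σ_i² in the variance of b2 with the squared least squares residual and applies an N/(N - 2) degrees-of-freedom adjustment. The sketch below computes both the usual and the White standard error for a simple regression. The function name, the synthetic data, and the critical value 1.966 (an approximate 97.5th percentile of the t-distribution with 398 degrees of freedom) are assumptions for illustration, not values from the food expenditure example.

```python
import math
import random

def ols_with_white_se(x, y):
    """Simple regression y = b1 + b2*x; returns b1, b2, usual se(b2), White se(b2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dev = [xi - x_bar for xi in x]
    sxx = sum(d ** 2 for d in dev)
    b2 = sum(d * (yi - y_bar) for d, yi in zip(dev, y)) / sxx
    b1 = y_bar - b2 * x_bar
    resid = [yi - b1 - b2 * xi for xi, yi in zip(x, y)]

    # Usual formula: sigma^2-hat / sum (x_i - x_bar)^2
    sigma2_hat = sum(e ** 2 for e in resid) / (n - 2)
    se_usual = math.sqrt(sigma2_hat / sxx)

    # White formula: N/(N-2) * sum[(x_i - x_bar)^2 * e_i^2] / [sum (x_i - x_bar)^2]^2
    weighted = sum(d ** 2 * e ** 2 for d, e in zip(dev, resid))
    se_white = math.sqrt(n / (n - 2) * weighted / sxx ** 2)
    return b1, b2, se_usual, se_white

# Synthetic data (made up): error standard deviation proportional to x
random.seed(2)
x = [random.uniform(1.0, 20.0) for _ in range(400)]
y = [10.0 + 2.0 * xi + random.gauss(0.0, 0.5 * xi) for xi in x]

b1, b2, se_usual, se_white = ols_with_white_se(x, y)
t_c = 1.966   # approximate t critical value for 398 degrees of freedom
ci_white = (b2 - t_c * se_white, b2 + t_c * se_white)
print(b2, se_usual, se_white, ci_white)
```

The point estimates are the ordinary least squares estimates in both cases; only the standard error, and therefore the interval, changes.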
Recall the food expenditure example with heteroskedasticity:
(8.26)
- To develop an estimator that is better than the least squares estimator we need to make a further assumption about how the variances σ2i change with each observation
An estimator known as the generalized least squares estimator depends on the unknown σ2i
- To make the generalized least squares estimator operational, some structure is imposed on σ2i
- One possibility:
(8.27)
Generalized Least Squares: Known Form of Variance
We transform the model into one with homoskedastic errors:
Define the following transformed variables:
(8.29)
The above model then boils down to
(8.30)
The new transformed error term is homoskedastic:
(8.31)
- The transformed error term will retain the properties of zero mean and zero correlation between different observations
To obtain the best linear unbiased estimator for a model with heteroskedasticity of the type specified in Eq. 8.27:
- Calculate the transformed variables given in Eq. 8.29
- Use least squares to estimate the transformed model given in Eq. 8.30
The estimator obtained in this way is called a generalized least squares estimator
One way of viewing the generalized least squares estimator is as a weighted least squares estimator
- Minimizing the sum of squared transformed errors:
- The errors are weighted by $x_i^{-1/2}$, the reciprocal of $\sqrt{x_i}$
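The transformation can be sketched as follows for the variance function var(e_i) = σ² x_i: every variable, including the intercept column, is multiplied by x_i^(-1/2), and least squares is applied to the transformed variables without adding a new intercept. The function names and the small data set are hypothetical, and the code assumes all x_i are strictly positive.

```python
import math

def ols_two_regressors(x1, x2, y):
    """Least squares for y = b1*x1 + b2*x2 (no extra intercept), via the 2x2 normal equations."""
    s11 = sum(a * a for a in x1)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s22 = sum(b * b for b in x2)
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

def gls_variance_proportional_to_x(x, y):
    """GLS for y = b1 + b2*x + e when var(e_i) = sigma^2 * x_i (requires x_i > 0)."""
    w = [1.0 / math.sqrt(xi) for xi in x]          # weights x_i^(-1/2)
    y_star = [wi * yi for wi, yi in zip(w, y)]     # y_i / sqrt(x_i)
    x1_star = w                                    # transformed intercept: 1 / sqrt(x_i)
    x2_star = [wi * xi for wi, xi in zip(w, x)]    # x_i / sqrt(x_i) = sqrt(x_i)
    return ols_two_regressors(x1_star, x2_star, y_star)

# Made-up data for illustration
x = [1.0, 2.0, 4.0, 6.0, 9.0, 12.0, 16.0, 20.0]
y = [5.2, 6.5, 11.9, 14.0, 22.5, 25.1, 37.0, 40.8]
b1_gls, b2_gls = gls_variance_proportional_to_x(x, y)
print(b1_gls, b2_gls)
```

Observations with large x_i receive small weights, which is how the estimator discounts the observations that carry the most noise.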
Applying the generalized (weighted) least squares procedure to our food expenditure problem:
- A 95% confidence interval for β2 is given by:
Another form of heteroskedasticity is where the sample can be divided into two or more groups with each group having a different error variance
Most software has a weighted least squares or generalized least squares option
The separate least squares estimates based on separate error variances are:
- But we have two estimates for β2 and two for β3
- Obtain generalized least squares estimates by dividing each observation by the standard deviation of the corresponding error term, σM or σR:
But σM and σR are unknown
- Transforming the observations with their estimates
- This yields a feasible generalized least squares estimator that has good properties in large samples.
Also, the intercepts are different
- Handle the different intercepts by including METRO
The method for obtaining feasible generalized least squares estimates:
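- Obtain estimates $\hat{\sigma}_M$ and $\hat{\sigma}_R$ by applying least squares separately to the metropolitan and rural observations.
Let $\hat{\sigma}_i = \hat{\sigma}_M$ when $METRO_i = 1$ and $\hat{\sigma}_i = \hat{\sigma}_R$ when $METRO_i = 0$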
- Apply least squares to the transformed model
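The two-group procedure can be sketched on synthetic data with one explanatory variable and a group dummy standing in for METRO. Everything in the block is hypothetical: the function names, the data-generating process, and the use of a single regressor instead of EDUC and EXPER. Least squares is solved through the normal equations with a small Gauss-Jordan routine so that the block needs only the standard library.

```python
import math
import random

def ols(X, y):
    """Least squares via the normal equations, solved by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def grouped_fgls(x, group, y):
    """Feasible GLS when the error variance differs between group 0 and group 1."""
    sigma = {}
    for g in (0, 1):
        # Step 1: separate least squares per group to estimate each error variance
        Xg = [[1.0, xi] for xi, gi in zip(x, group) if gi == g]
        yg = [yi for yi, gi in zip(y, group) if gi == g]
        b = ols(Xg, yg)
        sse = sum((yi - b[0] - b[1] * row[1]) ** 2 for row, yi in zip(Xg, yg))
        sigma[g] = math.sqrt(sse / (len(yg) - 2))
    # Step 2: divide every variable by the group's sigma-hat; the dummy handles the intercept shift
    X_star = [[1.0 / sigma[gi], xi / sigma[gi], gi / sigma[gi]] for xi, gi in zip(x, group)]
    y_star = [yi / sigma[gi] for yi, gi in zip(y, group)]
    # Step 3: least squares on the transformed model
    return ols(X_star, y_star), sigma

# Synthetic data (made up): group 1 has error sd 3, group 0 has error sd 1
random.seed(3)
group = [i % 2 for i in range(600)]
x = [random.uniform(0.0, 10.0) for _ in group]
y = [1.0 + 2.0 * xi + 3.0 * gi + random.gauss(0.0, 3.0 if gi else 1.0)
     for xi, gi in zip(x, group)]

coef, sigma_hat = grouped_fgls(x, group, y)
print(coef, sigma_hat)
```

The returned coefficients are, in order, the intercept for group 0, the common slope, and the intercept shift for group 1.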
The estimated equation is:
Consider a more general specification of the error variance:
where γ is an unknown parameter
To handle this, take logs
(8.38)
and then exponentiate:
Generalized Least Squares: Unknown Form of Variance
We can extend this function to:
(8.39)
Now write Eq. 8.38 as:
(8.40)
- To estimate α1 and α2 we recall our basic model:
Apply the least squares strategy to Eq. 8.40 using $\hat{e}_i^2$:
For the food expenditure data, we have:
In line with the more general specification in Eq. 8.39, we can obtain variance estimates from:
and then divide both sides of the equation by $\hat{\sigma}_i$
This works because dividing Eq. 8.26 by $\sigma_i$ yields:
The error term is homoskedastic:
To obtain a generalized least squares estimator for β1 and β2, define the transformed variables:
and apply least squares to:
(8.44)
To summarize for the general case, suppose our model is:
(8.45)
where:
The steps for obtaining a generalized least squares estimator are:
Estimate Eq. 8.45 by least squares and compute the squares of the least squares residuals
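Estimate $\alpha_1, \alpha_2, \ldots, \alpha_S$ by applying least squares to the equation $\ln \hat{e}_i^2 = \alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS} + v_i$
Compute variance estimates $\hat{\sigma}_i^2 = \exp(\hat{\alpha}_1 + \hat{\alpha}_2 z_{i2} + \cdots + \hat{\alpha}_S z_{iS})$
Compute the transformed observations defined by Eq. 8.43, including $x_{i3}^*, \ldots, x_{iK}^*$ if $K > 2$
Apply least squares to Eq. 8.44, or to an extended version of (8.44), if $K > 2$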
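The steps above can be sketched for the simplest case, K = 2 with a single variance-function variable z_i = ln(x_i). The block below is a minimal illustration on synthetic data; the function names, the data-generating process, and the seed are hypothetical, and the code assumes x_i > 0 and that no residual is exactly zero, so that the logarithms exist.

```python
import math
import random

def ols_two_regressors(x1, x2, y):
    """Least squares for y = b1*x1 + b2*x2, via the 2x2 normal equations."""
    s11 = sum(a * a for a in x1)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s22 = sum(b * b for b in x2)
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

def fgls_multiplicative(x, y):
    """Feasible GLS for y = b1 + b2*x + e with var(e_i) = exp(a1 + a2*ln(x_i))."""
    ones = [1.0] * len(x)
    # Step 1: least squares on the original model, then log squared residuals
    b1, b2 = ols_two_regressors(ones, x, y)
    ln_e2 = [math.log((yi - b1 - b2 * xi) ** 2) for xi, yi in zip(x, y)]
    # Step 2: regress ln(e_hat^2) on z = ln(x) to estimate a1 and a2
    z = [math.log(xi) for xi in x]
    a1, a2 = ols_two_regressors(ones, z, ln_e2)
    # Step 3: estimated standard deviations sigma_hat_i = sqrt(exp(a1 + a2*z_i))
    sigma_hat = [math.sqrt(math.exp(a1 + a2 * zi)) for zi in z]
    # Step 4: transformed observations y*, x1*, x2*
    y_star = [yi / s for yi, s in zip(y, sigma_hat)]
    x1_star = [1.0 / s for s in sigma_hat]
    x2_star = [xi / s for xi, s in zip(x, sigma_hat)]
    # Step 5: least squares on the transformed model
    return ols_two_regressors(x1_star, x2_star, y_star), (a1, a2)

# Synthetic data (made up): sd(e_i) = 0.5 * x_i, so the variance is proportional to x_i^2
random.seed(4)
x = [random.uniform(1.0, 20.0) for _ in range(500)]
y = [5.0 + 2.0 * xi + random.gauss(0.0, 0.5 * xi) for xi in x]

(b1_fgls, b2_fgls), (a1_hat, a2_hat) = fgls_multiplicative(x, y)
print(b1_fgls, b2_fgls, a2_hat)
```

In this made-up setup the slope of the variance regression estimates the exponent on x in the variance function, which is 2 by construction.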
Following these steps for our food expenditure problem:
The estimates for β1 and β2 have not changed much
There has been a considerable drop in the standard errors, which under the previous specification were $se(\hat{\beta}_1) = 23.79$ and $se(\hat{\beta}_2) = 1.39$
Robust standard errors can be used not only to guard against the possible presence of heteroskedasticity when using least squares, but also to guard against the possible misspecification of a variance function when using generalized least squares
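Equations referenced in the text, in order of appearance:

Basic linear function:
$$E(y) = \beta_1 + \beta_2 x$$
Error term:
$$e_i = y_i - E(y_i) = y_i - \beta_1 - \beta_2 x_i$$
Least squares assumptions on the error (the constant-variance assumption is the one violated):
$$E(e_i) = 0, \qquad \mathrm{var}(e_i) = \sigma^2, \qquad \mathrm{cov}(e_i, e_j) = 0$$
Variance under heteroskedasticity:
$$\mathrm{var}(y_i) = \mathrm{var}(e_i) = h(x_i)$$
Food expenditure example, fitted line and residuals:
$$\hat{y} = 83.42 + 10.21\,x$$
$$\hat{e}_i = y_i - 83.42 - 10.21\,x_i$$
Model with homoskedastic errors and the variance of b2:
$$y_i = \beta_1 + \beta_2 x_i + e_i, \qquad \mathrm{var}(e_i) = \sigma^2$$
$$\mathrm{var}(b_2) = \frac{\sigma^2}{\sum_{i=1}^{N} (x_i - \bar{x})^2} \tag{8.6}$$
Model with heteroskedastic errors and the variance of b2:
$$y_i = \beta_1 + \beta_2 x_i + e_i, \qquad \mathrm{var}(e_i) = \sigma_i^2$$
$$\mathrm{var}(b_2) = \sum_{i=1}^{N} w_i^2 \sigma_i^2 = \frac{\sum_{i=1}^{N} \left[ (x_i - \bar{x})^2 \sigma_i^2 \right]}{\left[ \sum_{i=1}^{N} (x_i - \bar{x})^2 \right]^2} \tag{8.8}$$
Condition under which the usual standard errors are incorrect:
$$\mathrm{var}(e_i) = \sigma_i^2$$
Fitted values used in residual plots:
$$\hat{y}_i$$
General multiple regression model:
$$E(y_i) = \beta_1 + \beta_2 x_{i2} + \cdots + \beta_K x_{iK} \tag{8.9}$$
General variance function:
$$\mathrm{var}(y_i) = \sigma_i^2 = E(e_i^2) = h(\alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS}) \tag{8.10}$$
Two possible functions for $h(\cdot)$:
$$h(\alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS}) = \exp(\alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS}) \tag{8.11}$$
$$h(\alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS}) = \alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS} \tag{8.12}$$
Requirement in the linear case:
$$h(\cdot) > 0$$
Absence of heteroskedasticity:
$$\alpha_2 = \alpha_3 = \cdots = \alpha_S = 0 \quad \Longrightarrow \quad h(\alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS}) = h(\alpha_1)$$
Null and alternative hypotheses:
$$H_0: \alpha_2 = \alpha_3 = \cdots = \alpha_S = 0$$
$$H_1: \text{not all the } \alpha_s \text{ in } H_0 \text{ are zero}$$
Test statistic derivation:
$$\mathrm{var}(y_i) = \sigma_i^2 = E(e_i^2) = \alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS}$$
$$v_i = e_i^2 - E(e_i^2)$$
$$e_i^2 = E(e_i^2) + v_i = \alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS} + v_i$$
General regression model studied earlier:
$$y_i = E(y_i) + e_i = \beta_1 + \beta_2 x_{i2} + \cdots + \beta_K x_{iK} + e_i$$
Auxiliary regression with squared residuals $\hat{e}_i^2$ in place of $e_i^2$:
$$\hat{e}_i^2 = \alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS} + v_i \tag{8.17}$$
Distribution of the test statistic under H0:
$$\chi^2 = N \times R^2 \sim \chi^2_{(S-1)}$$
White test:
$$E(y) = \beta_1 + \beta_2 x_2 + \beta_3 x_3$$
$$z_2 = x_2, \qquad z_3 = x_3, \qquad z_4 = x_2^2, \qquad z_5 = x_3^2$$
$$z_6 = x_2 x_3$$
$$\chi^2 = N \times R^2$$
Food expenditure example, Breusch-Pagan version:
$$\hat{e}_i^2 = \alpha_1 + \alpha_2 x_i + v_i$$
$$R^2 = 1 - \frac{SSE}{SST} = 0.1846$$
$$\chi^2 = N \times R^2 = 40 \times 0.1846 = 7.38$$
Food expenditure example, White version:
$$\hat{e}_i^2 = \alpha_1 + \alpha_2 x_i + \alpha_3 x_i^2 + v_i$$
$$\chi^2 = N \times R^2 = 40 \times 0.18888 = 7.555, \qquad p\text{-value} = 0.023$$
Estimated wage equation (standard errors in parentheses):
$$\widehat{WAGE} = -9.914 + 1.234\,EDUC + 0.133\,EXPER + 1.524\,METRO$$
$$(se) \quad (1.08) \quad (0.070) \quad (0.015) \quad (0.431)$$
Goldfeld-Quandt test, equations for the two groups:
$$WAGE_{Mi} = \beta_{M1} + \beta_{M2} EDUC_{Mi} + \beta_{M3} EXPER_{Mi} + e_{Mi}, \qquad i = 1, 2, \ldots, N_M \tag{8.20a}$$
$$WAGE_{Ri} = \beta_{R1} + \beta_{R2} EDUC_{Ri} + \beta_{R3} EXPER_{Ri} + e_{Ri}, \qquad i = 1, 2, \ldots, N_R \tag{8.20b}$$
Null hypothesis:
$$\sigma_M^2 = \sigma_R^2$$
Test statistic:
$$F = \frac{\hat{\sigma}_M^2 / \sigma_M^2}{\hat{\sigma}_R^2 / \sigma_R^2} \sim F_{(N_M - K_M,\; N_R - K_R)}$$
Hypotheses to test:
$$H_0: \sigma_M^2 = \sigma_R^2 \quad \text{against} \quad H_1: \sigma_M^2 \neq \sigma_R^2$$
Test statistic when H0 is true:
$$F = \frac{\hat{\sigma}_M^2}{\hat{\sigma}_R^2}$$
Wage example, variance estimates and F value:
$$\hat{\sigma}_M^2 = 31.824, \qquad \hat{\sigma}_R^2 = 15.243$$
$$F = \frac{\hat{\sigma}_M^2}{\hat{\sigma}_R^2} = \frac{31.824}{15.243} = 2.09$$
Food expenditure example, variance estimates and F value:
$$\hat{\sigma}_1^2 = 3574.8, \qquad \hat{\sigma}_2^2 = 12921.9$$
$$F = \frac{\hat{\sigma}_2^2}{\hat{\sigma}_1^2} = \frac{12921.9}{3574.8} = 3.61$$
Variance of b2 under heteroskedasticity:
$$\mathrm{var}(b_2) = \frac{\sum_{i=1}^{N} \left[ (x_i - \bar{x})^2 \sigma_i^2 \right]}{\left[ \sum_{i=1}^{N} (x_i - \bar{x})^2 \right]^2}$$
White variance estimator for K = 2:
$$\widehat{\mathrm{var}}(b_2) = \frac{N}{N - 2} \cdot \frac{\sum_{i=1}^{N} \left[ (x_i - \bar{x})^2 \hat{e}_i^2 \right]}{\left[ \sum_{i=1}^{N} (x_i - \bar{x})^2 \right]^2}$$
Food expenditure example with both sets of standard errors:
$$\hat{y} = 83.42 + 10.21\,x$$
$$(27.46) \quad (1.81) \quad \text{(White se)}$$
$$(43.41) \quad (2.09) \quad \text{(incorrect se)}$$
The two 95% confidence intervals for β2:
$$\text{White:} \quad b_2 \pm t_c\, se(b_2) = 10.21 \pm 2.024 \times 1.81 = [6.55,\ 13.87]$$
$$\text{Incorrect:} \quad b_2 \pm t_c\, se(b_2) = 10.21 \pm 2.024 \times 2.09 = [5.97,\ 14.45]$$
Food expenditure model with heteroskedasticity:
$$y_i = \beta_1 + \beta_2 x_i + e_i$$
$$E(e_i) = 0, \qquad \mathrm{var}(e_i) = \sigma_i^2, \qquad \mathrm{cov}(e_i, e_j) = 0 \quad (i \neq j) \tag{8.26}$$
Assumed variance structure:
$$\mathrm{var}(e_i) = \sigma_i^2 = \sigma^2 x_i \tag{8.27}$$
Transformed model:
$$\frac{y_i}{\sqrt{x_i}} = \beta_1 \left( \frac{1}{\sqrt{x_i}} \right) + \beta_2 \left( \frac{x_i}{\sqrt{x_i}} \right) + \frac{e_i}{\sqrt{x_i}}$$
Transformed variables:
$$y_i^* = \frac{y_i}{\sqrt{x_i}}, \qquad x_{i1}^* = \frac{1}{\sqrt{x_i}}, \qquad x_{i2}^* = \frac{x_i}{\sqrt{x_i}} = \sqrt{x_i}, \qquad e_i^* = \frac{e_i}{\sqrt{x_i}} \tag{8.29}$$
$$y_i^* = \beta_1 x_{i1}^* + \beta_2 x_{i2}^* + e_i^* \tag{8.30}$$
Homoskedasticity of the transformed error:
$$\mathrm{var}(e_i^*) = \mathrm{var}\left( \frac{e_i}{\sqrt{x_i}} \right) = \frac{1}{x_i} \mathrm{var}(e_i) = \frac{1}{x_i} \sigma^2 x_i = \sigma^2 \tag{8.31}$$
Weighted least squares, sum of squared transformed errors (weights $x_i^{-1/2}$):
$$\sum_{i=1}^{N} e_i^{*2} = \sum_{i=1}^{N} \frac{e_i^2}{x_i} = \sum_{i=1}^{N} \left( x_i^{-1/2} e_i \right)^2$$
Generalized least squares estimates for the food expenditure example:
$$\hat{y}_i = 78.68 + 10.45\,x_i$$
$$(se) \quad (23.79) \quad (1.39)$$
$$\hat{\beta}_2 \pm t_c\, se(\hat{\beta}_2) = 10.451 \pm 2.024 \times 1.386 = [7.65,\ 13.26]$$
Heteroskedastic partition, wage equations for the two groups:
$$WAGE_{Mi} = \beta_{M1} + \beta_2 EDUC_{Mi} + \beta_3 EXPER_{Mi} + e_{Mi}, \qquad i = 1, 2, \ldots, N_M$$
$$WAGE_{Ri} = \beta_{R1} + \beta_2 EDUC_{Ri} + \beta_3 EXPER_{Ri} + e_{Ri}, \qquad i = 1, 2, \ldots, N_R$$
$$\hat{\sigma}_M^2 = 31.824, \qquad \hat{\sigma}_R^2 = 15.243$$
Separate least squares estimates:
$$b_{M1} = -9.052, \qquad b_{M2} = 1.282, \qquad b_{M3} = 0.1346$$
$$b_{R1} = -6.166, \qquad b_{R2} = 0.956, \qquad b_{R3} = 0.1260$$
Observations divided by the error standard deviations:
$$\frac{WAGE_{Mi}}{\sigma_M} = \beta_{M1} \left( \frac{1}{\sigma_M} \right) + \beta_2 \left( \frac{EDUC_{Mi}}{\sigma_M} \right) + \beta_3 \left( \frac{EXPER_{Mi}}{\sigma_M} \right) + \frac{e_{Mi}}{\sigma_M}, \qquad i = 1, 2, \ldots, N_M$$
$$\frac{WAGE_{Ri}}{\sigma_R} = \beta_{R1} \left( \frac{1}{\sigma_R} \right) + \beta_2 \left( \frac{EDUC_{Ri}}{\sigma_R} \right) + \beta_3 \left( \frac{EXPER_{Ri}}{\sigma_R} \right) + \frac{e_{Ri}}{\sigma_R}, \qquad i = 1, 2, \ldots, N_R$$
Feasible generalized least squares:
$$\hat{\sigma}_i = \begin{cases} \hat{\sigma}_M & \text{when } METRO_i = 1 \\ \hat{\sigma}_R & \text{when } METRO_i = 0 \end{cases}$$
$$\frac{WAGE_i}{\hat{\sigma}_i} = \beta_{R1} \left( \frac{1}{\hat{\sigma}_i} \right) + \beta_2 \left( \frac{EDUC_i}{\hat{\sigma}_i} \right) + \beta_3 \left( \frac{EXPER_i}{\hat{\sigma}_i} \right) + \delta \left( \frac{METRO_i}{\hat{\sigma}_i} \right) + \frac{e_i}{\hat{\sigma}_i}$$
Estimated equation:
$$\widehat{WAGE} = -9.398 + 1.196\,EDUC + 0.132\,EXPER + 1.539\,METRO$$
$$(se) \quad (1.02) \quad (0.069) \quad (0.015) \quad (0.346)$$
More general specification of the error variance:
$$\mathrm{var}(e_i) = \sigma_i^2 = \sigma^2 x_i^{\gamma}$$
$$\ln(\sigma_i^2) = \ln(\sigma^2) + \gamma \ln(x_i)$$
$$\sigma_i^2 = \exp\left[ \ln(\sigma^2) + \gamma \ln(x_i) \right] = \exp(\alpha_1 + \alpha_2 z_i) \tag{8.38}$$
$$\sigma_i^2 = \exp(\alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS}) \tag{8.39}$$
$$\ln(\sigma_i^2) = \alpha_1 + \alpha_2 z_i \tag{8.40}$$
Basic model and the estimated variance function:
$$y_i = E(y_i) + e_i = \beta_1 + \beta_2 x_i + e_i$$
$$\ln(\hat{e}_i^2) = \ln(\sigma_i^2) + v_i = \alpha_1 + \alpha_2 z_i + v_i$$
Food expenditure data:
$$\widehat{\ln(\sigma_i^2)} = 0.9378 + 2.329\,z_i$$
Variance estimates:
$$\hat{\sigma}_i^2 = \exp(\hat{\alpha}_1 + \hat{\alpha}_2 z_i)$$
Dividing Eq. 8.26 by $\sigma_i$:
$$\frac{y_i}{\sigma_i} = \beta_1 \left( \frac{1}{\sigma_i} \right) + \beta_2 \left( \frac{x_i}{\sigma_i} \right) + \frac{e_i}{\sigma_i}$$
$$\mathrm{var}\left( \frac{e_i}{\sigma_i} \right) = \frac{1}{\sigma_i^2} \mathrm{var}(e_i) = \frac{1}{\sigma_i^2} \sigma_i^2 = 1$$
Transformed variables and transformed model:
$$y_i^* = \frac{y_i}{\hat{\sigma}_i}, \qquad x_{i1}^* = \frac{1}{\hat{\sigma}_i}, \qquad x_{i2}^* = \frac{x_i}{\hat{\sigma}_i} \tag{8.43}$$
$$y_i^* = \beta_1 x_{i1}^* + \beta_2 x_{i2}^* + e_i^* \tag{8.44}$$
General case:
$$y_i = \beta_1 + \beta_2 x_{i2} + \cdots + \beta_K x_{iK} + e_i \tag{8.45}$$
$$\mathrm{var}(e_i) = \sigma_i^2 = \exp(\alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS})$$
$$\ln \hat{e}_i^2 = \alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS} + v_i$$
$$\hat{\sigma}_i^2 = \exp(\hat{\alpha}_1 + \hat{\alpha}_2 z_{i2} + \cdots + \hat{\alpha}_S z_{iS})$$
Additional transformed variables when $K > 2$:
$$x_{i3}^*, \ldots, x_{iK}^*$$
Generalized least squares estimates for the food expenditure problem:
$$\hat{y}_i = 76.05 + 10.63\,x$$
$$(se) \quad (9.71) \quad (0.97)$$
Standard errors under the previous specification:
$$se(\hat{\beta}_1) = 23.79 \quad \text{and} \quad se(\hat{\beta}_2) = 1.39$$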