Interval Estimation and Hypothesis Testing

Estimates are of two types:

1. Point estimates

- The estimate b2 is a point estimate of the unknown population parameter β2 in the regression model.

2. Interval estimates

- Interval estimation provides a range of values within which the true parameter is likely to fall

- This range gives an idea of what the parameter value is likely to be, and of the precision with which it is estimated

- Such intervals are often referred to as confidence intervals.

Interval Estimation

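The least squares estimator b2 of β2 is normally distributed:

$$b_2 \sim N\left(\beta_2,\ \frac{\sigma^2}{\sum (x_i - \bar{x})^2}\right)$$

The standardized version of the above distribution can be written as

$$Z = \frac{b_2 - \beta_2}{\sqrt{\sigma^2 / \sum (x_i - \bar{x})^2}} \sim N(0, 1)$$

It is known that:

$$P(-1.96 \le Z \le 1.96) = 0.95$$

Substituting the standardized expression for Z, we get

$$P\left(-1.96 \le \frac{b_2 - \beta_2}{\sqrt{\sigma^2 / \sum (x_i - \bar{x})^2}} \le 1.96\right) = 0.95$$

Rearranging, we get

$$P\left(b_2 - 1.96\sqrt{\sigma^2 / \sum (x_i - \bar{x})^2} \le \beta_2 \le b_2 + 1.96\sqrt{\sigma^2 / \sum (x_i - \bar{x})^2}\right) = 0.95$$

The two end-points provide an interval estimator:

$$b_2 \pm 1.96\sqrt{\sigma^2 / \sum (x_i - \bar{x})^2}$$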

If the sampling is repeated a large number of times, 95% of the intervals constructed this way will contain the true value of the parameter β2.

This interval estimator relies on assumption SR6 and on the error variance σ2 being known.
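The repeated-sampling claim can be checked numerically. The following is a minimal sketch in Python (standard library only) that draws many samples from a simple linear regression with known σ and counts how often the interval b2 ± 1.96·sqrt(σ²/Σ(xi − x̄)²) contains β2. The parameter values (β1 = 80, β2 = 10, σ = 90), the regressor values 1, ..., 40 and the function name `coverage_known_sigma` are illustrative assumptions, not values from the lecture.

```python
import math
import random


def coverage_known_sigma(beta1=80.0, beta2=10.0, sigma=90.0, n=40, reps=2000, seed=0):
    """Share of known-variance 95% intervals that contain the true beta2."""
    rng = random.Random(seed)
    x = [float(i) for i in range(1, n + 1)]       # fixed regressor values (hypothetical)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)      # sum of squared deviations of x
    half_width = 1.96 * math.sqrt(sigma ** 2 / sxx)
    hits = 0
    for _ in range(reps):
        # New sample: same x, fresh normal errors (assumption SR6)
        y = [beta1 + beta2 * xi + rng.gauss(0.0, sigma) for xi in x]
        y_bar = sum(y) / n
        b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
        if b2 - half_width <= beta2 <= b2 + half_width:
            hits += 1
    return hits / reps


coverage = coverage_known_sigma()
print(f"share of intervals containing beta2: {coverage:.3f}")
```

The printed share should be close to 0.95. Any single interval either contains β2 or it does not; the 95% describes the procedure over many samples.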

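In a finite sample, the unknown error variance $\sigma^2$ is replaced by its estimate $\hat{\sigma}^2$, and so we have

$$t = \frac{b_2 - \beta_2}{\sqrt{\hat{\sigma}^2 / \sum (x_i - \bar{x})^2}} = \frac{b_2 - \beta_2}{\sqrt{\widehat{\mathrm{var}}(b_2)}} = \frac{b_2 - \beta_2}{\mathrm{se}(b_2)} \sim t_{(N-2)}$$

If assumptions SR1-SR6 hold in the simple linear regression model, then

$$t = \frac{b_k - \beta_k}{\mathrm{se}(b_k)} \sim t_{(N-2)} \quad \text{for } k = 1, 2$$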

The t-distribution is a bell-shaped curve centered at zero

It looks similar to the standard normal distribution, but it is more spread out: it has a larger variance and thicker tails

The shape of the t-distribution is controlled by a single parameter called the degrees of freedom (df)

Figure 3.1 Critical values from a t-distribution

A critical value tc from a t-distribution can be found so that:

$$P(t \ge t_c) = P(t \le -t_c) = \frac{\alpha}{2}$$

where α is a probability often taken to be α = 0.01 or α = 0.05.

The critical value tc for degrees of freedom m is the percentile value t(1-α/2, m)

Each shaded "tail" area contains α/2 of the probability, so that 1-α of the probability is contained in the center portion.
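Critical values such as t(1-α/2, m) are normally read from a table or from software. As a hedged illustration, the sketch below computes them with the Python standard library only: it integrates the t density numerically (Simpson's rule) and inverts the CDF by bisection. The function names `t_pdf`, `t_cdf` and `t_quantile` are our own, and the numerical scheme is one convenient choice rather than the method used in the lecture.

```python
import math


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))


def t_cdf(x, df, steps=2000):
    """P(t <= x): Simpson's rule on [0, |x|], then symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                      # P(0 <= t <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area


def t_quantile(p, df):
    """Percentile t(p, df), found by bisection on the CDF."""
    lo, hi = -100.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


alpha, df = 0.05, 38
tc = t_quantile(1 - alpha / 2, df)
print(f"t({1 - alpha / 2}, {df}) = {tc:.3f}")
```

For α = 0.05 and 38 degrees of freedom this should agree, to three decimals, with the tabulated critical value 2.024 used for the food expenditure data.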

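- This means we can make the probability statement

$$P(-t_c \le t \le t_c) = 1 - \alpha$$

or

$$P\left(-t_c \le \frac{b_k - \beta_k}{\mathrm{se}(b_k)} \le t_c\right) = 1 - \alpha$$

or

$$P\left[b_k - t_c\,\mathrm{se}(b_k) \le \beta_k \le b_k + t_c\,\mathrm{se}(b_k)\right] = 1 - \alpha$$

Once bk and se(bk) are estimated from a particular data sample, bk ± tc se(bk) is referred to as a 100(1-α)% interval estimate of βk.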

Important points to note:

- The properties of the interval estimation procedure are based on the notion of repeated sampling

- Any particular interval estimate based on just one sample of data may or may not contain the true parameter βk. As βk is unknown, there is no way of knowing whether it does or does not

- In discussing confidence intervals one must appreciate that our confidence is in the procedure used to construct the interval estimate and not in any one interval estimate calculated from a particular sample of data

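In the case of food expenditure data

- The critical value tc = 2.024, which holds for α = .05 and 38 degrees of freedom, gives

$$P\left[b_2 - 2.024\,\mathrm{se}(b_2) \le \beta_2 \le b_2 + 2.024\,\mathrm{se}(b_2)\right] = 0.95$$

- In order to construct an interval estimate for β2 we use the least squares estimate b2 = 10.21 and its standard error

$$\mathrm{se}(b_2) = \sqrt{\widehat{\mathrm{var}}(b_2)} = \sqrt{4.38} = 2.09$$

Example

A 95% confidence interval estimate for β2 is given by

$$b_2 \pm t_c\,\mathrm{se}(b_2) = 10.21 \pm 2.024(2.09) = [5.97, 14.45]$$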

- When the procedure we used is applied to many random samples of data from the same population, then 95% of all the interval estimates constructed using this procedure will contain the true parameter
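The arithmetic behind the food expenditure interval can be reproduced in a few lines. This sketch uses the rounded values quoted in the slides (b2 = 10.21, se(b2) = 2.09, tc = 2.024); the helper name `interval_estimate` is our own. With these rounded inputs the endpoints may differ from the reported [5.97, 14.45] in the last digit, presumably because the reported interval was computed from unrounded estimates.

```python
def interval_estimate(b, se, tc):
    """Return the end-points b - tc*se and b + tc*se of an interval estimate."""
    return b - tc * se, b + tc * se


b2 = 10.21     # least squares estimate (from the slides)
se_b2 = 2.09   # standard error of b2 (from the slides)
tc = 2.024     # critical value t(0.975, 38) (from the slides)

lower, upper = interval_estimate(b2, se_b2, tc)
print(f"95% interval estimate for beta2: [{lower:.2f}, {upper:.2f}]")
```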

Is β2 actually in the interval [5.97, 14.45]?

- We do not know, and we will never know

- What we do know is that when the procedure we used is applied to many random samples of data from the same population, then 95% of all the interval estimates constructed using this procedure will contain the true parameter

- In other words, the interval estimation procedure works 95% of the time

Given the reliability of the procedure, the most one can say is that one would be "surprised" if β2 were not in the interval [5.97, 14.45].

The point estimate alone gives no sense of reliability

- Interval estimates incorporate both the point estimate and its standard error, which measures the variability of the least squares estimator.

Table 3.1 Least Squares Estimates from 10 Random Samples

Table 3.2 Interval Estimates from 10 Random Samples

Hypothesis testing procedures compare a conjecture we have about a population to the information contained in a sample of data

- Hypotheses are formed about economic behavior based on an economic and statistical model

- These hypotheses are then used to make statements about model parameters

- In order to draw a conclusion about the hypothesis, hypothesis tests use the information about a parameter that is contained in a sample of data under study, its least squares point estimate, and its standard error.

Hypothesis Testing

Components of Hypothesis Tests

1. A null hypothesis H0

2. An alternative hypothesis H1

3. A test statistic

4. A rejection region

5. A conclusion

A null hypothesis is the belief we retain until the sample evidence convinces us that it is not true. If this turns out to be the case then we reject the null hypothesis

The null hypothesis is stated as H0 : βk = c, where c is a constant that is an important value in the context of a specific regression model

Every null hypothesis has a logical alternative hypothesis H1 that we will accept if the null hypothesis is rejected

- The alternative hypothesis is usually flexible and depends to some extent on economic theory

Possible alternative hypotheses are:

H1 : βk > c

H1 : βk < c

H1 : βk ≠ c

Based on the value of a test statistic one decides whether to reject the null hypothesis or not to reject it

A test statistic has a special characteristic: its probability distribution is completely known when the null hypothesis is true, and it has some other distribution if the null hypothesis is not true

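The test statistic is given by:

$$t = \frac{b_k - \beta_k}{\mathrm{se}(b_k)} \sim t_{(N-2)}$$

- If the null hypothesis H0 : βk = c is true, then we can substitute c for βk and we have:

$$t = \frac{b_k - c}{\mathrm{se}(b_k)} \sim t_{(N-2)} \qquad (3.7)$$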

- If the null hypothesis is not true, then the t-statistic in the above equation does not have a t-distribution with N-2 degrees of freedom

The rejection region depends on the form of the alternative

- To reject the null hypothesis one needs to look at the range of values of the test statistic

- A rejection region can be constructed only if we have:

  - A test statistic whose distribution is known when the null hypothesis is true

  - An alternative hypothesis

  - A level of significance

The rejection region consists of values that are unlikely and that have low probability of occurring when the null hypothesis is true

- The chain of logic is:

"If a value of the test statistic is obtained that falls in a region of low probability, then it is unlikely that the test statistic has the assumed distribution, and thus it is unlikely that the null hypothesis is true"

If the alternative hypothesis is true, the values of the test statistic will tend to be unusually large or unusually small. What counts as "large" or "small" is determined by choosing a probability α, the level of significance of the test, which is usually chosen to be 0.01, 0.05 or 0.10

There are two types of error:

- Type I error: rejecting the null hypothesis when it is in fact true

The level of significance of a test is the probability of committing a Type I error

P(Type I error) = α

- Type II error: failing to reject a null hypothesis that is false

Types of error
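The statement P(Type I error) = α can be illustrated by simulation. The sketch below (Python standard library only) generates samples for which H0: β2 = c is true by construction, computes the t-statistic with the estimated error variance, and records how often a two-tail test at α = 0.05 rejects. The true parameter values, the regressor values and the function name `type_one_error_rate` are illustrative assumptions; the critical value 2.024 is the one quoted in the lecture for 38 degrees of freedom, so N is set to 40.

```python
import math
import random


def type_one_error_rate(t_crit=2.024, n=40, reps=2000, seed=1):
    """Share of samples in which a true H0: beta2 = c is rejected (two-tail test)."""
    rng = random.Random(seed)
    beta1, beta2, sigma = 80.0, 7.5, 90.0    # hypothetical true values
    c = beta2                                 # the null hypothesis is true by construction
    x = [float(i) for i in range(1, n + 1)]   # fixed regressor values (hypothetical)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    rejections = 0
    for _ in range(reps):
        y = [beta1 + beta2 * xi + rng.gauss(0.0, sigma) for xi in x]
        y_bar = sum(y) / n
        b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
        b1 = y_bar - b2 * x_bar
        sse = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
        se_b2 = math.sqrt(sse / (n - 2) / sxx)    # sigma^2 estimated with N - 2 df
        t = (b2 - c) / se_b2
        if abs(t) >= t_crit:
            rejections += 1
    return rejections / reps


rate = type_one_error_rate()
print(f"share of true null hypotheses rejected: {rate:.3f}")
```

The printed share should be close to 0.05, the chosen level of significance.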

It is a good practice to avoid saying that you ‘‘accept’’ the null hypothesis

The standard practice is to say what the conclusion means in the economic context of the problem you are working on and the economic significance of the finding

Common Practice

To have a rejection region for a null hypothesis, we need:

- A test statistic

- A specific alternative

- A level of significance, α, for the test

Rejection Regions for Specific Alternatives

When testing the null hypothesis H0:βk = c against the alternative hypothesis H1:βk > c, reject the null hypothesis and accept the alternative hypothesis if

t ≥ t(1-α;N-2)

Figure 3.2 Rejection region for a one-tail test of H0:βk = c against H1:βk > c

When testing the null hypothesis H0:βk = c against the alternative hypothesis H1:βk < c, reject the null hypothesis and accept the alternative hypothesis if

t ≤ t(α;N-2)

Figure 3.3 Rejection region for a one-tail test of H0:βk = c against H1:βk < c

When testing the null hypothesis H0:βk = c against the alternative hypothesis H1:βk ≠ c, reject the null hypothesis and accept the alternative hypothesis if

t ≤ t(α/2;N-2) or t ≥ t(1-α/2;N-2)

Figure 3.4 Rejection region for a test of H0:βk = c against H1:βk ≠ c
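The three rejection rules can be written as one small decision function. This is a minimal sketch; the function name `reject_null`, its argument names and the value t = 2.0 in the usage lines are hypothetical, and the critical values are supplied by the caller (here the values for 38 degrees of freedom quoted in this lecture).

```python
def reject_null(t, alternative, lower_crit=None, upper_crit=None):
    """Return True if the t-statistic falls in the rejection region.

    alternative: ">"  for H1: beta_k > c   (needs upper_crit = t(1-alpha; N-2))
                 "<"  for H1: beta_k < c   (needs lower_crit = t(alpha; N-2))
                 "!=" for H1: beta_k != c  (needs both, t(alpha/2) and t(1-alpha/2))
    """
    if alternative == ">":
        return t >= upper_crit
    if alternative == "<":
        return t <= lower_crit
    if alternative == "!=":
        return t <= lower_crit or t >= upper_crit
    raise ValueError("alternative must be '>', '<' or '!='")


# Hypothetical t-statistic, critical values for 38 degrees of freedom
t_value = 2.0
right_tail = reject_null(t_value, ">", upper_crit=1.686)                      # alpha = 0.05
left_tail = reject_null(t_value, "<", lower_crit=-1.686)                      # alpha = 0.05
two_tail = reject_null(t_value, "!=", lower_crit=-2.024, upper_crit=2.024)    # alpha = 0.05
print(right_tail, left_tail, two_tail)
```

The same t-value can lead to rejection under a one-tail alternative and non-rejection under the two-tail alternative, because the two-tail test splits α between both tails.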

1. Determine the null and alternative hypotheses.

2. Specify the test statistic and its distribution if the null hypothesis is true.

3. Select α and determine the rejection region.

4. Calculate the sample value of the test statistic.

5. State your conclusion.

Step by step procedure

The null hypothesis is H0:β2 = 0

The alternative hypothesis is H1:β2 > 0

The test statistic is equation 3.7 above

In this case c = 0, so t = b2/se(b2) ~ t(N – 2) if the null hypothesis is true

Select α = 0.05

The critical value for the right-tail rejection region is the 95th percentile of the t-distribution with N – 2 = 38 degrees of freedom, t(0.95,38) = 1.686.

Thus we reject the null hypothesis if the calculated value of t ≥ 1.686.

If t < 1.686, we do not reject the null hypothesis.

Examples of Hypothesis Tests

In the food expenditure data, it was found that b2 = 10.21 with standard error se(b2) = 2.09

- Therefore, the value of the test statistic is:

$$t = \frac{b_2}{\mathrm{se}(b_2)} = \frac{10.21}{2.09} = 4.88$$

Since t = 4.88 > 1.686, we reject the null hypothesis that β2 = 0 and accept the alternative that β2 > 0

- In other words, we reject the hypothesis that there is no relationship between income and food expenditure, and conclude that there is a statistically significant positive relationship between household income and food expenditure

The null hypothesis is H0:β2 ≤ 5.5

- The alternative hypothesis is H1:β2 > 5.5

The test statistic is t = (b2 - 5.5)/se(b2) ~ t(N – 2) if the null hypothesis is true

Select α = 0.01

- The critical value for the right-tail rejection region is the 99th percentile of the t-distribution with N – 2 = 38 degrees of freedom, t(0.99,38) = 2.429

- Thus we will reject the null hypothesis if the calculated value of t ≥ 2.429

If t < 2.429, we will not reject the null hypothesis

One tail test

For the food expenditure data, the value of the test statistic is

$$t = \frac{b_2 - 5.5}{\mathrm{se}(b_2)} = \frac{10.21 - 5.5}{2.09} = 2.25$$

As t = 2.25 < 2.429 we do not reject the null hypothesis that β2 ≤ 5.5

- We are not in a position to conclude that the new supermarket will be profitable, so its construction will not begin

The null hypothesis is H0:β2 ≥ 15

- The alternative hypothesis is H1:β2 < 15

The test statistic is t = (b2 - 15)/se(b2) ~ t(N – 2) if the null hypothesis is true

Select α = 0.05

- The critical value for the left-tail rejection region is the 5th percentile of the t-distribution with N – 2 = 38 degrees of freedom, t(0.05,38) = -1.686.

- Thus we will reject the null hypothesis if the calculated value of t ≤ -1.686

- If t > -1.686, we will not reject the null hypothesis

Left tail test

For the food expenditure data, the value of the test statistic is:

$$t = \frac{b_2 - 15}{\mathrm{se}(b_2)} = \frac{10.21 - 15}{2.09} = -2.29$$

Since t = -2.29 < -1.686 we reject the null hypothesis that β2 ≥ 15 and accept the alternative that β2 < 15

- Hence, we can conclude that households spend less than $15 of each additional $100 of income on food

The null hypothesis is H0:β2 = 7.5

- The alternative hypothesis is H1:β2 ≠ 7.5

The test statistic is t = (b2 – 7.5)/se(b2) ~ t(N – 2) if the null hypothesis is true

Select α = 0.05

- The critical values for the two-tail rejection region are the 2.5th percentile of the t-distribution with N – 2 = 38 degrees of freedom, t(0.025,38) = -2.024, and the 97.5th percentile, t(0.975,38) = 2.024

- Thus we will reject the null hypothesis if the calculated value of t ≥ 2.024 or if t ≤ -2.024

Two tail test

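For the food expenditure data, the value of the test statistic is:

$$t = \frac{b_2 - 7.5}{\mathrm{se}(b_2)} = \frac{10.21 - 7.5}{2.09} = 1.29$$

Since -2.024 < t = 1.29 < 2.024 we do not reject the null hypothesis that β2 = 7.5

- The sample data are consistent with the conjecture that households spend an additional $7.50 of each additional $100 of income on food. Failing to reject the null hypothesis does not prove that β2 = 7.5.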
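The four food expenditure tests follow the same steps, so they can be run side by side. The sketch below recomputes each t-statistic from the rounded values b2 = 10.21 and se(b2) = 2.09 and compares it with the critical value quoted in the lecture. The layout of the `tests` list is our own. Because the inputs are rounded, a statistic may differ from the slide value in the last digit; the decisions are unaffected.

```python
b2, se_b2 = 10.21, 2.09   # estimates reported in the slides

# (description, hypothesized value c, form of H1, critical value from the slides)
tests = [
    ("H0: beta2 = 0 vs H1: beta2 > 0, alpha = 0.05", 0.0, ">", 1.686),
    ("H0: beta2 <= 5.5 vs H1: beta2 > 5.5, alpha = 0.01", 5.5, ">", 2.429),
    ("H0: beta2 >= 15 vs H1: beta2 < 15, alpha = 0.05", 15.0, "<", -1.686),
    ("H0: beta2 = 7.5 vs H1: beta2 != 7.5, alpha = 0.05", 7.5, "!=", 2.024),
]

decisions = []
for description, c, alternative, crit in tests:
    t = (b2 - c) / se_b2                 # equation (3.7) with the hypothesized value c
    if alternative == ">":
        reject = t >= crit               # right-tail rejection region
    elif alternative == "<":
        reject = t <= crit               # left-tail rejection region
    else:
        reject = abs(t) >= crit          # two-tail rejection region
    decisions.append(reject)
    verdict = "reject H0" if reject else "do not reject H0"
    print(f"{description}: t = {t:.3f}, {verdict}")
```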

Testing β2 = 0 using EVIEWS output

If the p-value of a test is available, we can determine the outcome of the test by comparing the p-value to the chosen level of significance, α, without looking up or calculating the critical values.

P-values

Reject the null hypothesis when the p-value is less than, or equal to, the level of significance α. That is, if p ≤ α then reject H0. If p > α then do not reject H0.

If t is the calculated value of the t-statistic, then:

- if H1: βk > c, p = probability to the right of t

- if H1: βk < c, p = probability to the left of t

- if H1: βk ≠ c, p = sum of probabilities to the right of |t| and to the left of –|t|

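For a right-tail test, the null hypothesis is H0: β2 ≤ 5.5

- The alternative hypothesis is H1: β2 > 5.5

- The value of the test statistic is

$$t = \frac{b_2 - 5.5}{\mathrm{se}(b_2)} = \frac{10.21 - 5.5}{2.09} = 2.25$$

- The p-value is

$$p = P\left[t_{(38)} \ge 2.25\right] = 1 - P\left[t_{(38)} \le 2.25\right] = 1 - 0.9848 = 0.0152$$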

Right-tail test

Figure 3.5 The p-value for a right-tail test

The null hypothesis is H0: β2 ≥ 15

The alternative hypothesis is H1: β2 < 15

With t = (10.21 - 15)/2.09 = -2.29, the p-value is

$$p = P\left[t_{(38)} \le -2.29\right] = 0.0139$$

Left tail test

Figure 3.6 The p-value for a left-tail test.

The null hypothesis is H0: β2 = 7.5

The alternative hypothesis is H1: β2 ≠ 7.5

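With t = (10.21 - 7.5)/2.09 = 1.29, the p-value is

$$p = P\left[t_{(38)} \ge 1.29\right] + P\left[t_{(38)} \le -1.29\right] = 0.2033$$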

Two tail test

Figure 3.7 The p-value for a two-tail test of significance
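The three p-value rules can be checked numerically. The sketch below uses only the Python standard library: it integrates the t density with Simpson's rule to obtain tail probabilities. The function names `t_pdf`, `t_cdf` and `p_value` are our own, and the integration scheme is an assumption of this example, not the method used by the software behind the slides. The t-values are the rounded ones from the examples, so the two-tail p-value in particular may differ slightly from the reported 0.2033.

```python
import math


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))


def t_cdf(x, df, steps=2000):
    """P(t <= x): Simpson's rule on [0, |x|], then symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def p_value(t, df, alternative):
    """p-value for H1: beta_k > c (">"), beta_k < c ("<") or beta_k != c ("!=")."""
    if alternative == ">":
        return 1.0 - t_cdf(t, df)              # probability to the right of t
    if alternative == "<":
        return t_cdf(t, df)                    # probability to the left of t
    if alternative == "!=":
        return 2.0 * (1.0 - t_cdf(abs(t), df)) # both tails beyond |t|
    raise ValueError("alternative must be '>', '<' or '!='")


p_right = p_value(2.25, 38, ">")
p_left = p_value(-2.29, 38, "<")
p_two = p_value(1.29, 38, "!=")
print(f"right-tail: {p_right:.4f}, left-tail: {p_left:.4f}, two-tail: {p_two:.4f}")
```

Comparing each p-value with α reproduces the earlier decisions: the right-tail p-value exceeds α = 0.01 (do not reject), the left-tail p-value is below α = 0.05 (reject), and the two-tail p-value exceeds α = 0.05 (do not reject).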

As an example of a linear combination λ = c1β1 + c2β2, if we let c1 = 1 and c2 = x0, then we have

$$\lambda = c_1\beta_1 + c_2\beta_2 = \beta_1 + x_0\beta_2 = E(y \mid x = x_0)$$

which is just our basic model

Linear Combinations of Parameters

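The estimator $\hat{\lambda} = c_1 b_1 + c_2 b_2$ is unbiased because

$$E(\hat{\lambda}) = E(c_1 b_1 + c_2 b_2) = c_1 E(b_1) + c_2 E(b_2) = c_1\beta_1 + c_2\beta_2 = \lambda$$

The variance of $\hat{\lambda}$ is

$$\mathrm{var}(\hat{\lambda}) = \mathrm{var}(c_1 b_1 + c_2 b_2) = c_1^2\,\mathrm{var}(b_1) + c_2^2\,\mathrm{var}(b_2) + 2 c_1 c_2\,\mathrm{cov}(b_1, b_2)$$

Replacing the unknown variances and covariance with their estimates, we have

$$\widehat{\mathrm{var}}(\hat{\lambda}) = \widehat{\mathrm{var}}(c_1 b_1 + c_2 b_2) = c_1^2\,\widehat{\mathrm{var}}(b_1) + c_2^2\,\widehat{\mathrm{var}}(b_2) + 2 c_1 c_2\,\widehat{\mathrm{cov}}(b_1, b_2)$$

The standard error of $\hat{\lambda}$ is the square root of the estimated variance

$$\mathrm{se}(\hat{\lambda}) = \mathrm{se}(c_1 b_1 + c_2 b_2) = \sqrt{\widehat{\mathrm{var}}(c_1 b_1 + c_2 b_2)}$$

The t-statistic for the linear combination is:

$$t = \frac{\hat{\lambda} - \lambda}{\mathrm{se}(\hat{\lambda})} = \frac{(c_1 b_1 + c_2 b_2) - (c_1\beta_1 + c_2\beta_2)}{\mathrm{se}(c_1 b_1 + c_2 b_2)} \sim t_{(N-2)}$$

Substituting the t value into P(-tc ≤ t ≤ tc) = 1 – α, we get:

$$P\left[(c_1 b_1 + c_2 b_2) - t_c\,\mathrm{se}(c_1 b_1 + c_2 b_2) \le c_1\beta_1 + c_2\beta_2 \le (c_1 b_1 + c_2 b_2) + t_c\,\mathrm{se}(c_1 b_1 + c_2 b_2)\right] = 1 - \alpha$$

so that the 100(1 – α)% interval is

$$(c_1 b_1 + c_2 b_2) \pm t_c\,\mathrm{se}(c_1 b_1 + c_2 b_2)$$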

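For our example, with c1 = 1 and c2 = x0 = 20 (income is measured in units of $100, so an income of $2,000 corresponds to x0 = 20), the estimated variances and covariance are:

            C            Income
C           1884.442     -85.9032
Income      -85.9032     4.3818

The estimated variance of our expected food expenditure is:

$$\widehat{\mathrm{var}}(b_1 + 20 b_2) = \widehat{\mathrm{var}}(b_1) + 20^2\,\widehat{\mathrm{var}}(b_2) + 2 \times 20 \times \widehat{\mathrm{cov}}(b_1, b_2) = 1884.442 + 20^2 \times 4.3818 + 2 \times 20 \times (-85.9032) = 201.0169$$

and the corresponding standard error is:

$$\mathrm{se}(b_1 + 20 b_2) = \sqrt{\widehat{\mathrm{var}}(b_1 + 20 b_2)} = \sqrt{201.0169} = 14.1780$$

The 95% interval is then:

$$(b_1 + 20 b_2) \pm t_{(0.975,\,38)}\,\mathrm{se}(b_1 + 20 b_2)$$

or

$$[287.6089 - 2.024(14.1780),\ 287.6089 + 2.024(14.1780)] = [258.91,\ 316.31]$$

Based on the estimate, we are 95% confident that the expected food expenditure by a household with $2,000 income is between $258.91 and $316.31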
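The interval for expected food expenditure can be recomputed from the covariance table. This is a minimal sketch using the table entries, the point estimate 287.6089 and the critical value 2.024 quoted in the slides; the variable names are our own. Because the table entries are rounded, the computed variance differs from the reported 201.0169 in the second decimal place, but the interval agrees to the cent.

```python
import math

# Estimated variances and covariance of b1 (C) and b2 (Income), from the table
var_b1, var_b2, cov_b1_b2 = 1884.442, 4.3818, -85.9032

c1, c2 = 1.0, 20.0            # lambda = beta1 + 20*beta2, income of $2,000
point_estimate = 287.6089     # b1 + 20*b2 as reported in the slides
tc = 2.024                    # t(0.975, 38)

# var-hat(c1*b1 + c2*b2) = c1^2 var(b1) + c2^2 var(b2) + 2 c1 c2 cov(b1, b2)
var_hat = c1 ** 2 * var_b1 + c2 ** 2 * var_b2 + 2 * c1 * c2 * cov_b1_b2
se_hat = math.sqrt(var_hat)

lower = point_estimate - tc * se_hat
upper = point_estimate + tc * se_hat
print(f"estimated variance: {var_hat:.4f}, standard error: {se_hat:.4f}")
print(f"95% interval: [{lower:.2f}, {upper:.2f}]")
```

Note that the negative covariance between b1 and b2 removes most of the variance contributed by the two individual terms, which is why the interval is far narrower than the variance of b1 alone would suggest.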

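A general linear hypothesis involves both parameters, β1 and β2, and may be stated as:

$$H_0: c_1\beta_1 + c_2\beta_2 = c_0$$

or, equivalently,

$$H_0: (c_1\beta_1 + c_2\beta_2) - c_0 = 0$$

The alternative hypothesis might be any one of the following:

(i) H1: c1β1 + c2β2 ≠ c0, leading to a two-tail test

(ii) H1: c1β1 + c2β2 > c0, leading to a right-tail test

(iii) H1: c1β1 + c2β2 < c0, leading to a left-tail test

The t-statistic is:

$$t = \frac{(c_1 b_1 + c_2 b_2) - c_0}{\mathrm{se}(c_1 b_1 + c_2 b_2)} \sim t_{(N-2)}$$

if the null hypothesis is true

- The rejection regions for the one- and two-tail alternatives (i) – (iii) are the same as those described above, and the conclusions are interpreted in the same way.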
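As a hedged illustration of a general linear hypothesis test, the sketch below reuses the food expenditure quantities for c1 = 1 and c2 = 20 (point estimate 287.6089, standard error 14.1780) and tests against a right-tail alternative. The hypothesized value c0 = 250 is a hypothetical number chosen only for this example; it does not come from the slides. The critical value 1.686 is t(0.95, 38) as quoted earlier.

```python
point_estimate = 287.6089   # c1*b1 + c2*b2 with c1 = 1, c2 = 20 (from the slides)
se_hat = 14.1780            # se(b1 + 20*b2) (from the slides)
c0 = 250.0                  # hypothetical value: H0: beta1 + 20*beta2 <= 250
t_crit = 1.686              # t(0.95, 38), right-tail test at alpha = 0.05

# t = ((c1*b1 + c2*b2) - c0) / se(c1*b1 + c2*b2)
t_stat = (point_estimate - c0) / se_hat
reject = t_stat >= t_crit   # rejection region for alternative (ii), right-tail

print(f"t = {t_stat:.2f}")
print("reject H0" if reject else "do not reject H0")
```

With this hypothetical c0, the statistic falls in the right-tail rejection region, so the data would support the claim that expected expenditure at an income of $2,000 exceeds $250.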

Equation summary

Interval estimation with known error variance:

$$b_2 \sim N\left(\beta_2,\ \frac{\sigma^2}{\sum (x_i - \bar{x})^2}\right), \qquad Z = \frac{b_2 - \beta_2}{\sqrt{\sigma^2 / \sum (x_i - \bar{x})^2}} \sim N(0, 1)$$

$$P(-1.96 \le Z \le 1.96) = 0.95$$

$$P\left(b_2 - 1.96\sqrt{\sigma^2 / \sum (x_i - \bar{x})^2} \le \beta_2 \le b_2 + 1.96\sqrt{\sigma^2 / \sum (x_i - \bar{x})^2}\right) = 0.95$$

$$b_2 \pm 1.96\sqrt{\sigma^2 / \sum (x_i - \bar{x})^2}$$

Interval estimation with estimated error variance ($\sigma^2$ replaced by $\hat{\sigma}^2$):

$$t = \frac{b_2 - \beta_2}{\sqrt{\hat{\sigma}^2 / \sum (x_i - \bar{x})^2}} = \frac{b_2 - \beta_2}{\sqrt{\widehat{\mathrm{var}}(b_2)}} = \frac{b_2 - \beta_2}{\mathrm{se}(b_2)} \sim t_{(N-2)}$$

$$t = \frac{b_k - \beta_k}{\mathrm{se}(b_k)} \sim t_{(N-2)} \quad \text{for } k = 1, 2$$

$$P(t \ge t_c) = P(t \le -t_c) = \frac{\alpha}{2}, \qquad P(-t_c \le t \le t_c) = 1 - \alpha$$

$$P\left[b_k - t_c\,\mathrm{se}(b_k) \le \beta_k \le b_k + t_c\,\mathrm{se}(b_k)\right] = 1 - \alpha$$

Food expenditure interval:

$$\mathrm{se}(b_2) = \sqrt{\widehat{\mathrm{var}}(b_2)} = \sqrt{4.38} = 2.09$$

$$b_2 \pm t_c\,\mathrm{se}(b_2) = 10.21 \pm 2.024(2.09) = [5.97, 14.45]$$

Test statistic under H0: βk = c, equation (3.7):

$$t = \frac{b_k - c}{\mathrm{se}(b_k)} \sim t_{(N-2)}$$

Food expenditure test statistics:

$$t = \frac{10.21}{2.09} = 4.88, \qquad t = \frac{10.21 - 5.5}{2.09} = 2.25, \qquad t = \frac{10.21 - 15}{2.09} = -2.29, \qquad t = \frac{10.21 - 7.5}{2.09} = 1.29$$

Food expenditure p-values:

$$p = P\left[t_{(38)} \ge 2.25\right] = 1 - 0.9848 = 0.0152$$

$$p = P\left[t_{(38)} \le -2.29\right] = 0.0139$$

$$p = P\left[t_{(38)} \ge 1.29\right] + P\left[t_{(38)} \le -1.29\right] = 0.2033$$

Linear combinations of parameters:

$$\lambda = c_1\beta_1 + c_2\beta_2 = \beta_1 + x_0\beta_2 = E(y \mid x = x_0) \quad \text{for } c_1 = 1,\ c_2 = x_0$$

$$E(\hat{\lambda}) = E(c_1 b_1 + c_2 b_2) = c_1 E(b_1) + c_2 E(b_2) = c_1\beta_1 + c_2\beta_2 = \lambda$$

$$\mathrm{var}(\hat{\lambda}) = c_1^2\,\mathrm{var}(b_1) + c_2^2\,\mathrm{var}(b_2) + 2 c_1 c_2\,\mathrm{cov}(b_1, b_2)$$

$$\hat{\lambda} = c_1 b_1 + c_2 b_2 \sim N\left[\lambda,\ \mathrm{var}(\hat{\lambda})\right]$$

$$\mathrm{se}(\hat{\lambda}) = \sqrt{\widehat{\mathrm{var}}(c_1 b_1 + c_2 b_2)}, \qquad t = \frac{\hat{\lambda} - \lambda}{\mathrm{se}(\hat{\lambda})} \sim t_{(N-2)}$$

$$(c_1 b_1 + c_2 b_2) \pm t_c\,\mathrm{se}(c_1 b_1 + c_2 b_2)$$

Expected food expenditure at an income of $2,000:

$$\widehat{\mathrm{var}}(b_1 + 20 b_2) = 1884.442 + 20^2 \times 4.3818 + 2 \times 20 \times (-85.9032) = 201.0169$$

$$\mathrm{se}(b_1 + 20 b_2) = \sqrt{201.0169} = 14.1780$$

$$[287.6089 - 2.024(14.1780),\ 287.6089 + 2.024(14.1780)] = [258.91,\ 316.31]$$

General linear hypothesis:

$$H_0: c_1\beta_1 + c_2\beta_2 = c_0, \qquad t = \frac{(c_1 b_1 + c_2 b_2) - c_0}{\mathrm{se}(c_1 b_1 + c_2 b_2)} \sim t_{(N-2)}$$