Methods of Economic Research Assignment


School of Economics

ECO-2A06 Methods of Economic Research Assignment

Question 1 is worth 50% of the total mark; Question 2 is worth 50%.

Question 1 (50%)

The file CIGS 2014.sav contains data collected from a sample of 807 individuals.

The variables are:

CIGS: The average number of cigarettes smoked per day

EDUC: Years of schooling

AGE: Age in years

WHITE: One if individual is white; zero if non-white

INCOME: Annual income in pounds

(a) Describe the 500th individual in the sample. [5 marks]

(b) Estimate the following model of daily cigarette consumption:

Report the results. Indicate the level of significance using stars (** denotes strong significance, p-value < 0.01; * denotes significance, p-value < 0.05).

[5 marks]
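The star convention above can be sketched as a small helper. This is only an illustration: the function name `significance_stars` and the p-values in the loop are hypothetical and are not estimates from CIGS 2014.sav.

```python
def significance_stars(p_value: float) -> str:
    """Return the star label for a p-value under the convention stated above."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie between 0 and 1")
    if p_value < 0.01:
        return "**"  # strong significance
    if p_value < 0.05:
        return "*"   # significance
    return ""        # not significant at the 5% level


# Hypothetical p-values, for illustration only.
for p in (0.003, 0.02, 0.30):
    print(f"p = {p:.3f} -> '{significance_stars(p)}'")
```

Note that both thresholds are strict inequalities, so a p-value of exactly 0.05 receives no star.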

(c) Evaluate the explanatory power of the model. [5 marks]
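One common way to approach explanatory power is to compare R-squared with adjusted R-squared, which penalises additional regressors. The sketch below assumes the model in (b) has five regressors (EDUC, AGE, AGE squared, INCOME and WHITE) and uses all 807 observations; the R-squared value passed in is hypothetical, not a result from the data.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_regressors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    dof = n_obs - n_regressors - 1  # residual degrees of freedom
    if dof <= 0:
        raise ValueError("need more observations than estimated parameters")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / dof


# Hypothetical R^2 of 0.05 with n = 807 and k = 5 (assumed from the setup above).
print(f"{adjusted_r_squared(0.05, 807, 5):.4f}")
```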

(d) Is there evidence of a problem of heteroskedasticity in this model? Assume that income is the variable that is “causing” the problem of heteroskedasticity.

[5 marks]
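One possible formal approach, given that INCOME is assumed to drive the heteroskedasticity, is a Breusch-Pagan style auxiliary regression of the squared OLS residuals on INCOME. This is a sketch of one option only; the module may intend a different test (for example Goldfeld-Quandt or Park), and the symbols \delta_0, \delta_1 and v_i are notation introduced here.

```latex
\hat{u}_i^2 = \delta_0 + \delta_1 \mathit{INCOME}_i + v_i,
\qquad H_0 : \delta_1 = 0 \ \text{(homoskedasticity)},
\qquad LM = n \, R^2_{\text{aux}} \ \overset{a}{\sim} \ \chi^2_1 \ \text{under } H_0
```

Here n = 807 and R^2_aux is the R-squared of the auxiliary regression; a large LM value relative to the chi-squared critical value with one degree of freedom would count as evidence against homoskedasticity.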

(e) Does the presence of heteroskedasticity violate one of the Gauss-Markov assumptions? Explain your answer. What implications would heteroskedasticity have for the results you reported in (b)? [6 marks]

(f) Estimate the following model:

Report the results. Indicate the level of significance using stars (** denotes strong significance, p-value < 0.01; * denotes significance, p-value < 0.05).

[5 marks]

(g) Interpret the estimated coefficient of ln(INCOME). Suggest a reason why this variable does not appear to be important in explaining variation in daily cigarette consumption. [5 marks]
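As a reminder of how a coefficient on a logged regressor is read when the dependent variable is in levels, the relation below may help. It writes \beta_4 for the coefficient on ln(INCOME), a labelling assumed here for illustration.

```latex
\Delta \mathit{CIGS} \approx \beta_4 \, \Delta \ln(\mathit{INCOME})
\approx \frac{\beta_4}{100} \times \% \Delta \mathit{INCOME}
```

That is, holding the other regressors fixed, a 1% rise in income is associated with a change of roughly \beta_4 / 100 cigarettes per day. The approximation is best for small percentage changes.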

(h) Interpret the estimated coefficient of the ‘WHITE’ dummy variable. Using an appropriate test, determine whether there is evidence that race is important in explaining variation in daily cigarette consumption. [8 marks]
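A sketch of the usual reading of a dummy coefficient and of a t-test on it follows. It writes \beta_5 for the coefficient on WHITE and x for the remaining regressors (both labels assumed here), and takes the model to have five regressors estimated on all 807 observations.

```latex
\beta_5 = E[\mathit{CIGS} \mid \mathit{WHITE} = 1, x] - E[\mathit{CIGS} \mid \mathit{WHITE} = 0, x],
\qquad
t = \frac{\hat{\beta}_5}{se(\hat{\beta}_5)} \sim t_{n-k-1}, \quad n - k - 1 = 807 - 5 - 1 = 801
\ \ \text{under } H_0 : \beta_5 = 0
```

With 801 degrees of freedom the t distribution is very close to the standard normal, so the two-sided 5% critical value is approximately 1.96.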

(i) Compare this model to the model you estimated in (b). Which is your preferred model? Explain your answer. [2 marks]

(j) Using the model specification in part (b), estimate a weighted regression model. Report the PASW results. Why are the results of this model preferable to those of part (b)? [4 marks]
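One way to see what a weighted regression does is to write out the transformed model. The sketch below assumes, purely for illustration, that the error variance is proportional to income, Var(u_i | INCOME_i) = \sigma^2 INCOME_i; the assignment does not state this functional form, and a different form would imply different weights.

```latex
\frac{\mathit{CIGS}_i}{\sqrt{\mathit{INCOME}_i}}
= \beta_0 \frac{1}{\sqrt{\mathit{INCOME}_i}}
+ \beta_1 \frac{\mathit{EDUC}_i}{\sqrt{\mathit{INCOME}_i}}
+ \dots
+ \beta_5 \frac{\mathit{WHITE}_i}{\sqrt{\mathit{INCOME}_i}}
+ \frac{u_i}{\sqrt{\mathit{INCOME}_i}},
\qquad
Var\!\left( \frac{u_i}{\sqrt{\mathit{INCOME}_i}} \right) = \sigma^2
```

Under that assumption, OLS on the transformed variables is equivalent to weighted least squares with weights 1 / INCOME_i, and the transformed error has constant variance.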

Question 2 (50%)

Nitrogen dioxide (NO2) is a pollutant that attacks the human respiratory system and increases the likelihood of respiratory illness. One common source of nitrogen dioxide is car exhaust.

The file POLLUTION 2014.sav contains data from 500 observations made from October 2001 to August 2003 in the US (data from Carnegie Mellon University archive).

The variables are:

LNO2: Natural log of the concentration of NO2 (particles)

LCARS: Natural log of the number of cars per hour

TEMP: Temperature 2 metres above the ground (degrees C)

TCHNG23: Temperature difference between 25 metres and 2 metres above the ground (degrees C)

WNDSPD: Wind speed (metres per second)

WNDDIR: Wind direction (degrees between 0 and 360)

HOUR: Hour of day

DAYS: Number of the day in the sequence of 500 days

(a) Estimate the following model:

Report the results. Indicate the level of significance using stars (** denotes strong significance p<0.01 and * denotes significance p<0.05). [5 marks]

(b) Interpret the coefficients of the explanatory variables and conduct appropriate tests of their individual significance. (Note that there is a non-linear relationship between wind direction and the natural log of the concentration of NO2).

[35 marks]
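The relations below sketch how coefficients are typically read when the dependent variable is a natural log, and how a quadratic term changes the marginal effect. The labels are assumptions made here for illustration: \beta_1 for LCARS, \beta_2 for TEMP, and \beta_5 and \beta_6 for WNDDIR and its square.

```latex
\% \Delta NO2 \approx \beta_1 \times \% \Delta \mathit{CARS} \quad \text{(log-log: an elasticity)},
\qquad
\% \Delta NO2 \approx 100 \, \beta_2 \times \Delta \mathit{TEMP} \quad \text{(log-level)}

\frac{\partial \mathit{LNO2}}{\partial \mathit{WNDDIR}} = \beta_5 + 2 \beta_6 \, \mathit{WNDDIR},
\qquad
\mathit{WNDDIR}^{*} = -\frac{\beta_5}{2 \beta_6}
```

The effect of wind direction therefore depends on the direction itself, and WNDDIR* is the turning point of the quadratic (a maximum if \beta_6 < 0, a minimum if \beta_6 > 0).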

(c) Do you detect evidence that the two temperature variables jointly affect pollution? [4 marks]
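A joint hypothesis of this kind is commonly tested with an F statistic that compares the residual sum of squares of the full model with that of a restricted model omitting TEMP and TCHNG23. The sketch below assumes six regressors and all 500 observations, giving 493 residual degrees of freedom; the function name and the RSS values are hypothetical.

```python
def joint_f_statistic(rss_restricted: float, rss_unrestricted: float,
                      n_restrictions: int, df_unrestricted: int) -> float:
    """F = ((RSS_r - RSS_u) / q) / (RSS_u / (n - k - 1))."""
    if rss_unrestricted <= 0 or n_restrictions <= 0 or df_unrestricted <= 0:
        raise ValueError("RSS_u, q and the degrees of freedom must be positive")
    numerator = (rss_restricted - rss_unrestricted) / n_restrictions
    denominator = rss_unrestricted / df_unrestricted
    return numerator / denominator


# Hypothetical RSS values, for illustration only; q = 2 restrictions,
# df = 500 - 6 - 1 = 493 under the assumptions stated above.
f_stat = joint_f_statistic(110.0, 100.0, n_restrictions=2, df_unrestricted=493)
print(f"F = {f_stat:.2f}")
```

The resulting statistic would be compared with the critical value of the F distribution with (2, 493) degrees of freedom.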

(d) Using the model you estimated in (a), conduct a formal test for serial correlation. If serial correlation were detected, what implications would it have for your results? [6 marks]
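One widely used statistic for first-order serial correlation is the Durbin-Watson d, sketched below in plain Python. This is one option only (the module may intend another test, such as Breusch-Godfrey), it presumes the residuals are ordered in time, and the residual series in the example are made up to show the two extremes.

```python
from typing import Sequence


def durbin_watson(residuals: Sequence[float]) -> float:
    """d = sum_{t=2..T} (e_t - e_{t-1})^2 / sum_{t=1..T} e_t^2."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("residuals are all zero")
    return numerator / denominator


# Made-up residuals: alternating signs push d above 2, constant residuals give 0.
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))
print(durbin_watson([1.0, 1.0, 1.0, 1.0]))
```

Values of d near 2 suggest no first-order serial correlation, values towards 0 suggest positive correlation, and values towards 4 suggest negative correlation.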

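Model for Question 1(f):

$$\mathit{CIGS}_i = \beta_0 + \beta_1 \mathit{EDUC}_i + \beta_2 \mathit{AGE}_i + \beta_3 \mathit{AGE}_i^2 + \beta_4 \ln(\mathit{INCOME}_i) + \beta_5 \mathit{WHITE}_i + u_i$$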

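Model for Question 2(a):

$$\mathit{LNO2}_t = \beta_0 + \beta_1 \mathit{LCARS}_t + \beta_2 \mathit{TEMP}_t + \beta_3 \mathit{TCHNG23}_t + \beta_4 \mathit{WNDSPD}_t + \beta_5 \mathit{WNDDIR}_t + \beta_6 \mathit{WNDDIR}_t^2 + u_t$$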

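Model for Question 1(b):

$$\mathit{CIGS}_i = \beta_0 + \beta_1 \mathit{EDUC}_i + \beta_2 \mathit{AGE}_i + \beta_3 \mathit{AGE}_i^2 + \beta_4 \mathit{INCOME}_i + \beta_5 \mathit{WHITE}_i + u_i$$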