Quantitative Methods and Econometrics
J. Edward Taylor Winter 2018
Carbon dioxide (CO2) emissions are widely believed to be a driver of global climate change. In this problem set you will use cross-section data to test what drives countries’ “carbon footprints,” that is, their CO2 emissions. Is it population, or is income the bigger culprit?
The data set “CO2 by country 2010 sh W18” contains data on a sample of countries’ CO2 emissions, in kilotons; population, in millions; and gross national income (GNI), in millions of US dollars, for the year 2010.
1. Please propose a linear regression model to estimate the effect of national income on predicted CO2 emissions. Propose an economic theory to justify this model of CO2 emissions as a function of income, and explain what the parameters and variables in this model represent.
-To estimate the effect of national income on CO2 emissions, I use a simple regression of CO2 emissions on GNI. My economic theory is that CO2 emissions increase as a country's income increases, because producing and consuming more goods and services uses more energy. In other words, I expect a positive relationship between CO2 emissions and GNI. Y represents CO2 emissions (kilotons) and X1 represents GNI (millions of US dollars). The parameters are b0 and b1: b0 is the Y-intercept of the regression line and b1 is its slope. The error term e captures everything else that affects emissions.
Y=b0+b1*X1+e
2. Now estimate this model in Excel, like we did in class, using ordinary least squares. Report and interpret your estimated parameters here. Specifically, what does each parameter estimate tell us?
Simple Regression:
-B1 is the change in the Y variable associated with a 1-unit increase in the X variable. B1 = .4429: if GNI goes up by $1 million, predicted CO2 emissions increase by .44 kilotons. B0 is what you would expect emissions to be if a country had zero national income: B0 = 12186.31554 kilotons.
**Work
B0 = Ybar - B1*Xbar1 = 176083.03 - (B1 * 370007.74) = 12186.32 (with B1 unrounded)
B1 = summation(x1*y)/summation(x1^2) = 143514822512285/323994254211664 = .4429 (x1 and y are deviations from their means)
Y=b0+b1*X1+e
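The deviation-form formulas used in the work above can be sketched in Python as a cross-check on the spreadsheet. This is a minimal sketch: the function name `simple_ols` and the four data points are made up for illustration and are not the course data set.

```python
def simple_ols(x, y):
    """Return (b0, b1) for y = b0 + b1*x + e, using deviations from means."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares, both in deviation form
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx           # slope
    b0 = y_bar - b1 * x_bar    # intercept: the line passes through the means
    return b0, b1


# Hypothetical illustrative numbers, NOT the CO2 data set
gni = [100.0, 200.0, 300.0, 400.0]
co2 = [60.0, 100.0, 150.0, 190.0]
b0, b1 = simple_ols(gni, co2)
print(b0, b1)
```

Replacing the two lists with the GNI and CO2 columns from the spreadsheet should give the same b0 and b1 as the Excel work, up to rounding.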
3. What is the estimated elasticity of CO2 emissions with respect to GNI? (Always evaluate elasticities at the means of the two variables, unless we tell you otherwise.)
Elasticity of simple regression for X1 (GNI) = b1*(X bar/Y bar) = .443*(370007.74/176083.03) = .93079
For each 1 percent increase in GNI, there is a .93 percent increase in CO2 emissions.
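The elasticity at the means can be reproduced with a few lines of Python. This is a sketch using the rounded slope and the sample means quoted in this answer; the function name `elasticity_at_means` is an assumption made for illustration, and small differences from the reported .93079 presumably come from rounding the slope.

```python
def elasticity_at_means(slope, x_bar, y_bar):
    """Elasticity of y with respect to x, evaluated at the sample means."""
    return slope * x_bar / y_bar


b1 = 0.4429            # simple-regression slope (rounded)
gni_bar = 370007.74    # mean GNI, millions of US dollars
co2_bar = 176083.03    # mean CO2 emissions, kilotons

e_gni = elasticity_at_means(b1, gni_bar, co2_bar)
print(round(e_gni, 3))
```

The same function gives the population elasticity when it is called with the multiple-regression b2 and the population mean.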
4. Propose an economic theory to justify adding the population variable to your regression model, briefly describing the theory and its assumptions and showing us what your multiple regression model looks like.
I propose an economic theory stating that, in addition to income, CO2 emissions increase as a country's population increases. This extra variable will further help predict each country's CO2 emissions. As population grows, there are more people producing income and producing CO2. The theory also assumes that, at a given level of income, more people means more driving and more of the other activities that emit CO2.
CO2=b0+b1GNI+b2POP+ei
Y=b0+b1*X1+b2*X2+ei
Y=CO2, X1=GNI, X2=Pop
Parameters: b0,b1,b2
5. Now expand your Excel spreadsheet and use OLS to estimate your multiple regression model. Report and interpret your results.
Multiple Regression:
-B0: -52681.12849
B0 is what you would expect CO2 emissions to be if population and GNI were both zero. A negative value clearly isn't realistic, so the interpretation doesn't make economic sense. This is likely because no country in the sample has a GNI and population near zero, so the intercept is an extrapolation outside the range of the data and does not have much meaning here.
-B1: 0.29651
B1 is the change in the Y variable associated with a 1-unit increase in the X1 variable, holding X2 constant. With a $1 million increase in GNI, holding population constant, CO2 emissions increase by .29651 kilotons.
-B2: 3127.3655
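B2 is the change in the Y variable associated with a 1-unit increase in the X2 variable, holding X1 constant.
With a 1 million increase in population, holding GNI constant, CO2 emissions increase by 3127.3655 kilotons.
Per unit, population has the larger coefficient, but GNI and population are measured in different units, so the elasticities are the better basis for comparing the two effects.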
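As a cross-check on the Excel work, the multiple regression can be sketched with NumPy's least-squares solver. The data below are synthetic, generated from known coefficients so the result is easy to verify; they are not the course data, and the variable names are assumptions made for illustration.

```python
import numpy as np

# Synthetic regressors, NOT the CO2 data set
gni = np.array([100.0, 200.0, 300.0, 400.0, 500.0])
pop = np.array([10.0, 5.0, 20.0, 15.0, 30.0])

# Build CO2 from known parameters so OLS should recover them exactly
true_b0, true_b1, true_b2 = 5.0, 0.3, 2.0
co2 = true_b0 + true_b1 * gni + true_b2 * pop

# Design matrix: a column of ones for the intercept, then X1 and X2
X = np.column_stack([np.ones_like(gni), gni, pop])
coef, _, _, _ = np.linalg.lstsq(X, co2, rcond=None)
b0, b1, b2 = coef
print(b0, b1, b2)
```

With the real GNI, population, and CO2 columns in place of the synthetic arrays, `coef` should match the B0, B1, and B2 reported above, up to rounding.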
6. What is the estimated elasticity of CO2 emissions with respect to population?
-With respect to population, the estimated elasticity of CO2 emissions is .6761. To calculate this, I multiplied the b2 from the multiple regression by (X2 bar/Y bar): 3127.37*(38.07/176083.03). This result indicates that a 1% increase in population, holding GNI constant, is associated with a 0.6761% increase in CO2 emissions.
7. Does the inclusion of population in your regression model affect your estimated elasticity of CO2 emissions with respect to income? Why or why not?
Elasticity of Simple regression for X1(GNI)= b1*(X bar/Y bar)= .93079
Elasticity of Multiple regression for X1(GNI)= b1(Multiple REG)*(X bar/Y bar)= .62308
CO2=b0+b1GNI+ei
CO2=b0+b1GNI+b2POP+ei
In multiple regression, B1 will change if population is correlated with GNI in the sample (positively or negatively), that is, if Sum[(GNIi - GNI bar)(POPi - POP bar)] is not equal to zero.
They are different. B1 changes in the multiple regression, which causes the elasticity for X1 to change. Because population is correlated with GNI, adding the population variable changes the coefficient b1 on GNI from the simple to the multiple regression. When GNI is the only variable, it absorbs some of the effect of population. When both variables are included, the estimated effect of GNI decreases because part of the effect attributed to GNI in the simple regression actually belongs to population.
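The link between the two slopes can be made concrete. For OLS, the simple-regression slope equals the multiple-regression b1 plus b2 times the slope from regressing population on GNI. The sketch below checks that identity on made-up numbers (not the course data); the helper `slope` and all variable names are assumptions made for illustration.

```python
import numpy as np

# Hypothetical data in which GNI and population are positively correlated
gni = np.array([100.0, 200.0, 300.0, 400.0, 500.0])
pop = np.array([10.0, 5.0, 20.0, 15.0, 30.0])
co2 = np.array([55.0, 80.0, 140.0, 150.0, 220.0])


def slope(x, y):
    """Simple-regression slope of y on x, in deviation form."""
    xd, yd = x - x.mean(), y - y.mean()
    return (xd @ yd) / (xd @ xd)


b1_simple = slope(gni, co2)

# Multiple regression of CO2 on a constant, GNI, and population
X = np.column_stack([np.ones_like(gni), gni, pop])
b0_m, b1_m, b2_m = np.linalg.lstsq(X, co2, rcond=None)[0]

# Slope from regressing population on GNI
delta = slope(gni, pop)

print(b1_simple, b1_m + b2_m * delta)
```

If `delta` were zero (no sample correlation between GNI and population), the two b1 estimates would coincide, which is the condition stated above.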
8. Based on your findings, what would you conclude is the main driver of countries’ carbon footprints—population or income? Please explain.
-My conclusion is based on the elasticities from the multiple regression:
Elasticity of Multiple regression for X1(GNI)= b1(Multiple REG)*(X bar/Y bar)= .62308
Elasticity of Multiple regression for X2(POP)= b2(Multiple REG)*(X bar/Y bar)= .67610
.6761 (X2) >.6231(X1)
**Also look at #5
The variable X2 (population) is the bigger driver of countries' carbon footprints, although the two elasticities are close.
9. Based on your regression model, do income and population explain the difference in observed CO2 emissions between China and the United States? Explain.
The question is how well the right-hand variables predict the difference in CO2 emissions between China and the US.
a = Yi(China) - Yi(USA)
b = [b0+b1GNI(China)+b2POP(China)] - [b0+b1GNI(USA)+b2POP(USA)]
b/a = proportion of the difference in CO2 between China and the USA that the X variables explain
a = 8286891.95 - 5433056.54 = 2853835.41
b = 468673.2621
**Work on Excel
b/a = .1642
This shows that the right-hand variables, income and population, explain only about 16% of the difference in observed CO2 emissions between China and the United States, so the model leaves most of the gap unexplained.
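The share calculation can be reproduced from the figures quoted in this answer. This is a minimal sketch; the function name `explained_share` is an assumption made for illustration, and the predicted gap is taken from the Excel work rather than recomputed here.

```python
def explained_share(predicted_gap, actual_gap):
    """Share of the observed difference that the model's prediction accounts for."""
    return predicted_gap / actual_gap


# Observed CO2 emissions (kilotons): China minus USA
actual_gap = 8286891.95 - 5433056.54
# Difference in predicted emissions from the multiple regression (from Excel)
predicted_gap = 468673.2621

share = explained_share(predicted_gap, actual_gap)
print(round(share, 4))
```

Because both predictions share the same b0, the intercept cancels, and the predicted gap depends only on b1 and b2 times the differences in GNI and population.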
10. Compare the R-squared from the simple and multiple regression. Are they different? If so, what would explain the difference?
R-Squared (Simple Regression) = 1-(sum of ehat^2/sum of y^2) = 1-(41208274131205.7/104778859143114) = .60671
R-Squared (Multiple Regression) = 1-(sum of ehat^2/sum of y^2) = 1-(14951485335194.1/104778859143114) = .85730
R-Squared = the proportion of the variation in the y variable that can be explained by the x variables. R^2=1-(SSE/TSS) = Proportion that you can predict from model. (SSE/TSS) =What can’t be explained by model.
The R-squared values for simple and multiple regression are different. The multiple regression R^2 is higher because with both variables (X1, X2) predicted CO2 emissions are closer to the actual values. R^2 never decreases when a variable is added, because the extra variable can only reduce (or leave unchanged) the sum of squared errors.
In the simple regression, GNI explains 61% of the variation in CO2 emissions. In the multiple regression, GNI and population together explain 86% of the variation in CO2 emissions.
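The two R-squared values can be checked directly from the sums of squares quoted in this answer. In this sketch the function name `r_squared` is an assumption made for illustration, and each SSE is paired with the model whose reported R-squared it reproduces.

```python
def r_squared(sse, tss):
    """R-squared = 1 - SSE/TSS, the share of variation in y explained by the model."""
    return 1.0 - sse / tss


tss = 104778859143114.0          # total sum of squares of CO2 (deviation form)
sse_simple = 41208274131205.7    # residual sum of squares, CO2 on GNI
sse_multiple = 14951485335194.1  # residual sum of squares, CO2 on GNI and population

r2_simple = r_squared(sse_simple, tss)
r2_multiple = r_squared(sse_multiple, tss)
print(round(r2_simple, 5), round(r2_multiple, 5))
```

The smaller SSE belongs to the multiple regression, which is consistent with adding population lowering the unexplained variation.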