Mid-Term Exam – In Class Part

This exam is worth a total of 101 points. The questions have the following points:

Questions 1–4, 6, 9, 11–17, and 24 are worth 2 points each. (28 points)

Questions 5, 7, 8, 10, 18, 19, 21–23, 25, and 28–30 are worth 3 points each. (39 points)

Questions 26 and 27 are worth 4 points each. (8 points)

Question 20 is worth 5 points. (5 points)

Question 31 is worth 8 points, each part worth 2 points. (8 points)

Question 32 is worth 5 points. (5 points)

Question 33 is worth 8 points, each part worth 2 points. (8 points)

1. Which of the following are fundamental challenges when trying to infer causality from observational data?

a. Exogeneity

b. Endogeneity

c. Correlation

d. Math

2. An independent variable is endogenous if:

a. It is correlated with another independent variable in the model.

b. It is correlated with the error term.

c. It is not correlated with the error term.

d. It is correlated with Y.

3. A data set that consists of a sample of individuals, households, firms, cities, states, countries, or a variety of other units, taken at a given point in time, is called a(n) _____.

a. cross-sectional data set

b. longitudinal data set

c. time series data set

d. experimental data set

4. An estimator of a parameter is said to be unbiased if:

a. The estimator is normally distributed.

b. The coefficient distribution is narrow.

c. The expected value of the distribution of the estimator is equal to the true value.

d. The coefficient distribution is wide.

5. Briefly describe what it means for an estimator to be consistent for the parameter it estimates.

6. Name the concept: The variance of the error term is the same for every observation.

a. Homoscedasticity

b. Heteroscedasticity

c. Consistency

d. Bias

7. Under the usual assumptions, describe the properties of the OLS coefficient estimator in the presence of heteroscedasticity.

8. What are the consequences of using heteroscedasticity-consistent standard errors when applying OLS estimation in multiple regression?

9. Consider the equation . Write out the following:

The population regression function: ___________________________________.

The sample regression function: _____________________________________.

10. Considering question 9, what is the purpose of the sample regression function?

11. A null and alternative hypothesis are statements pertaining to:

a. Sample parameters

b. Sample statistics

c. Population parameters

d. It depends – In some cases it is population parameters, in others it is the sample statistics.

12. True or False: Type I errors occur when we fail to reject a null hypothesis even when it is false.

13. True or False: Type II errors occur when we fail to reject a null hypothesis even when it is false.

14. The power of a test is ___________________________________________________.

15. True or False: For a given amount of Type I error we want to minimize the power of the test.

16. True or False: By adding more independent variables into our OLS model, we have a greater chance of getting rid of the endogeneity that exists within the error term.

17. In a case where there is multicollinearity in the model:

a. Independent variables have strong linear relationships with each other.

b. The variance of the estimates increases.

c. Multicollinearity will lead to bias.

d. Both A and B

e. Both A and C

18. What are the consequences of multicollinearity as it relates to OLS estimation?

19. What is a popular measure of multicollinearity and when does that measure indicate that multicollinearity is strongly present in the explanatory variables of your regression?

20. Fill in the blanks of the following ANOVA table:

Source        SS       DF       MS       F       P-Value
Regression    24       3        _____    _____   0.02
Error         _____    _____    _____
Total         86       34
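The blanks in a table of this kind follow from the standard ANOVA identities: SS(Error) = SS(Total) - SS(Regression), DF(Error) = DF(Total) - DF(Regression), MS = SS / DF, and F = MS(Regression) / MS(Error). The following is a minimal sketch of that arithmetic; the function name `complete_anova` and the dictionary keys are chosen here for illustration only.

```python
def complete_anova(ss_reg, df_reg, ss_total, df_total):
    """Fill in the missing entries of a regression ANOVA table.

    Assumes the usual decomposition: total = regression + error,
    for both the sums of squares and the degrees of freedom.
    """
    ss_err = ss_total - ss_reg   # SS(Error)
    df_err = df_total - df_reg   # DF(Error)
    ms_reg = ss_reg / df_reg     # MS(Regression) = SS / DF
    ms_err = ss_err / df_err     # MS(Error) = SS / DF
    f_stat = ms_reg / ms_err     # F = MS(Regression) / MS(Error)
    return {
        "ss_err": ss_err,
        "df_err": df_err,
        "ms_reg": ms_reg,
        "ms_err": ms_err,
        "f_stat": f_stat,
    }


# Known entries taken from the table in question 20.
anova = complete_anova(ss_reg=24, df_reg=3, ss_total=86, df_total=34)
for name, value in anova.items():
    print(f"{name}: {value}")
```

The p-value column is not recomputed here; checking it would require the F distribution with (DF Regression, DF Error) degrees of freedom, for example from a statistics library.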

21. What is the purpose of the table in question 20? Be specific in your answer.

22. Assume that the true model is . However, suppose that you estimate the model . What are the properties of the ordinary least squares estimators of and in the second equation?

23. Assume that the true model is . However, suppose that you estimate the model . Furthermore, suppose that > 0 and . Given this information, describe the properties of the ordinary least squares estimator of in the model that you estimated.

24. If the measurement error is in the independent variable, then

a. We don’t need to worry about bias, the measurement error will be reflected in the error term.

b. The bigger the measurement error, the bigger the variance of the error term.

c. We will have a case of attenuation bias, where the coefficient will be closer to 0 than it should be.

d. We will have a case of attenuation bias, where the coefficient is larger than it should be.

25. Suppose that you have the regression model and that you want to test . Furthermore, assume n = 100. This is called a ____________ F-test. In testing this null hypothesis, the numerator degrees of freedom is ________ while the denominator degrees of freedom is ___________.

26. Consider the regression model of question 25 above. Suppose the null hypothesis is instead . In testing this null hypothesis, the numerator degrees of freedom is ________ while the denominator degrees of freedom is ___________. In the space below, write out the restricted model for this test.

27. Interpret the coefficients in the following two regression equations:

a. log(x)

28. In question 27, assume that the adjusted R-squared of the first equation is 0.86 while the adjusted R-squared of the second equation is 0.75. Which equation would you prefer? Explain your answer.

29. What is an RCT? Name and explain two “biases” that could distort the results of an RCT.

30. What is the purpose of the Chow test?

Computer Output Problems:

31. Consider Computer Output #1.

This is a labor economics wage equation. The definitions of the variables should be pretty obvious. This is Stata output.

(a) Briefly give me an interpretation of the coefficients in this wage equation.

(b) What is being tested by the first test command? What is the result of the test?

(d) Without doing any computation, show me how you would construct the F-statistic for conducting the second test.
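One common way to construct an F-statistic for a set of linear restrictions compares the sum of squared residuals (SSR) of a restricted model with that of the unrestricted model. The sketch below assumes that textbook form under homoscedastic errors. The function name `f_statistic` and the numbers in the usage lines are hypothetical and are not taken from Computer Output #1.

```python
def f_statistic(ssr_restricted, ssr_unrestricted, q, n, k):
    """Restricted-vs-unrestricted F-statistic.

    F = [(SSR_r - SSR_ur) / q] / [SSR_ur / (n - k - 1)]

    q: number of restrictions under the null hypothesis
    n: number of observations
    k: number of slope coefficients in the unrestricted model
    Returns the statistic and its (numerator, denominator) degrees of freedom.
    """
    df_denom = n - k - 1
    if q <= 0 or df_denom <= 0:
        raise ValueError("q and n - k - 1 must both be positive")
    numerator = (ssr_restricted - ssr_unrestricted) / q
    denominator = ssr_unrestricted / df_denom
    return numerator / denominator, (q, df_denom)


# Hypothetical values for illustration only.
f_value, dof = f_statistic(ssr_restricted=120.0, ssr_unrestricted=100.0,
                           q=2, n=104, k=3)
print(f_value, dof)
```

The numerator degrees of freedom equal the number of restrictions being tested, and the denominator degrees of freedom come from the unrestricted model.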

32. Consider Computer Output #2.

This R output is used for testing the efficiency of financial markets. The definitions of the variables are at the top of Computer Output #2. Suppose I want to test that stock returns cannot be forecast using the listed explanatory variables. Looking at this output, tell me what your conclusion is with respect to financial markets not being forecastable.

33. Consider Computer Output #3.

(a) Use the Part (a) output to answer this question. At what number of years do Major League Baseball players tend to maximize their salaries? Show your work.

(b) Use the Part (b) output to answer this question. Interpret the coefficients on the standardized CRuns and CRBI explanatory variables.

(c) Use the Part (c) output to answer this question. What is being tested here? Interpret the results presented in this output.

(d) Use the Part (d) output to answer this question. Using these two “lm” estimations, you can construct an F-statistic to test a hypothesis of some sort. What is that hypothesis? Without doing any computation, show me how you would construct the F-statistic for conducting the hypothesis test implied here.
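For a question like part (a), a turning point is usually found from a regression that includes both a linear and a squared term in years. That specification is an assumption here, since the output itself is not reproduced. Setting the derivative b1 + 2·b2·years to zero gives years* = -b1 / (2·b2), which is a maximum when b2 is negative. A minimal sketch with made-up coefficients:

```python
def turning_point(b_linear, b_squared):
    """Turning point of y = b0 + b_linear * x + b_squared * x**2.

    Returns (x_star, is_maximum). The point is a maximum only when
    the coefficient on the squared term is negative.
    """
    if b_squared == 0:
        raise ValueError("no turning point when the squared term is zero")
    x_star = -b_linear / (2 * b_squared)
    return x_star, b_squared < 0


# Hypothetical coefficients, not taken from Computer Output #3.
years_star, is_max = turning_point(b_linear=0.8, b_squared=-0.04)
print(round(years_star, 2), is_max)
```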