Advanced Statistical Regression

625.661 Statistical Models and Regression

Modules 1-2 Test

Note: You MUST show all work. While math/stat software can be used to check your work, you MUST show how you obtained all answers, steps leading up to your final answer, assumptions needed, etc. The only exception to this is that you MAY use any math/stat software to find the critical value of the normal, t, F, or chi-square distribution. This test must be completed by you alone; help from any other human will be considered cheating.

1. [20 Points] Suppose that n independent paired data points (x1,y1), ..., (xn,yn) satisfy the linear regression model,

y = β0 + β1(x − x̄) + ε,

where x is a random regressor, ε is uncorrelated with x and has mean zero and variance σ², and x̄ is the sample mean of the x’s. Let ŷ denote the fitted value of y from this regression model. Prove that the coefficient of determination R² is equal to the square of the sample correlation between y and ŷ.
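
Note: A numerical check is not a proof, but it can confirm the identity before you write one. The Python sketch below fits the centered model by least squares and compares R² with the squared sample correlation between y and ŷ. The data values and the helper names `fit_centered` and `corr` are made up for illustration and are not part of the test.

```python
from statistics import mean

def fit_centered(x, y):
    """Least-squares fit of y = b0 + b1*(x - xbar); returns (b0, b1, fitted values)."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar  # with a centered regressor, the intercept estimate is ybar
    fitted = [b0 + b1 * (xi - xbar) for xi in x]
    return b0, b1, fitted

def corr(u, v):
    """Sample correlation coefficient between two equal-length sequences."""
    ubar, vbar = mean(u), mean(v)
    suv = sum((a - ubar) * (b - vbar) for a, b in zip(u, v))
    suu = sum((a - ubar) ** 2 for a in u)
    svv = sum((b - vbar) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

# Hypothetical data, chosen only to exercise the identity.
x = [1.0, 2.0, 4.0, 5.0, 7.0]
y = [2.1, 2.9, 5.2, 5.8, 8.4]

_, _, y_hat = fit_centered(x, y)
ybar = mean(y)

# R^2 as regression sum of squares over total sum of squares.
r2 = sum((f - ybar) ** 2 for f in y_hat) / sum((yi - ybar) ** 2 for yi in y)
# Squared sample correlation between observed and fitted values.
r_sq = corr(y, y_hat) ** 2

print(r2, r_sq)
```

The two printed numbers should agree up to floating-point rounding; the written proof still has to show why.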

2. [20 Points] In a linear regression model,

y = β0 + β1x + ε,

where x is a non-random regressor and ε has mean zero and variance σ². Suppose that n independent paired data points (x1,y1), ..., (xn,yn) satisfy this model. After fitting this model to a set of n = 25 paired data points, we obtain R² = 0.82 and Syy = 50. Construct a 95% prediction interval for y at x = x̄, where x̄ is the sample mean of the n = 25 x’s.
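
Note: To check the arithmetic for a 95% prediction interval for a new observation at x = x̄, a short Python sketch along the following lines may help. It assumes normally distributed errors (needed for the t-based interval), and the critical value 2.0687 is the usual table value for the upper 2.5% point of the t distribution with 23 degrees of freedom. The variable names are illustrative. Because ȳ is not given in the problem, only the half-width can be computed; the interval is centered at ŷ = ȳ.

```python
from math import sqrt

n = 25          # number of paired data points
r2 = 0.82       # coefficient of determination
syy = 50.0      # total (corrected) sum of squares of y
t_crit = 2.0687 # table value: t(0.025; n - 2 = 23), assumes normal errors

# Residual sum of squares from R^2 = 1 - SSE / Syy.
sse = (1 - r2) * syy
# Residual mean square, the usual unbiased estimate of sigma^2.
mse = sse / (n - 2)

# At x0 = xbar the term (x0 - xbar)^2 / Sxx vanishes, so the prediction
# variance is mse * (1 + 1/n).
half_width = t_crit * sqrt(mse * (1 + 1 / n))

print(f"SSE = {sse:.4f}, MSE = {mse:.4f}, half-width = {half_width:.4f}")
```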

3. [40 Points] You are given the following sample of (x,y) data points:

(8, 10), (8, 10), (7, 9), (6, 10), (3, 6), (4, 8), (4, 8), (5, 9), (3, 7), (6, 9).

A simple linear regression model is fitted to these data points.

(a) Estimate the y-intercept and slope.

(b) Estimate the variance of y.

(c) Test the statistical significance of the slope.

(d) Construct a 95% confidence interval for the slope.

(e) Construct an ANOVA table for testing the slope.

(f) State any assumptions used in your analyses of (a) - (e) above. Make sure to match any assumption you state with (some combination of) (a) - (e) specifically. Discuss your answers in detail.
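
Note: The hand computations for (a) - (e) can be checked with a short Python sketch such as the one below. It uses only the ten data points given above. Two things are assumptions here: part (b) is read as asking for the residual mean square as the estimate of σ², and the critical value 2.306 is the table value for the upper 2.5% point of the t distribution with 8 degrees of freedom (normal errors assumed for (c) - (e)). Variable names are illustrative.

```python
from math import sqrt

data = [(8, 10), (8, 10), (7, 9), (6, 10), (3, 6),
        (4, 8), (4, 8), (5, 9), (3, 7), (6, 9)]
n = len(data)
x = [p[0] for p in data]
y = [p[1] for p in data]

xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
syy = sum((yi - ybar) ** 2 for yi in y)

# (a) least-squares slope and intercept
b1 = sxy / sxx
b0 = ybar - b1 * xbar

# (b) residual mean square as the estimate of sigma^2 (assumed reading)
ss_reg = b1 * sxy
ss_res = syy - ss_reg
ms_res = ss_res / (n - 2)

# (c) t statistic for H0: beta1 = 0
se_b1 = sqrt(ms_res / sxx)
t_stat = b1 / se_b1

# (d) 95% confidence interval for the slope
t_crit = 2.306  # table value: t(0.025; 8)
ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)

# (e) ANOVA table
f_stat = ss_reg / ms_res
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, MS_res = {ms_res:.4f}")
print(f"t = {t_stat:.3f}, 95% CI for slope = ({ci[0]:.4f}, {ci[1]:.4f})")
print(f"{'Source':<12}{'SS':>10}{'df':>4}{'MS':>10}{'F':>10}")
print(f"{'Regression':<12}{ss_reg:>10.4f}{1:>4}{ss_reg:>10.4f}{f_stat:>10.3f}")
print(f"{'Residual':<12}{ss_res:>10.4f}{n - 2:>4}{ms_res:>10.4f}")
print(f"{'Total':<12}{syy:>10.4f}{n - 1:>4}")
```

In simple linear regression the F statistic equals the square of the t statistic for the slope, which gives a quick consistency check between (c) and (e).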

4. [20 Points] Let the least-squares residuals be ei = yi − ŷi for i = 1, ..., n, obtained from a simple regression model,

y = β0 + β1x + ε,

where ŷi is the predicted value corresponding to yi from this simple regression analysis. Derive the variance of ei and discuss how it compares with the variance of the random error εi.
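
Note: A derived expression for the variance of ei can be checked numerically. The Python sketch below assumes uncorrelated errors with common variance σ² and a non-random regressor, and it borrows the x values from Problem 3 purely as an example. It builds the hat matrix H of the simple regression, forms M = I − H (so that the residual vector is e = My), and confirms that M is idempotent. Under those assumptions Var(e) = σ²M, so the diagonal of M gives Var(ei)/σ². Variable names are illustrative.

```python
# x values borrowed from Problem 3 as an example design.
x = [8, 8, 7, 6, 3, 4, 4, 5, 3, 6]
n = len(x)
xbar = sum(x) / n
sxx = sum((xi - xbar) ** 2 for xi in x)

# Hat matrix entries for simple linear regression:
# h_ij = 1/n + (x_i - xbar)(x_j - xbar) / Sxx
h = [[1 / n + (xi - xbar) * (xj - xbar) / sxx for xj in x] for xi in x]

# Residual-maker matrix M = I - H.
m = [[(1.0 if i == j else 0.0) - h[i][j] for j in range(n)] for i in range(n)]

# Check idempotency: M @ M should reproduce M up to rounding.
mm = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
      for i in range(n)]
max_gap = max(abs(mm[i][j] - m[i][j]) for i in range(n) for j in range(n))

# Diagonal of M: Var(e_i) / sigma^2 under the stated assumptions.
var_factor = [m[i][i] for i in range(n)]

print("max |MM - M| =", max_gap)
for xi, v in zip(x, var_factor):
    print(f"x = {xi}: Var(e_i)/sigma^2 = {v:.4f}")
print("sum of factors =", sum(var_factor))
```

Each factor lies strictly between 0 and 1 for this design, and the factors sum to n − 2, the residual degrees of freedom.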
