Econ 103: Homework 1

Manu Navjeevan

July 31, 2022

Econ 41 Review

1. Discrete Random Variables. Suppose that we are interested in the number of cups of coffee drunk in a day by a (randomly selected) student at UCLA. This quantity can be represented as a random variable $Y$ with probability mass function:

\[
p_Y(a) =
\begin{cases}
\frac{1}{4} & \text{if } a \in \{0, 1, 2\} \\
\frac{1}{8} & \text{if } a = 3 \\
\frac{3}{32} & \text{if } a = 4 \\
c & \text{if } a = 5 \\
0 & \text{otherwise}
\end{cases},
\]

where c is an unknown constant.

(a) Explain why the number of cups of coffee drunk in a day by a randomly selected student at UCLA is a random variable.

(b) What is the relevant outcome space of the random variable $Y$?

(c) Explain what the distribution of this random variable represents. In other words, the distribution of $Y$ assigns a probability to any subset of the outcome space. How do we interpret this probability?

(d) Solve for $c$. (Hint: Recall that $P_Y(O_Y) = 1$, so that $\sum_{a \in O_Y} p_Y(a)$ must equal one.)

(e) What is the probability that a randomly selected student at UCLA drinks at least 3 cups of coffee a day, $P_Y(Y \ge 3)$?

(f) What is the expected number of cups of coffee drunk per day by a randomly selected student at UCLA?
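The hint in part (d) can be sketched numerically. The snippet below is only an illustration, not a substitute for the written argument: it stores the known part of the pmf as exact fractions and treats the leftover mass as $c$. The names `known_pmf`, `pmf`, `prob_at_least_3`, and `expected_cups` are illustrative and do not come from the assignment.

```python
from fractions import Fraction

# Known probabilities from the pmf above; the mass at a = 5 is the unknown c.
known_pmf = {
    0: Fraction(1, 4),
    1: Fraction(1, 4),
    2: Fraction(1, 4),
    3: Fraction(1, 8),
    4: Fraction(3, 32),
}

# The pmf must sum to one over the outcome space, so c is the leftover mass.
c = 1 - sum(known_pmf.values())
pmf = {**known_pmf, 5: c}

# P(Y >= 3): add up the mass on the outcomes 3, 4 and 5.
prob_at_least_3 = sum(p for a, p in pmf.items() if a >= 3)

# E[Y]: probability-weighted average of the outcomes.
expected_cups = sum(a * p for a, p in pmf.items())

print(c, prob_at_least_3, expected_cups)
```

Exact fractions are used so that the check that the probabilities sum to one is not blurred by floating-point rounding.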

2. Continuous Random Variables. Suppose that we are interested in the income of a randomly selected Angeleno. The distribution of incomes (in tens of thousands of dollars) for residents of Los Angeles can be described as a random variable, $X$, with the following pdf:

\[
f_X(a) =
\begin{cases}
0.11 - ca & \text{if } 0 \le a \le 10 \\
0 & \text{otherwise}
\end{cases},
\]

where $c$ is an unknown constant.

(a) What is the outcome space of $X$, $O_X$?

(b) Using the relationship

\[
P_X(l \le X \le m) = \int_l^m f_X(a)\, da,
\]

explain why the pdf must always be weakly positive, $f_X(a) \ge 0$, for any $a \in \mathbb{R}$.

(c) Because $P_X(O_X) = 1$ we must have that $\int_0^{10} f_X(a)\, da = 1$. Using this fact, solve for $c$.

(d) What is the expected value of $X$, $E[X]$?

(e) What is the variance of $X$, $\mathrm{Var}(X)$?
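One way to sanity-check answers to parts (c) through (e) is a quick numerical integration. The sketch below is a rough illustration under stated assumptions: `integrate` is a hypothetical midpoint-rule helper written for this example, and the value of `c` comes from setting the integral of $0.11 - ca$ over $[0, 10]$, which is $1.1 - 50c$, equal to one.

```python
def integrate(f, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(f(lo + (k + 0.5) * width) for k in range(steps))

# Integrating 0.11 - c*a over [0, 10] gives 1.1 - 50*c; setting it to 1 pins down c.
c = (1.1 - 1.0) / 50.0

def pdf(a):
    """Density of X on its support [0, 10]; zero elsewhere."""
    return 0.11 - c * a if 0.0 <= a <= 10.0 else 0.0

total_mass = integrate(pdf, 0.0, 10.0)
mean = integrate(lambda a: a * pdf(a), 0.0, 10.0)
variance = integrate(lambda a: (a - mean) ** 2 * pdf(a), 0.0, 10.0)

print(total_mass, mean, variance)
```

The numerical values are approximations and should agree with the closed-form integrals only up to a small discretization error.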

3. Variance and Covariance. Let $Y$ be a random variable representing income (in tens of thousands of dollars) and $X$ be a random variable representing years of education. Suppose that the marginal distribution of $X$ is described by its probability mass function

\[
p_X(x) =
\begin{cases}
0.05 & \text{if } x \in \{1, 2, \dots, 12\} \\
0.09 & \text{if } x \in \{13, 14, 15, 16\} \\
0.04 & \text{if } x \in \{17\} \\
0 & \text{otherwise}
\end{cases}.
\]

The marginal distribution of $Y$ is described by its probability density function

\[
f_Y(y) =
\begin{cases}
0.1 & \text{if } 0 \le y \le 10 \\
0 & \text{otherwise}
\end{cases}.
\]

(a) What is the expectation of $Y$, $E[Y]$? What is its variance, $\mathrm{Var}(Y)$?

(b) What is the expectation of $X$, $E[X]$? What is its variance, $\mathrm{Var}(X)$?

(c) Using $E[YX] = 60$, compute the covariance between $Y$ and $X$, $\mathrm{Cov}(X, Y)$.

(d) Calculate the correlation coefficient between $X$ and $Y$,

\[
\rho_{YX} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}.
\]

(e) What does this covariance tell us about the relationship between education levels and income? Is there a positive or negative association?

(f) Should we interpret this result as a causal relationship between education and income? What are some reasons we may want to refrain from this interpretation?

(g) (Challenge) A common inequality used in econometrics is the Cauchy-Schwarz inequality. It states that, for any random variables $X$ and $Y$, and any functions $g(\cdot)$ and $h(\cdot)$,

\[
\big| E[g(X) h(Y)] \big| \le \sqrt{E[g^2(X)]} \sqrt{E[h^2(Y)]}.
\]

Use this inequality to show why the correlation coefficient is bounded between negative one and one, $-1 \le \rho_{XY} \le 1$. (Hint: Try $g(x) = x - \mu_X$ and $h(y) = y - \mu_Y$.)
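The moment calculations in parts (a) through (d) can be cross-checked with a short script. This is a sketch rather than a worked solution: the variable names are illustrative, and the closed forms used for $Y$ rely on reading its density (0.1 on $[0, 10]$) as a uniform distribution, for which the mean $(a + b)/2$ and variance $(b - a)^2/12$ are standard results.

```python
import math

# Marginal pmf of X (years of education) as given above.
pmf_x = {x: 0.05 for x in range(1, 13)}
pmf_x.update({x: 0.09 for x in range(13, 17)})
pmf_x[17] = 0.04

mean_x = sum(x * p for x, p in pmf_x.items())
var_x = sum((x - mean_x) ** 2 * p for x, p in pmf_x.items())

# Y has density 0.1 on [0, 10], i.e. it is uniform on [0, 10].
lo, hi = 0.0, 10.0
mean_y = (lo + hi) / 2
var_y = (hi - lo) ** 2 / 12

# Cov(X, Y) = E[YX] - E[Y]E[X], with E[YX] = 60 taken from part (c).
e_yx = 60.0
cov_xy = e_yx - mean_y * mean_x

# Correlation coefficient: covariance scaled by both standard deviations.
rho = cov_xy / math.sqrt(var_x * var_y)

print(mean_x, var_x, mean_y, var_y, cov_xy, rho)
```

As part (g) asks you to show, the resulting `rho` has to land between negative one and one.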

Introduction to Single Linear Regression

1. Useful Equalities. Recall that in deriving the form of $\hat\beta_1$ we used the following equalities:

\[
\frac{1}{n}\sum_{i=1}^n (Y_i - \bar Y)(X_i - \bar X) = \frac{1}{n}\sum_{i=1}^n Y_i X_i - \bar Y \bar X
\quad \text{and} \quad
\frac{1}{n}\sum_{i=1}^n (X_i - \bar X)^2 = \frac{1}{n}\sum_{i=1}^n X_i^2 - (\bar X)^2.
\]

Show either one of these equalities (you only have to show one or the other).
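Before attempting the algebra, it can help to see both equalities hold on a concrete sample. The sketch below uses a made-up sample (the data-generating line, the sample size, and the seed are all hypothetical choices for illustration) and compares the two sides of each identity. A numerical check like this is not a proof.

```python
import random

random.seed(0)  # fixed seed so the illustrative sample is reproducible

# Hypothetical sample, made up purely to illustrate the identities.
n = 50
xs = [random.uniform(0, 10) for _ in range(n)]
ys = [2.0 + 0.5 * x + random.gauss(0, 1) for x in xs]

x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Left- and right-hand sides of the covariance identity.
cov_lhs = sum((y - y_bar) * (x - x_bar) for x, y in zip(xs, ys)) / n
cov_rhs = sum(y * x for x, y in zip(xs, ys)) / n - y_bar * x_bar

# Left- and right-hand sides of the variance identity.
var_lhs = sum((x - x_bar) ** 2 for x in xs) / n
var_rhs = sum(x ** 2 for x in xs) / n - x_bar ** 2

print(abs(cov_lhs - cov_rhs), abs(var_lhs - var_rhs))
```

Both printed gaps should be zero up to floating-point rounding.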

2. Assumptions for Inference. Suppose we are interested in the relationship between the size of the average American’s social circle, $X$, and whether or not they are unemployed, $Y$. To investigate this relationship we want to estimate the following regression equation¹

\[
Y = \beta_0 + \beta_1 X + \epsilon, \quad E[\epsilon] = E[\epsilon X] = 0.
\]

To estimate the regression coefficient parameters we collect a sample of size $n$, $\{Y_i, X_i\}_{i=1}^n$. Recall that for valid asymptotic inference on our estimates $\hat\beta_0$ and $\hat\beta_1$ we require the following assumptions: Random Sampling, Homoskedasticity, and the Rank Condition.

• Random Sampling: Assume that $\{Y_i, X_i\}$ are independently and identically distributed from the population of interest, $(Y_i, X_i) \overset{\text{i.i.d.}}{\sim} (Y, X)$.

• Homoskedasticity: Assume that $\mathrm{Var}(\epsilon \mid X = x) = \sigma_\epsilon^2$ for all possible values of $x$.

• Rank Condition: There must be at least two distinct values of $X$ that appear in the population.

(a) Suppose we collect our sample by only randomly surveying people on UCLA campus. Which assumption would be violated?

(b) Suppose we collect our sample and find that everyone appears to have exactly one friend. Which assumption would be violated? Why is this a problem when computing the line of best fit through our sample?

(c) Suppose random sampling, homoskedasticity, and the rank condition are all satisfied, but $n = 10$. Why might inferences based on the approximation

\[
\frac{\hat\beta_1 - \beta_1}{\hat\sigma_{\beta_1} / \sqrt{n}} \sim N(0, 1)
\]

not be valid?

3. Hypothesis Testing. Suppose now that we are interested in investigating the relationship between the size of someone’s social circle, $X$, and their income (in tens of thousands of dollars), $Y$. We want to estimate the following linear regression model

\[
Y = \beta_0 + \beta_1 X + \epsilon, \quad E[\epsilon] = E[\epsilon X] = 0.
\]

To do so we collect a random sample of size $n = 64$, $\{Y_i, X_i\}_{i=1}^{64}$, and find that $\frac{1}{n}\sum_{i=1}^n (X_i - \bar X)^2 = 100$, $\frac{1}{n}\sum_{i=1}^n (Y_i - \bar Y)(X_i - \bar X) = 225$, $\bar Y = 5.5$, and $\bar X = 1.5$.

¹ Recall that this regression specification corresponds to finding the line of best fit parameters $\beta_0, \beta_1 = \arg\min_{b_0, b_1} E[(Y - b_0 - b_1 X)^2]$ and defining $\epsilon = Y - \beta_0 - \beta_1 X$.

(a) Using this information find and interpret $\hat\beta_1$ and $\hat\beta_0$.

(b) After finding $\hat\beta_0$ and $\hat\beta_1$, describe how you would construct the estimated residuals $\hat\epsilon_i$.

(c) We find that $\frac{1}{n}\sum_{i=1}^n \hat\epsilon_i^2 = 36$. Use this and the result that, for $n$ large,

\[
\frac{\hat\beta_1 - \beta_1}{\hat\sigma_{\beta_1} / \sqrt{n}} \sim N(0, 1),
\]

to compute the (approximate) probability that, if the true value were $\beta_1 = 0$, we would see a value of $|\hat\beta_1|$ equal to or larger than the one that we observed.

(d) Use this result to test, at level $\alpha = 0.1$, the hypotheses

\[
H_0: \beta_1 = 0 \quad \text{vs.} \quad H_1: \beta_1 \ne 0.
\]

(e) Conduct this test in another fashion by constructing the test statistic $t^*$ and comparing it to either $z_{0.95} = 1.64$ or $z_{0.9} = 1.28$ (indicate which value you are comparing the test statistic to).

(f) Construct a 90% confidence interval for $\beta_1$. How could we use this to conduct the hypothesis test in part (d)?

(g) Suppose that we find we made an error in our calculation and actually $\frac{1}{n}\sum_{i=1}^n (X_i - \bar X)^2 = 1$. If all other values stayed the same, how would this change the result of the hypothesis test in part (d)?
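The mechanics of parts (a) through (f) can be sketched in a few lines. This is a rough illustration under a stated assumption, not the official solution: the assignment does not spell out the formula for $\hat\sigma_{\beta_1}$, so the code assumes the homoskedastic form $\hat\sigma_{\beta_1}^2 = \hat\sigma_\epsilon^2 / \hat\sigma_X^2$, with $\hat\sigma_\epsilon^2$ the mean squared residual and $\hat\sigma_X^2$ the sample variance of $X$. The function names `ols_from_moments` and `two_sided_test` are hypothetical; check the formula against the course notes before relying on the numbers.

```python
import math
from statistics import NormalDist

def ols_from_moments(s_xy, s_xx, x_bar, y_bar):
    """Slope and intercept of the sample line of best fit from sample moments."""
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def two_sided_test(slope, s_xx, resid_ms, n, alpha=0.1):
    """z-test of H0: beta_1 = 0, ASSUMING sigma_hat_beta1^2 = resid_ms / s_xx."""
    sigma_hat = math.sqrt(resid_ms / s_xx)  # assumed homoskedastic formula
    std_err = sigma_hat / math.sqrt(n)
    t_stat = slope / std_err
    # Two-sided p-value under the standard normal approximation.
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    # Critical value z_{1 - alpha/2} and the matching confidence interval.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (slope - critical * std_err, slope + critical * std_err)
    return t_stat, p_value, critical, ci

slope, intercept = ols_from_moments(s_xy=225, s_xx=100, x_bar=1.5, y_bar=5.5)
t_stat, p_value, critical, ci = two_sided_test(slope, s_xx=100, resid_ms=36, n=64)

print(slope, intercept, t_stat, p_value, critical, ci)
```

Rerunning both functions with `s_xx=1` gives a quick way to explore part (g) under the same assumed formula.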