Regression Analysis Assignment
Excel exercise
Today we will estimate regression models with dummy variables, and practice interpreting the results. Go to Canvas and download Excel exercise 9 - wages.xlsx
1. Estimate the following regression model that explains wages.
wage = α + β1 education + β2 experience + β3 tenure + e
Before running any regression in Excel, confirm that the dependent variable is on the far left (column A) and the independent variables are in columns B, C, and D (these variables must be next to one another). Have Excel place the output on a new worksheet titled Regression 1. Job tenure measures how many years a worker has been with their current employer.
2. Holding all else constant, what is the estimated increase in wage for a one year increase in:
a. Education
b. Experience
c. Job tenure
Which of the estimated coefficients are statistically significant?
3. Now run the same regression but generate the variable Female dummy. Remember that the variable will equal 1 if the person is female, 0 otherwise. To create the dummy variable in column E, you will need to use Excel’s =if( ) function to convert the text variable gender to a numeric female dummy variable.
The function below checks the text in cell J2 and returns the value 1 if the text equals female and 0 otherwise. =if(J2=”Female”,1,0) Copy the contents of cell E2 all the way down to the bottom of the data.
4. Now run the regression wage = α + β1 education + β2 experience + β3 tenure + δ4 female dummy + e Be sure the female dummy variable data is in column E and adjust the Input X Range appropriately. Rename the new output worksheet Regression 2.
5. What does the estimated coefficient for the female dummy mean? Is it statistically significant? Does this estimate provide evidence for gender discrimination in the workplace?
6. Now generate dummy variables indicating region a given worker resides in (east, south, midwest, or west). Run the regression from part 4, but include controls for region. What do the estimated coefficients for the region dummies mean? Are they statistically significant?
7. Predict the hourly wage of a female worker living in the Midwest with 12 years of education, 15 years of experience, and 5 years of current job tenure.
8. A person’s hourly wage may not increase by a constant amount with additional years of education, experience, or job tenure. But the regression model in part 1 Implies that, on average, a person’s wage should increase by $1.79 per hour for each additional year of education they obtain. It may be more realistic to assume that a person’s wage increases by a fixed percentage with each additional year of education. If you transform the dependent variable in a regression model and measure it in logarithmic form, the estimated coefficients should be interpreted as semi-elasticities. In other words, the coefficient estimates tell us the expected percentage change in a person’s wage if they obtain an additional year of education.
Rerun the regression model in part 1, but use the natural log of hourly wage as the dependent variable.
Log[wage] = α + β1 education + β2 experience + β3 tenure + e
The formula for the natural log operation in Excel is simply: