Stats Check

Names:

Correlation and simple regression

General Guidelines

1. The questions and all the required data are included in this template.

2. Word-process your case assignment within this template. Do not create a new file.

3. Use a 12-point regular font and 1.5 line spacing. Do not use italics in your write-up.

4. Use Equation Editor (Insert/ Object/ Microsoft Equation) to word-process the formulas, if required for the case.

5. Use Excel for graphs and the drawing tool in Word for diagrams, if required for the case.

6. Create a single Word document by pasting all your Excel work into this Word document.

7. Use the rules of rounding correctly at all steps including the final answers. Note: 0.12344 is rounded to 0.1234 and 0.12345 is rounded to 0.1235.

8. Round as little as possible at intermediate steps (keep at least four significant digits). Round off the final answers appropriately. Note: 0.0042 has only two significant digits, because leading zeros are not considered significant. 0.4200 has four significant digits, because trailing zeros are considered significant.

9. You may exchange ideas with other groups but please do not give your report/computer files to any other group or use the reports/computer work done by other groups in writing your own report.
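The rounding and significant-digit rules in guidelines 7 and 8 can be checked with a short Python sketch. The helper names `round_half_up` and `significant_digits` are made up for this illustration. Values are passed as strings so that the decimal digits are read exactly as written.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(value, places):
    """Round a decimal string to `places` decimal places, with 5 rounding up."""
    quantum = Decimal(1).scaleb(-places)  # e.g. 0.0001 when places == 4
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)

def significant_digits(value):
    """Count significant digits of a decimal string.

    Leading zeros are ignored; trailing zeros after the decimal point count.
    """
    return len(Decimal(value).as_tuple().digits)

print(round_half_up("0.12344", 4))
print(round_half_up("0.12345", 4))
print(significant_digits("0.0042"))
print(significant_digits("0.4200"))
```

Python's built-in `round` is avoided here because it rounds exact halves to the nearest even digit, which does not match the rule stated in guideline 7.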

Question #1:

The Excel file Homework 4 Data.xlsx has a tab labeled “Education income” that provides, for each county in the US, the median household income and the percentage of the population with a high-school education. Plot the data with the high-school education percentage on the x-axis and the median income on the y-axis:

Determine the linear regression model of median income versus education, using income as the dependent (y) variable. Use the long-hand method of calculating the model (SSxx, SSyy, etc.). Write out the linear equation.
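As a cross-check on the long-hand calculation, the sums of squares and the resulting line can be sketched in Python. The function name `fit_line_long_method` and the five data points are hypothetical and are not taken from Homework 4 Data.xlsx. The sketch assumes the usual least-squares definitions: slope b1 = SSxy / SSxx and intercept b0 = ȳ - b1·x̄.

```python
def fit_line_long_method(x, y):
    """Least-squares line y = b0 + b1*x computed from the sums of squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx        # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return {"ss_xx": ss_xx, "ss_yy": ss_yy, "ss_xy": ss_xy, "b1": b1, "b0": b0}

# Hypothetical values for illustration only, NOT the homework data.
hs_percent = [70, 75, 80, 85, 90]
median_income = [30000, 34000, 37000, 43000, 46000]

fit = fit_line_long_method(hs_percent, median_income)
print(f"income = {fit['b0']:.1f} + {fit['b1']:.1f} * hs_percent")
```

The same intermediate quantities (x̄, ȳ, SSxx, SSyy, SSxy) would appear as columns or cells in an Excel worksheet, so each value can be compared step by step.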

Determine the correlation coefficient for this relation. Based on the correlation coefficient, do you believe the linear model implies a relation between income and education?
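The correlation coefficient follows from the same sums of squares, r = SSxy / sqrt(SSxx · SSyy). A minimal Python sketch, with a hypothetical function name and hypothetical data rather than the homework values:

```python
import math

def correlation_coefficient(x, y):
    """Pearson correlation r = SSxy / sqrt(SSxx * SSyy)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return ss_xy / math.sqrt(ss_xx * ss_yy)

# Hypothetical values for illustration only, NOT the homework data.
hs_percent = [70, 75, 80, 85, 90]
median_income = [30000, 34000, 37000, 43000, 46000]

r = correlation_coefficient(hs_percent, median_income)
print(f"r = {r:.4f}")
```

Values of r near +1 or -1 indicate a strong linear association, and values near 0 indicate a weak one. The sign of r always matches the sign of the slope.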

Is it reasonable to expect that income and education are related?

Question #2:

The tab “Education income” also contains the percentage of the population in each county who have a bachelor’s degree. Use the regression tool to calculate a linear model of income versus the percentage with a bachelor’s degree (again, use income as the dependent variable). Paste the regression report below and write out the linear equation:

What does the p-value or t-statistic for the relation between income and education imply?
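To see where the t-statistic in a regression report comes from, the sketch below computes it for the slope under the standard simple-regression assumptions: SSE = SSyy - b1·SSxy, s = sqrt(SSE / (n - 2)), and t = b1 / (s / sqrt(SSxx)). The function name and data are hypothetical. The p-value is not computed here. It would come from a t distribution with n - 2 degrees of freedom, which the Python standard library does not provide.

```python
import math

def slope_t_statistic(x, y):
    """Return (t, degrees_of_freedom) for testing whether the slope is zero."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx                 # slope
    sse = ss_yy - b1 * ss_xy           # residual sum of squares
    s = math.sqrt(sse / (n - 2))       # standard error of the estimate
    se_b1 = s / math.sqrt(ss_xx)       # standard error of the slope
    return b1 / se_b1, n - 2

# Hypothetical values for illustration only, NOT the homework data.
bachelors_percent = [70, 75, 80, 85, 90]
median_income = [30000, 34000, 37000, 43000, 46000]

t_stat, df = slope_t_statistic(bachelors_percent, median_income)
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

A large absolute t value (equivalently, a small p-value) suggests that the observed slope would be unlikely if the true slope were zero.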

What is the correlation coefficient? What does it imply about the relation between these two variables? (Note: the R² value is not the correlation coefficient.)
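In simple linear regression with one predictor, the correlation coefficient can be recovered from R² by taking the square root and attaching the sign of the slope. A short sketch, where the function name `r_from_r_squared` and the inputs are hypothetical:

```python
import math

def r_from_r_squared(r_squared, slope):
    """Correlation coefficient for a one-predictor model.

    |r| is sqrt(R^2); the sign of r is the sign of the fitted slope.
    """
    return math.copysign(math.sqrt(r_squared), slope)

# Hypothetical report values for illustration only.
print(r_from_r_squared(0.25, 1200.0))   # positive slope
print(r_from_r_squared(0.25, -3.0))     # negative slope
```

This relation holds only for a model with a single x variable. Excel's regression report lists "Multiple R", which is the absolute value of r in that case.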

What do the two models imply about the relative value of a high school or a college education?