CORRELATION AND SIMPLE REGRESSION


Name ______________________________ Signature ___________________________

QSO 510 Quantitative Analysis for Decision Making

Summer 2013 Mid-term Exam - Instructor: Dr. Derek Kane

Instructions

1. The exam is due midnight Thursday, July 25.

2. You may either write out the answers to the exam by hand, or you may word-process the answers into the exam document.

3. If you word-process your exam, you may submit it electronically or print it and submit a hard copy.

4. Answer all questions in the context of the problem. General answers are not expected.

5. You must show all steps including formulas used and all calculations done to arrive at the final answers. Incomplete solutions will receive partial credit.

6. Use at least four significant digits at all intermediate steps. Round off the final answers appropriately. Note: 0.0042 is only two significant digits as leading zeros are not considered significant. Trailing zeros are considered significant.

7. You only need to do four problems. If you do all five problems, indicate which four you want me to grade.

8. You are welcome to ask clarifying questions about the problems. Please do not ask any questions relating to the solution of any problem.

9. You may work with other people on the exam, but you are expected to submit your own answers to the questions.

(For Instructor’s use)

Problem     Points

1           ______

2           ______

3           ______

4           ______

5           ______

Total       ______

Problem 1 (25 points)

You have an assembly line that produces bottles of soda with a mean volume of 1 L and a standard deviation of 0.05 L.

a) Assuming the distribution of volume is normal, what is the chance that any single bottle’s volume is greater than 1.1 L?

b) If you choose 100 bottles at random, what would be the expected average volume of the bottles in your sample? What would the standard deviation of the sample average be? What is the shape of the distribution of the sample average? Give reasons for your answers.

c) If you pulled a sample of 50 bottles, what is the chance that you would find the average volume was less than 0.99L?
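The three parts above rest on the normal distribution for a single bottle and on the standard error of a sample average, sigma divided by the square root of n. The sketch below shows one way these formulas could be set up in Python. It assumes the mean fill volume equals the nominal 1 L, and the function names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

MU = 1.0      # mean bottle volume in litres (assumed equal to the nominal 1 L)
SIGMA = 0.05  # standard deviation of one bottle's volume in litres


def prob_single_above(x, mu=MU, sigma=SIGMA):
    """P(X > x) for a single bottle with X ~ Normal(mu, sigma)."""
    return 1.0 - NormalDist(mu, sigma).cdf(x)


def sample_mean_sd(n, sigma=SIGMA):
    """Standard deviation of the average of n independent bottles."""
    return sigma / sqrt(n)


def prob_mean_below(x, n, mu=MU, sigma=SIGMA):
    """P(sample average < x) for a random sample of n bottles."""
    return NormalDist(mu, sample_mean_sd(n, sigma)).cdf(x)


print(prob_single_above(1.1))      # setup for part (a)
print(sample_mean_sd(100))         # setup for part (b)
print(prob_mean_below(0.99, 50))   # setup for part (c)
```

Because the single-bottle volumes are assumed normal, the sample average is treated as normal for any n in this sketch.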

Write your answers in the space below and continue on the next page.

Problem 2 (25 points)

We want to investigate how US citizens feel about their children entering politics. The Excel file “Midterm_Data.xlsx” contains a tab “Child in Politics” with survey responses to the question “Would you like your child to enter politics?”

(This is simulated data based upon the Gallup poll, http://www.gallup.com/poll/163373/child-avoid-career-politics.aspx)

a) Compute the survey proportion answering yes and the survey standard deviation for the data.

b) Construct a 98% confidence interval for the proportion of US citizens who would answer yes.

c) If you are working for CNN, how would you explain the confidence interval to your audience? Does the confidence interval seem small enough? What would you tell your audience about the implications of this data? Provide a clear and complete answer.

d) The Gallup poll found that 31% of US citizens do not want their children entering politics. Does the result of this survey of 659 people support or conflict with that finding?
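As a reference for the formulas behind parts (a) and (b), here is a minimal Python sketch of a large-sample confidence interval for a proportion. The count of "yes" answers used in the example is hypothetical and does not come from the data file. The function name is illustrative, and the standard deviation is taken to be the large-sample form sqrt(p(1 - p)), which is an assumption about what part (a) intends.

```python
from math import sqrt
from statistics import NormalDist


def proportion_summary(yes_count, n, confidence=0.98):
    """Return (p_hat, sd, std_err, (lower, upper)) for yes/no survey data."""
    p_hat = yes_count / n
    sd = sqrt(p_hat * (1.0 - p_hat))   # spread of a single 0/1 response
    std_err = sd / sqrt(n)             # spread of the sample proportion
    # Two-sided critical value, e.g. about 2.326 for 98% confidence.
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return p_hat, sd, std_err, (p_hat - z * std_err, p_hat + z * std_err)


# Hypothetical count of "yes" answers; n = 659 is the survey size given above.
p_hat, sd, std_err, (lower, upper) = proportion_summary(yes_count=200, n=659)
print(p_hat, sd, std_err, lower, upper)
```

A benchmark value such as the one in part (d) can then be compared with the interval: a value inside the interval is consistent with the survey at that confidence level, and a value outside it is not.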

Begin your answer below and continue on the next page.

Problem 3 (25 points)

The file Midterm_Data.xls has a tab labeled “Son in Politics”. According to the Gallup poll, 37% of US citizens want their daughter to enter politics when they are asked about their daughter first. Use the survey results in the “Son in Politics” tab to determine whether the proportion of respondents who would like their son to enter politics, when they are asked about their son first, differs by a statistically significant amount from that 37% figure.
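One way to read this problem is as a one-sample test of a proportion against the 37% benchmark. The Python sketch below follows that reading. The sample counts in the example are hypothetical, the function name is illustrative, and treating 37% as a fixed value rather than as a second sample is an assumption.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_proportion_z(yes_count, n, p0):
    """Two-sided z test of H0: p = p0 using the sample proportion."""
    p_hat = yes_count / n
    std_err0 = sqrt(p0 * (1.0 - p0) / n)   # standard error computed under H0
    z = (p_hat - p0) / std_err0
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts for the "Son in Politics" tab; 0.37 is the figure above.
z, p_value = one_sample_proportion_z(yes_count=150, n=500, p0=0.37)
print(z, p_value)
```

A p-value below the chosen level of significance would indicate a statistically significant difference from 37%.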

Begin your answer below and continue on the next page.

Problem 4 (25 points)

The file Midterm_Data.xls has a tab labeled “Gasoline versus Flour” which presents historical price data for these two commodities. Using the Gasoline price as the X-value and the Flour price as the Y-value, we will perform a long-form regression analysis on this data.

(a) Compute the values of \(\bar{X}\), \(\bar{Y}\), \(SS_{XX}\), \(SS_{YY}\), and \(SS_{XY}\).

(b) Determine the regression equation.

(c) Compute the standard error of the estimate, \(s_e\).

(d) Determine the 95% confidence interval for the price of flour, \(\hat{Y}\), when the price of gasoline is $5 per gallon.

(e) Compute the value of the correlation coefficient. Is it reasonable for these two commodities to have a correlation of this value?
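The long-form quantities in parts (a) through (e) can be sketched in Python as follows. The price lists in the example are hypothetical and are not the data from the file. The function names are illustrative, and the critical value `t_crit` must be looked up from a t table with n - 2 degrees of freedom. The interval shown is for the mean response at a given X; an interval for a single predicted price would add 1 under the square root.

```python
from math import sqrt


def long_form_regression(x, y):
    """Simple linear regression of y on x from sums of squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx                    # slope
    b0 = y_bar - b1 * x_bar               # intercept
    sse = max(ss_yy - b1 * ss_xy, 0.0)    # guard against tiny negative rounding
    s_e = sqrt(sse / (n - 2))             # standard error of the estimate
    r = ss_xy / sqrt(ss_xx * ss_yy)       # correlation coefficient
    return {"n": n, "x_bar": x_bar, "y_bar": y_bar, "ss_xx": ss_xx,
            "ss_yy": ss_yy, "ss_xy": ss_xy, "b0": b0, "b1": b1,
            "s_e": s_e, "r": r}


def mean_response_interval(fit, x0, t_crit):
    """Confidence interval for the mean of Y at X = x0."""
    y_hat = fit["b0"] + fit["b1"] * x0
    half = t_crit * fit["s_e"] * sqrt(
        1.0 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["ss_xx"])
    return y_hat - half, y_hat + half


# Hypothetical prices, for illustration only.
gasoline = [1.0, 2.0, 3.0, 4.0, 5.0]
flour = [2.0, 4.0, 5.0, 4.0, 5.0]
fit = long_form_regression(gasoline, flour)
print(fit)
print(mean_response_interval(fit, x0=5.0, t_crit=3.182))  # t for 3 d.f., 95%
```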

Begin your answer in the space below and continue on the next page.

Problem 5 (25 points)

The file Midterm_Data.xls has a tab labeled “Many vs. NASDAQ” which presents historical price data for several stocks and a high volume trading condition (VIDX = 1 if the NASDAQ volume is greater than 80% of its maximum). Create a multiple regression model of the NASDAQ using the other asset prices and the volume criterion as the independent (x) variables.

Answer the following questions based on the Excel output report. Support your answers with numbers from the output report. Use a level of significance of 0.05.

a) Write the estimated multiple regression equation. Note: Use actual variable names and numbers. If using symbols, define them before using in the equation.

b) Clearly explain the meaning of b1 (the coefficient of Dow Chemical). Note: Use actual variable names and numbers in answering your question. “b1 is the slope” is not a sufficient answer.

c) Clearly explain the meaning of b2 (the coefficient of Exxon-Mobil). Note: Use actual variable names and numbers in answering your question. “b2 is the slope” is not a sufficient answer.

d) Clearly explain the meaning of b3 (the coefficient of Johnson and Johnson). Note: Use actual variable names and numbers in answering your question. “b3 is the slope” is not a sufficient answer.

e) Clearly explain the meaning of b4 (the coefficient of Union Pacific). Note: Use actual variable names and numbers in answering your question. “b4 is the slope” is not a sufficient answer.

f) Clearly explain the meaning of b5 (the coefficient of the High Volume Criterion). Note: Use actual variable names and numbers in answering your question. “b5 is the slope” is not a sufficient answer.

g) Is the regression equation significant? Give reasons for your answer. (Hint: The answer to this question requires a test of the hypothesis \(H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = 0\) vs. \(H_a\): at least one \(\beta_j\) is not equal to zero, where j = 1, …, 5.)

h) Which variables in the current equation are significant and which are not significant? Give reasons for your answer. (Hint: The answer to this question requires a test of the hypothesis \(H_0: \beta_j = 0\) vs. \(H_a: \beta_j \neq 0\) for j = 1, …, 5.)

i) Eliminate all of the insignificant variables and show the final regression equation.

j) Considering the original regression equation from a), if you already have a NASDAQ fund in your portfolio, what is the best new asset to place in the portfolio? (Remember, you want your portfolio to be as diverse as possible.)
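The decisions in parts (g) through (i) come down to comparing numbers from the Excel output report against the 0.05 level of significance. The Python sketch below illustrates that logic. Every number in it is hypothetical and none comes from the actual report; the function names are illustrative, and the variable labels simply repeat the names used above.

```python
ALPHA = 0.05  # level of significance stated in the problem


def f_statistic(ssr, sse, n, k):
    """Overall F = (SSR / k) / (SSE / (n - k - 1)) for k predictors."""
    return (ssr / k) / (sse / (n - k - 1))


def split_by_significance(p_values, alpha=ALPHA):
    """Separate predictors whose p-value is below alpha from the rest."""
    significant = [name for name, p in p_values.items() if p < alpha]
    not_significant = [name for name, p in p_values.items() if p >= alpha]
    return significant, not_significant


# Hypothetical p-values, standing in for the "P-value" column of the report.
p_values = {
    "Dow Chemical": 0.001,
    "Exxon-Mobil": 0.300,
    "Johnson and Johnson": 0.020,
    "Union Pacific": 0.650,
    "VIDX": 0.040,
}
print(f_statistic(ssr=80.0, sse=20.0, n=26, k=5))
print(split_by_significance(p_values))
```

For part (i), the model would be re-run in Excel with only the significant variables, since the coefficients change when predictors are removed.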

Begin your answer in the space below and continue on the next page.
