Help with all expect question 2 & 3

Tdg4407
econ262final.xlsx

Questions 1 - 3

PRINCIPLES OF STATISTICS II SPRING 2018
ECON 262
FINAL EXAM
You need to send back your answers (Excel files, PDF, Word files) in Canvas that clearly present your detailed answers.
Since this is a take home exam, I expect a lot of detail in the answers and will set a high bar for grading (higher than Exam One). Review the answers
to the problem sets, and Exam One posted in Canvas to get a feel for what I expect.
ONE WORD ANSWERS WITHOUT ANALYTICAL SUPPORT WILL NOT BE SUFFICIENT TO SCORE WELL ON THIS EXAM.
Due Date: Friday May 11, 2018 at 11:55 p.m. Test answers submitted late will receive a grade of zero.
There are six questions located in the four workbooks.
Make good use of the Lecture Notes, textbook and answers to the Problem Sets and Exam One.
100 Points Total
(20) 1. A retail store manager with Petrie Stores, Inc, wants to develop a multiple regression model to predict
the amount of sales of a product per month Y from monthly advertising expenditures X1 and
whether the month was December (coded 1) or another month (coded 0) X2. The multiple regression Advertising
results are presented below. Expenditures (X1) Sales (Y)
($) ($)
SUMMARY OUTPUT 2,200 14,800
3,000 17,300
Regression Statistics 2,400 8,200
Multiple R 0.8281456316444125 2,000 12,300
R Square 0.6858251872117229 2,100 11,200
Adjusted R Square 0.6488634445307494 2,400 16,000
Standard Error 1779.2905623921506 2,700 11,100
Observations 20.0 2,200 10,000
1,100 14,500
ANOVA 2,100 11,900
df SS MS F 2,400 14,400
Regression 2.0 1.174856266078978E8 5.87428133039489E7 1,300 5,800
Residual 17.0 5.38198733921022E7 3165874.905417776 1,300 12,200
Total 19.0 1.713055E8 2,000 15,300
2,000 12,600
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% 1,300 12,000
Intercept 4081.80939102274 1840.5557922459727 2.2177047869012627 0.040487672121620646 198.57068547602088 7965.048096569459 2,000 9,600
X Variable 1 2.599149950526228 0.8049034203187287 0.004928420748804941 0.9009498040397608 4.297350097012695 2,200 11,400
X Variable 2 4503.013402896468 836.3099302978545 4.943117857197039E-5 2738.5512198714296 6267.475585921506 2,100 10,100
2,800 17,600
(a) Construct a scatterplot between sales (Y) and advertising expenditures (X1). See data to the right.
Provide a standard evaluation of the scatterplot.
(b) Which of the explanatory variables (if any) are statistically significant (at the .05 level) in explaining product
sales per month? Conduct a formal hypothesis test for each estimated regression coefficient.
(c) Give an interpretation (as applicable to the data) for each of the estimated regression coefficients.
(d) How much of the variation of y around ybar can be explained by the regression model?
(e) Calculate the F-Statistic
(f) Conduct an F-Test (.05 level) to test for a globally statistically significant regression equation.
(g) Find the predicted sales per month when advertising expenditures equal $2,000 and the month is July.
(g) Find the predicted sales per month when advertising expenditures equal $2,000 and the month is December.
(i) Provide a brief evaluation of the estimated regression equation. Would the equation be suitable for forecasting?
All calculations for Questions 2 & 3 should be performed by hand (pencil/pen and paper or by Hand in Excel = layout all the calculations).
The only exception is the scatterplot which can be constructed using Excel.
(10) 2. A sociologist was hired by a large city hospital to investigate the relationship between the number of
unauthorized days that employees are absent per year Y, and the distance (miles) X1 between home
and work for the employees. A sample of 5 employees was chosen, and the following data were collected.
Distance to Number of
Work Days Absent
(X1) (Y)
1.0 8.0
3.0 5.0
8.0 6.0
12.0 5.0
18.0 2.0
(a) Construct a scatterplot using Excel. Provide a standard evaluation of the scatterplot.
(b) Estimate the slope coefficient and intercept term for the regression line that relates number
of absent days to distance to work by hand. Express the estimated equation in standard form.
(c) Interpret (as applicable to the data) the estimated slope coefficient and intercept term.
(d) Compute the Errors, Sum of Squared Errors (SSE), Mean Square Error (MSE) and Regression Standard Error (SE).
(10) 3. Based on the estimated regression equation in Question Two and the calculated SSE, MSE & SE, does a
statistically significant (.05 level) relationship exist between the number of days absent
and the distance to work? Hint: Perform a hypothesis test on the slope coefficient.
HAVE FUN WITH THIS EXAM

Question 4

PRINCIPLES OF STATISTICS II SPRING 2018
ECON 262
FINAL EXAM
On the Computer
(20) 4. The following data reports annual wages (Y) for a sample of 100 workers. Also included are explanatory variables
relating to years of experience of the workers (X1), age of the workers (X2), years of education (X3) and whether or not
the worker is a union member (X4, 0 = non union member, 1 = union member)
1) Conduct a full regression analysis of this data set. Determine the "Best" equation to predict wages (Y) based on the potential explanatory variables.
Your analysis should include (but not limited to) scatterplots and evaluations, simple regressions and evaluations that should include hypothesis tests on the slope coefficients,
interpretations of the Rsquare, multiple regression analysis and hypothesis tests on the slope coefficients, Global F-Test, drop insignificant variables, re-estimated equations and re-evaluation
until you find the best regression equation to predict annual wages (Y). Use a .05 significance level for hypothesis tests related to individual coefficients (t-tests) and Global F-Tests.
2) Based on you "Best" equation to predict wages (Y) in part 1, predict wages (y) based on the available forecasts of the potential explanatory variables at the bottom of the data set.
Provide all estimated regression equations and scatterplots to support your analysis and answers.
This question and answers will test your knowledge on how to conduct a full regression analysis on a dataset.
Wage (Y) Experience (X1) Age (X2) Education (X3) Union (X4) 0 = no union 1 = union
$19,388 45 57 6 0
$49,898 33 51 12 1
$28,219 12 30 12 0
$83,601 18 41 17 0
$29,736 47 61 8 1
$50,235 12 34 16 0
$45,976 43 61 12 1
$33,411 20 38 12 0
$21,716 11 29 12 0
$37,664 19 43 18 0
$26,820 33 57 18 1
$29,977 6 28 16 0
$33,959 26 49 17 1
$11,780 33 50 11 0
$10,997 0 20 14 0
$17,626 45 63 12 0
$22,133 10 32 16 1
$21,994 24 42 12 0
$29,390 18 37 13 0
$32,138 22 42 14 1
$30,006 27 49 16 0
$68,573 14 36 16 1
$17,694 38 52 8 0
$26,795 44 57 7 0
$19,981 54 64 4 0
$14,476 3 21 12 0
$19,452 3 22 13 0
$28,168 17 36 13 0
$19,306 34 49 9 1
$13,318 25 42 11 1
$25,166 10 28 12 0
$18,121 18 36 12 0
$13,162 6 24 12 1
$32,094 14 32 12 0
$16,667 4 22 12 0
$50,171 39 57 12 1
$31,691 13 31 12 0
$36,178 40 58 12 0
$15,234 4 22 12 0
$16,817 26 44 12 0
$22,485 22 40 12 0
$30,308 10 28 12 0
$11,702 6 26 14 0
$11,186 0 18 12 0
$12,285 42 60 12 0
$19,284 3 25 16 0
$11,451 8 26 12 0
$57,623 31 52 15 0
$25,670 8 27 13 1
$83,443 5 28 17 0
$49,974 26 48 16 1
$46,646 44 55 5 0
$31,702 39 57 12 0
$13,312 9 27 12 0
$44,543 10 34 18 0
$15,013 21 43 16 0
$33,389 22 42 14 0
$60,626 7 31 18 0
$24,509 15 35 14 0
$20,852 38 56 12 0
$30,133 27 43 10 0
$31,799 25 43 12 0
$16,796 14 32 12 0
$20,793 6 24 12 0
$29,407 19 35 10 0
$29,191 9 27 12 0
$15,957 10 28 12 0
$34,484 28 47 13 0
$35,185 12 32 14 0
$26,614 19 37 12 0
$41,780 9 27 12 0
$55,777 21 41 14 0
$15,160 45 59 8 0
$66,738 29 44 9 0
$33,351 4 26 16 0
$33,498 20 36 10 0
$29,809 29 43 8 0
$15,193 15 33 12 0
$23,027 34 54 14 1
$75,165 12 33 15 0
$18,752 45 62 11 1
$83,569 29 53 18 0
$32,235 38 56 12 0
$20,852 1 19 12 0
$13,787 4 21 11 0
$34,746 15 35 14 0
$17,690 14 32 12 0
$52,762 7 31 18 0
$60,152 38 60 16 0
$33,461 7 29 16 1
$13,481 7 25 12 0
$9,879 28 46 12 0
$16,789 6 25 13 0
$31,304 26 48 16 0
$37,771 5 26 15 0
$50,187 24 42 12 0
$39,888 5 23 12 0
$19,227 15 33 12 0
$32,786 37 54 11 1
$28,440 24 42 12 0
Forecast Variables 15 33 14 1

Question 5

PRINCIPLES OF STATISTICS II SPRING 2018
ECON 262
FINAL EXAM
On the Computer
(20) 5. Oklahoma Land & Agronomics, Inc., is interested in increasing crop production
and has installed a center-pivot, mobile sprinkling system to water an alfalfa crop.
Oklahoma has collected data for the weekly growth of alfalfa Y (in inches) and for
the water flow setting X1 on the mobile sprinkling system (higher settings apply greater
amounts of water). Other conditions are held constant.
Water Flow
Growth Setting
(Y) (X1)
1.6 0.0
2.9 0.3
3.7 0.6
3.2 0.9
3.5 1.3
4.2 1.6
5.3 1.9
5.5 2.2
6.2 2.5
6.3 2.8
6.2 3.2
6.3 3.5
6.5 3.8
7.1 4.1
7.6 4.4
6.7 4.7
6.9 5.1
6.4 5.4
6.7 5.7
6.5 6.0
(a) Construct a scattergram between growth (Y) and water flow setting (x).
Does there appear to be a curvilinear relationship? Evaluate the scatterplot.
(b) Estimate a simple linear regression equation that could be used to predict growth.
Evaluate the estimated regression equation conducting all necessary statistical tests (.05 level).
(c) Create a second explanatory variable (X2) to reflect the quardratic (non-linear) shape to the scatterplot.
(d) Estimate a curvilinear multiple regression equation by including both the X1 and X2 explanatory
variables in the model. Evaluate the estimated regression equation conducting all necessary statistical tests (.05 level).
(e) Which estimated regression equation provides a better fit to the data? Justify your answer.
(f) Calculate the amount of water flow setting that will maximize growth.
(g) What will growth equal at the maximizing point calculated in part f?
(h) Based on your Best regression equation in part e, forecast growth at a water setting of of 1.8?
Provide all estimated regression equations and scatterplots to support your analysis and answers.
You need to email me back your answers (Excel files, PDF, Word files) in Canvas that clearly present your detailed answers.
Since this is a take home exam, I expect a lot of detail in the answers and will set a high bar for grading.
Due Date: Friday May 11, 2018 at 11:55 p.m. Test answers submitted late will receive a grade of zero.
There are five questions located in the three workbooks.
Questions 2 & 3 are to be done by hand (except scatterplots). Show all work.
Make good use of the Lecture Notes, textbook and answers to the Problem Sets.

Question 6

(20) 6. Salsberry Realty sells homes along the east coast of the United States. One of the questions
most frequently asked is "If we buy a home, home much can we expect to pay to heat in
the winter?
To investigate, two scatterplots, two simple linear regressions, and one multiple linear regression were estimated, utilizing
variables thought to be related to January heating costs. A random sample of 20 recently Y temp Age of Furnace
sold homes was taken to collect the data for the regression analysis. The estimated simple 250.0 35.0 3.0 35.0 250.0 3.0 250.0
and multiple linear regressions are presented below: 360.0 29.0 4.0 29.0 360.0 4.0 360.0
165.0 36.0 7.0 36.0 165.0 7.0 165.0
Simple Linear Regression One: 43.0 60.0 6.0 60.0 43.0 6.0 43.0
Y= January Heating Costs ($) 92.0 65.0 5.0 65.0 92.0 5.0 92.0
X = Mean January Outside Temperature (Degrees F) 200.0 30.0 5.0 30.0 200.0 5.0 200.0
355.0 10.0 6.0 10.0 355.0 6.0 355.0
SUMMARY OUTPUT 290.0 7.0 10.0 7.0 290.0 10.0 290.0
Regression Statistics 230.0 21.0 9.0 21.0 230.0 9.0 230.0
Multiple R 0.8115088354934338 120.0 55.0 2.0 55.0 120.0 2.0 120.0
R Square 0.6585465900839089 73.0 54.0 12.0 54.0 73.0 12.0 73.0
Adjusted R Square 0.6395769561996816 205.0 48.0 5.0 48.0 205.0 5.0 205.0
Standard Error 63.55260675759623 ``` 400.0 20.0 5.0 20.0 400.0 5.0 400.0
Observations 20.0 320.0 39.0 4.0 39.0 320.0 4.0 320.0
72.0 60.0 8.0 60.0 72.0 8.0 72.0
ANOVA 272.0 20.0 5.0 20.0 272.0 5.0 272.0
df SS MS F Significance F 94.0 58.0 7.0 58.0 94.0 7.0 94.0
Regression 1.0 140214.94113765802 140214.94113765802 34.715830263412286 1.4070071114966298E-5 190.0 40.0 8.0 40.0 190.0 8.0 190.0
Residual 18.0 72700.80886234199 4038.9338256856663 235.0 27.0 9.0 27.0 235.0 9.0 235.0
Total 19.0 212915.75 139.0 30.0 7.0 30.0 139.0 7.0 139.0
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept 388.8019516599528 34.240843053207534 11.35491760689962 1.2218404739566874E-9 316.864609928764 460.7392933911416 316.864609928764 460.7392933911416
X Variable 1 -4.934192248923462 0.8374372765317352 -5.89201410923398 1.4070071114966295E-5 -6.693582677677995 -3.1748018201689288 -6.693582677677995 -3.1748018201689288
RESIDUAL OUTPUT
Observation Predicted Y Residuals Standard Residuals
1.0 216.10522294763163 33.89477705236837 0.5479487754296747
2.0 245.71037644117243 114.28962355882757 1.8476256438158871
3.0 211.1710306987082 -46.17103069870819 -0.7464087960394269
4.0 92.75041672454512 -49.75041672454512 -0.804273764043705
5.0 68.07945547992779 23.920544520072212 0.38670362271843667
6.0 240.77618419224896 -40.77618419224896 -0.659194782733538
7.0 339.4600291707182 15.539970829281799 0.2512218319938128
8.0 354.2626059174886 -64.2626059174886 -1.0388803019415993
9.0 285.18391443256013 -55.18391443256013 -0.892112619298802
10.0 117.42137796916239 2.5786220308376073 0.04168644572909851
11.0 122.35557021808586 -49.35557021808586 -0.7978906077431704
12.0 151.96072371162666 53.039276288373344 0.8574420314666036
13.0 290.1181066814836 109.8818933185164 1.7763695212613329
14.0 196.3684539519378 123.6315460480622 1.9986487639923738
15.0 92.75041672454512 -20.75041672454512 -0.3354547934126464
16.0 290.1181066814836 -18.118106681483596 -0.29290041807575323
17.0 102.61880122239205 -8.618801222392051 -0.13933301783295002
18.0 191.43426170301433 -1.4342617030143288 -0.02318652052492161
19.0 255.57876093901933 -20.578760939019332 -0.33267977656185105
20.0 240.77618419224896 -101.77618419224896 -1.645331238198868
6. Continued
Simple Linear Regression Two:
Y= January Heating Costs ($)
X = Age of Furnace
```
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.25710133474306734
R Square 0.06610109632666676
Adjusted R Square 0.01421782390037047
Standard Error 105.10359585237413
Observations 20.0
ANOVA
df SS MS F Significance F
Regression 1.0 14073.964500214497 14073.964500214497 1.2740348331067175 0.2738307062372321
Residual 18.0 198841.7854997855 11046.765861099195
Total 19.0 212915.75
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept 275.02917202917206 66.13742595445467 4.158449895201093 5.902807638291656E-4 136.0795961566486 413.97874790169556 136.0795961566486 413.97874790169556
X Variable 1 -10.988845988845991 9.735571162986636 -1.1287315150675639 0.2738307062372315 -31.442522019499688 9.464830041807705 -31.442522019499688 9.464830041807705
RESIDUAL OUTPUT
Observation Predicted Y Residuals Standard Residuals
1.0 242.0626340626341 7.937365937365911 0.07758886099341475
2.0 231.07378807378808 128.92621192621192 1.2602717342864573
3.0 198.10725010725014 -33.10725010725014 -0.3236279953218475
4.0 209.09609609609612 -166.09609609609612 -1.6236125451746024
5.0 220.0849420849421 -128.0849420849421 -1.2520482040514511
6.0 220.0849420849421 -20.0849420849421 -0.19633311501403666
7.0 209.09609609609612 145.90390390390388 1.426231045377928
8.0 165.14071214071214 124.85928785928786 1.220516983328854
9.0 176.12955812955812 53.87044187044188 0.5265910957012816
10.0 253.05148005148007 -133.05148005148007 -1.3005968065658153
11.0 143.16302016302018 -70.16302016302018 -0.6858533248012663
12.0 220.0849420849421 -15.0849420849421 -0.14745741644748972
13.0 220.0849420849421 179.9150579150579 1.758694827647842
14.0 231.07378807378808 88.92621192621192 0.8692661457540816
15.0 187.11840411840413 -115.11840411840413 -1.1252984838306117
16.0 220.0849420849421 51.9150579150579 0.5074769443442396
17.0 198.10725010725014 -104.10725010725014 -1.0176629149668144
18.0 187.11840411840413 2.881595881595871 0.028168002339896588
19.0 176.12955812955812 58.87044187044188 0.5754667942678285
20.0 198.10725010725014 -59.10725010725014 -0.5777816278678917
6. Continued
Multiple Linear Regression One:
Y= January Heating Costs ($)
X1 = Mean January Outside Temperature (Degrees F)
X2 = Age of Furnace
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.8808337071918113
R Square 0.7758680197252695
Adjusted R Square 0.7494995514576541
Standard Error 52.98236592277907
Observations 20.0
ANOVA
df SS MS F Significance F
Regression 2.0 165194.52132082055 82597.26066041028 29.424083790189602 3.014966842563039E-6
Residual 17.0 47721.22867917943 2807.131098775261
Total 19.0 212915.75
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept 490.2859264274035 44.40984373520935 11.040028182731282 3.5634214322584356E-9 396.5893463057155 583.9825065490916 396.5893463057155 583.9825065490916
X Variable 1 -5.149883965250408 0.7018867096508137 -7.3372011386459475 1.160619161592546E-6 -6.630735479145865 -3.6690324513549504 -6.630735479145865 -3.6690324513549504
X Variable 2 -14.718148491352501 4.933918374976396 -2.983054718942916 0.008350870308799675 -25.127806338635747 -4.308490644069256 -25.127806338635747 -4.308490644069256
RESIDUAL OUTPUT
Observation Predicted Y Residuals Standard Residuals
1.0 265.88554216958175 -15.885542169581754 -0.31697355726816207
2.0 282.0666974697317 77.93330253026829 1.555048979063293
3.0 201.86306423892137 -36.86306423892137 -0.7355503815280433
4.0 92.98399756426407 -49.98399756426407 -0.9973600740405245
5.0 81.95272622936452 10.047273770635485 0.2004791573323606
6.0 262.1986650131288 -62.19866501312879 -1.2410865109970173
7.0 350.47819582678443 4.521804173215571 0.09022621568426435
8.0 307.0552537571256 -17.055253757125627 -0.34031349989797866
9.0 249.67502673497245 -19.675026734972448 -0.3925873695058535
10.0 177.6060113559261 -57.606011355926114 -1.1494465939275393
11.0 35.57441040765153 37.42558959234847 0.7467747804453648
12.0 169.50075363862143 35.49924636137857 0.708337322036852
13.0 313.6975046656329 86.30249533436711 1.7220444008285014
14.0 230.56785781722763 89.43214218277237 1.7844920833779616
15.0 63.54770058155907 8.45229941844093 0.16865369687469584
16.0 313.6975046656329 -41.69750466563289 -0.8320148120836489
17.0 88.5656170034124 5.434382996587601 0.10843543717912267
18.0 166.54537988656722 23.45462011343278 0.46800381707865324
19.0 218.77572294347004 16.224277056529957 0.3237325335083559
20.0 232.76236803042383 -93.76236803042383 -1.8708956241606534
(a) Develop scatterplots between January heating costs and each explanatory variable. Fully evaluate.
(b) Fully evaluate and compare the estimated linear regressions presented above utilizing the diagnostic regression statistics
and statistical tests (α=.05 level) discussed in class.
(c) Which of the estimated regressions (if any) would you utilize to forecast January heating costs? Briefly justify your answer.
(d) Based on your decision related to the "Best" regression equation in part c, generate a forecast. Values of the explanatory variables
available for possible use in your forecast are provided below.
Mean January Outside Temperature (X1) 67.0
Age of Furnace (X2) 11.0
YOU NEED LOTS OF DETAIL AND ANALYSIS TO SCORE WELL ON THIS QUESTION
SEE MY ANSWERS TO EXAM ONE FOR THE LEVEL OF DETAIL RELATED TO REGRSSION EQUATION EVALUATION
You need to email me back your answers (Excel files, PDF, Word files) in Canvas that clearly present your detailed answers.
Since this is a take home exam, I expect a lot of detail in the answers and will set a high bar for grading (Higher Than Exam One)
Due Date: Friday May 11, 2018 at 11:55 p.m. Test answers submitted late will receive a grade of zero.
There are six questions located in the four workbooks.
Questions 2 & 3 are to be done by hand (except scatterplots). Show all work.
Make good use of the Lecture Notes, textbook, and answers to the Problem Sets & Exam One.
HAVE FUN WITH THIS EXAM

Temperature vs. Heating Costs

35 29 36 60 65 30 10 7 21 55 54 48 20 39 60 20 58 40 27 30 250 360 165 43 92 200 355 290 230 120 73 205 400 320 72 272 94 190 235 139

Temperature

Heating Costs

Age of Furnace vs. Heating Costs

3 4 7 6 5 5 6 10 9 2 12 5 5 4 8 5 7 8 9 7 250 360 165 43 92 200 355 290 230 120 73 205 400 320 72 272 94 190 235 139

Age of furnace

Heating Costs

2

R