Regression Analysis
Cedric Alikusumah
ECON 217
March 15, 2017
Regression Analysis Assignment
X1 | X2 | X3 | X4 | X5 | X6
8 | 78 | 284 | 9.1 | 109 | 1
9.3 | 68 | 433 | 8.7 | 144 | 1
7.5 | 70 | 739 | 7.2 | 113 | 0
8.9 | 96 | 1792 | 8.9 | 97 | 1
10.2 | 74 | 477 | 8.3 | 206 | 0
8.3 | 111 | 362 | 10.9 | 124 | 1
8.8 | 77 | 671 | 10 | 152 | 0
8.8 | 168 | 636 | 9.1 | 162 | 1
10.7 | 82 | 329 | 8.7 | 150 | 0
11.7 | 89 | 634 | 7.6 | 134 | 1
8.5 | 149 | 631 | 10.8 | 292 | 0
8.3 | 60 | 257 | 9.5 | 108 | 0
8.2 | 96 | 284 | 8.8 | 111 | 1
7.9 | 83 | 603 | 9.5 | 182 | 0
10.3 | 130 | 686 | 8.7 | 129 | 0
7.4 | 145 | 345 | 11.2 | 158 | 0
9.6 | 112 | 1357 | 9.7 | 186 | 1
9.3 | 131 | 544 | 9.6 | 177 | 0
10.6 | 80 | 205 | 9.1 | 127 | 0
9.7 | 130 | 1264 | 9.2 | 179 | 1
11.6 | 140 | 688 | 8.3 | 80 | 0
8.1 | 154 | 354 | 8.4 | 103 | 0
9.8 | 118 | 1632 | 9.4 | 101 | 1
7.4 | 94 | 348 | 9.8 | 117 | 0
9.4 | 119 | 370 | 10.4 | 88 | 1
11.2 | 153 | 648 | 9.9 | 78 | 0
9.1 | 116 | 366 | 9.2 | 102 | 1
10.5 | 97 | 540 | 10.3 | 95 | 0
11.9 | 176 | 680 | 8.9 | 80 | 0
8.4 | 75 | 345 | 9.6 | 92 | 1
5 | 134 | 525 | 10.3 | 126 | 0
9.8 | 161 | 870 | 10.4 | 108 | 0
9.8 | 111 | 669 | 9.7 | 77 | 1
10.8 | 114 | 452 | 9.6 | 60 | 0
10.1 | 142 | 430 | 10.7 | 71 | 0
10.9 | 238 | 822 | 10.3 | 86 | 1
9.2 | 78 | 190 | 10.7 | 93 | 0
8.3 | 196 | 867 | 9.6 | 106 | 0
7.3 | 125 | 969 | 10.5 | 162 | 1
9.4 | 82 | 499 | 7.7 | 95 | 1
9.4 | 125 | 925 | 10.2 | 91 | 1
9.8 | 129 | 353 | 9.9 | 52 | 1
3.6 | 84 | 288 | 8.4 | 110 | 1
8.4 | 183 | 718 | 10.4 | 69 | 0
10.8 | 119 | 540 | 9.2 | 57 | 0
10.1 | 180 | 668 | 13 | 106 | 0
9 | 82 | 347 | 8.8 | 40 | 0
10 | 71 | 345 | 9.2 | 50 | 0
11.3 | 118 | 463 | 7.8 | 35 | 0
11.3 | 121 | 728 | 8.2 | 86 | 1
12.8 | 68 | 383 | 7.4 | 57 | 1
10 | 112 | 316 | 10.4 | 57 | 1
6.7 | 109 | 388 | 8.9 | 94 | 0
The data (X1, X2, X3, X4, X5) are by city.
X1 = death rate per 1000 residents
X2 = doctor availability per 100,000 residents
X3 = hospital availability per 100,000 residents
X4 = annual per capita income in thousands of dollars
X5 = population density in people per square mile
X6 = gender, where 0 = male and 1 = female
Reference: Life In America's Small Cities, by G.S. Thomas
· Briefly, we look at the relationship between X1 (death rate per 1000 residents) and X2 (doctor availability per 100,000 residents), X3 (hospital availability per 100,000 residents), X4 (annual per capita income in thousands of dollars), X5 (population density in people per square mile) and X6 (gender).
· Death rate per 1000 residents (X1) is chosen as the response variable because the other variables (X2, X3, X4, X5 and X6) are used to explain it.
· The general model is: X1 = a + bX2 + cX3 + dX4 + eX5 + fX6, where a, b, c, d, e and f are constants to be estimated.
· I expect the death rate per 1000 residents to increase as doctor availability and hospital availability per 100,000 residents decrease, as population density per square mile decreases, and as annual per capita income in thousands of dollars decreases.
· The estimated model is: X1 = 12.4472698 + 0.006859042*X2 + 0.000695302*X3 - 0.336405998*X4 - 0.009687813*X5 - 0.23263965*X6.
· The coefficient of determination (R square) is 0.148132225. It measures the proportion of the total variation in X1 that is explained by the model: 14.81% of the variation in the death rate can be explained by the linear relationship between X1 and the independent variables. If X6 (the dummy variable) is not considered, the coefficient of determination remains 0.148132225.
· The adjusted R square is 0.057507994. It corrects R square for the number of independent variables in the model.
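· The F-test checks whether the independent variables are jointly significant. With R square = 0.148132225, 53 observations and 5 independent variables, F = (0.148132225 / 5) / ((1 - 0.148132225) / 47), which is about 1.63. This is below the 5% critical value of roughly 2.4 for (5, 47) degrees of freedom, so the model as a whole is not statistically significant at the 5% level.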
· The intercept is 12.4472698. It is the predicted death rate per 1000 residents when all the independent variables are zero, that is, the point where the fitted line crosses the vertical axis.
· The X2 coefficient is 0.006859042, which means that for each one-unit increase in X2 (doctors per 100,000 residents) the death rate per 1000 residents increases by 0.006859042, holding the other variables constant. The X3 coefficient is also positive: for each one-unit increase in X3 (hospitals per 100,000 residents) the death rate per 1000 residents increases by 0.000695302.
· X4, X5 and X6 have negative coefficients, which means that for each one-unit increase in X4 or X5 the death rate per 1000 residents decreases by the size of the corresponding coefficient, holding the other variables constant. For the dummy variable X6, the predicted death rate is 0.23263965 lower when X6 = 1 than when X6 = 0.
· The p-value is used to check whether an individual variable is significant: if its p-value is greater than 0.05, the variable is not significant at the 5% level. Hence X5 and X6 are not significant, as their individual p-values are greater than 0.05.
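
As a rough cross-check of the figures above, the sketch below recomputes the adjusted R square and the overall F statistic from the reported R square, and evaluates the estimated model for the first city in the data. It assumes 53 observations (the number of rows in the data table) and 5 independent variables. The X3 coefficient is taken as 0.000695302, the value given in the coefficient interpretation. The function and constant names are illustrative only and are not part of the original assignment.

```python
# Reported R square from the write-up.
R_SQUARED = 0.148132225
# Assumptions: 53 rows in the data table, 5 independent variables (X2..X6).
N_OBS = 53
N_PREDICTORS = 5

# Estimated coefficients as reported (X3 value taken from the interpretation bullet).
INTERCEPT = 12.4472698
COEF = {
    "X2": 0.006859042,
    "X3": 0.000695302,
    "X4": -0.336405998,
    "X5": -0.009687813,
    "X6": -0.23263965,
}


def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def overall_f_statistic(r2, n, k):
    """F = (R^2 / k) / ((1 - R^2) / (n - k - 1)), with (k, n - k - 1) df."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


def predict_death_rate(x2, x3, x4, x5, x6):
    """Predicted X1 from the estimated linear model."""
    return (
        INTERCEPT
        + COEF["X2"] * x2
        + COEF["X3"] * x3
        + COEF["X4"] * x4
        + COEF["X5"] * x5
        + COEF["X6"] * x6
    )


adj_r2 = adjusted_r_squared(R_SQUARED, N_OBS, N_PREDICTORS)
f_stat = overall_f_statistic(R_SQUARED, N_OBS, N_PREDICTORS)
# First row of the data table: X2=78, X3=284, X4=9.1, X5=109, X6=1 (observed X1 = 8).
first_city = predict_death_rate(78, 284, 9.1, 109, 1)

print(f"adjusted R square: {adj_r2:.6f}")  # about 0.0575, matching the reported value
print(f"overall F statistic: {f_stat:.4f}")
print(f"predicted death rate, first city: {first_city:.2f}")
```

The F value can then be compared with a tabulated critical value for (5, 47) degrees of freedom, and the prediction with the observed death rate for that city.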