regression

Week 7 Regression: Testing the Hypothesis

Recall our Car Price data:

Observation   Car Price   Year   Years Old
1             $20,000     2015   4
2             $25,000     2016   3
3             $30,000     2018   1
4             $31,000     2018   1
5             $22,500     2016   3
6             $25,000     2016   3
7             $29,500     2018   1
8             $24,000     2015   4
9             $24,500     2017   2
10            $25,000     2017   2

With the regression output below, we next want to test the hypothesis and see whether the results are significant. The hypothesis scenario looks like:

H0: ρ = 0
Ha: ρ ≠ 0

If we look at the p-value (the Significance F in the output), we see the p-value = .000673. Since .000673 < .05, yes, this is significant. This means Years Old is a significant predictor of the Price of a Car.

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.884606501
R Square             0.782528661
Adjusted R Square    0.755344744
Standard Error       1725.490814
Observations         10

ANOVA
             df   SS            MS            F          Significance F
Regression   1    85706451.61   85706451.61   28.78646   0.000673381
Residual     8    23818548.39   2977318.548
Total        9    109525000

             Coefficients   Standard Error   t Stat         P-value    Lower 95%     Upper 95%      Lower 95.0%   Upper 95.0%
Intercept    31959.67742    1296.435244      24.65196589    7.83E-09   28970.09239   34949.26245    28970.09239   34949.26245
Years Old    -2629.032258   490.0064638      -5.365301179   0.000673   -3758.98919   -1499.075326   -3758.98919   -1499.075326
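For readers who want to check the output outside Excel, here is a minimal sketch that recomputes the slope, intercept, and correlation from the ten observations using the standard least-squares formulas. The function and variable names are illustrative assumptions, not part of the course materials, and this is not necessarily how Excel computes its output internally.

```python
from math import sqrt

# Years Old (x) and Car Price (y) for the ten observations in the data table.
years_old = [4, 3, 1, 1, 3, 3, 1, 4, 2, 2]
price = [20000, 25000, 30000, 31000, 22500, 25000, 29500, 24000, 24500, 25000]


def simple_linear_regression(x, y):
    """Return (slope, intercept, r) for the line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation; same sign as the slope
    return slope, intercept, r


slope, intercept, r = simple_linear_regression(years_old, price)
print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
print(f"r = {r:.6f}, R Square = {r ** 2:.6f}")
```

The printed values should agree with the Coefficients column and with Multiple R (in absolute value) and R Square in the output. Note that Excel reports Multiple R as a positive number, while r itself is negative here because the slope is negative.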

We can also compare r, the correlation, to a critical value. If r < the negative critical value or r > the positive critical value, then r is significant. If r is significant, the line may be used for prediction.

We know r = -.8846. There is a correlation critical value table in RealizeIT under Testing the Hypothesis in Week 7, but here is a link to a more detailed table: Correlation CV Table. We know alpha = .05, this is a two-tailed test from the hypothesis scenario, and n = 10. The critical value that corresponds to this in the table is r CV = 0.632. Our correlation is negative, so we use the negative of this value. Since -.8846 < -0.632, r is significant and you can use the line for prediction. This is the same conclusion we got with the p-value above.

Lastly, we can run a t-test to see if the data is significant. From the regression output, the t Stat for the slope is -5.3653. If we didn't have the regression output, we could calculate this value using this equation:

\[ t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}} \]

Plugging in our correlation and sample size we get:

\[ t = \frac{-.8846\sqrt{10-2}}{\sqrt{1-(-.8846)^{2}}} = \frac{-2.50202}{.4663505} = -5.3651 \]
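The same hand calculation can be sketched in a few lines of Python. The function name `t_from_correlation` is a hypothetical helper introduced only for this illustration.

```python
from math import sqrt


def t_from_correlation(r, n):
    """t statistic for testing rho = 0, computed from r and the sample size n."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


# Using the rounded correlation from the hand calculation.
t_rounded = t_from_correlation(-0.8846, 10)

# Using the unrounded Multiple R from the output (negative, since the slope is negative).
t_full = t_from_correlation(-0.884606501, 10)

print(f"t (rounded r)   = {t_rounded:.4f}")
print(f"t (unrounded r) = {t_full:.4f}")
```

Running it with the rounded and the unrounded correlation shows how much of the small difference from the output's t Stat comes from rounding r.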

The t test statistic we calculated by hand is very close to the t Stat in the output. It is a little off because I rounded some of my values. Then we can use the =T.DIST.2T function to find the p-value. This Excel function should look familiar.

=T.DIST.2T(ABS(-5.3651),8)

Remember, if you have a negative value you will need to use the ABS function to take its absolute value. The second argument, 8, is the degrees of freedom, n - 2 = 10 - 2.

p-value = 0.000673544 < .05, so yes, this is significant. This is the same conclusion we got above, and after rounding it is the same p-value as in the regression output. It does not matter which way you choose to test the hypothesis of a simple linear regression example; if done correctly, you will get the same conclusion every time.
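As a cross-check outside Excel, the sketch below approximates the two-tailed p-value by numerically integrating the t density with Simpson's rule, then places it next to the critical value comparison. This is an assumption-laden illustration, not how T.DIST.2T is implemented. The helper names are hypothetical, and the critical value 0.632 is the one read from the table in the text.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat, df, steps=10_000):
    """Approximate P(|T| > |t_stat|) with Simpson's rule; steps must be even."""
    upper = abs(t_stat)  # same role as ABS() in the Excel formula
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # area under the density between 0 and |t_stat|
    return 1 - 2 * central_area   # what remains in the two tails


alpha = 0.05
p_value = two_tailed_p(-5.3651, 8)

# Critical value approach, using r and r CV from the text.
r, r_cv = -0.8846, 0.632
significant_by_r = r < -r_cv or r > r_cv
significant_by_p = p_value < alpha

print(f"p-value = {p_value:.9f}")
print(f"significant by p-value: {significant_by_p}, by critical value: {significant_by_r}")
```

Both checks should reach the same decision, which is the point made above about the different ways of testing the hypothesis.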