Statistic 7
Recall our Car Price data:

Observation   Car Price   Year   Years Old
1             $20,000     2015   4
2             $25,000     2016   3
3             $30,000     2018   1
4             $31,000     2018   1
5             $22,500     2016   3
6             $25,000     2016   3
7             $29,500     2018   1
8             $24,000     2015   4
9             $24,500     2017   2
10            $25,000     2017   2
With the regression output below, we next want to test the hypothesis and see whether the results are significant. The hypothesis scenario looks like this:

H0: ρ = 0
Ha: ρ ≠ 0

If we look at the p-value, or Significance F, in the output, we see the p-value = .000673. Since .000673 < .05, yes, this is significant. This means Years Old is a significant predictor of the Price of a Car.
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.884606501
R Square             0.782528661
Adjusted R Square    0.755344744
Standard Error       1725.490814
Observations         10

ANOVA
             df   SS            MS            F          Significance F
Regression   1    85706451.61   85706451.61   28.78646   0.000673381
Residual     8    23818548.39   2977318.548
Total        9    109525000

            Coefficients   Standard Error   t Stat         P-value    Lower 95%     Upper 95%      Lower 95.0%   Upper 95.0%
Intercept   31959.67742    1296.435244      24.65196589    7.83E-09   28970.09239   34949.26245    28970.09239   34949.26245
Years Old   -2629.032258   490.0064638      -5.365301179   0.000673   -3758.98919   -1499.075326   -3758.98919   -1499.075326
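For readers who want to see where the main numbers in the regression output come from, here is a minimal Python sketch that recomputes them from the ten observations using the usual least-squares formulas. It is only an illustration and not the Excel procedure used in this course; the variable names (years_old, price, sxx, and so on) are made up for this example.

```python
import math

# Car Price data from the table above (x = Years Old, y = Car Price)
years_old = [4, 3, 1, 1, 3, 3, 1, 4, 2, 2]
price = [20000, 25000, 30000, 31000, 22500, 25000, 29500, 24000, 24500, 25000]

n = len(price)
mean_x = sum(years_old) / n
mean_y = sum(price) / n

# Sums of squares and cross-products around the means
sxx = sum((x - mean_x) ** 2 for x in years_old)
syy = sum((y - mean_y) ** 2 for y in price)  # Total SS in the ANOVA table
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years_old, price))

# Least-squares line: Price = intercept + slope * Years Old
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Correlation and R Square
r = sxy / math.sqrt(sxx * syy)
r_squared = r ** 2

# ANOVA pieces: Regression SS, Residual SS, and F (1 and n - 2 degrees of freedom)
ss_regression = slope * sxy
ss_residual = syy - ss_regression
f_stat = ss_regression / (ss_residual / (n - 2))

print("slope:", round(slope, 4))
print("intercept:", round(intercept, 4))
print("r:", round(r, 4))
print("R Square:", round(r_squared, 4))
print("F:", round(f_stat, 4))
```

The printed slope, intercept, r, R Square, and F should line up with the Years Old coefficient, Intercept, Multiple R (with a negative sign, since the slope is negative), R Square, and F in the output above.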
We can also compare r, the correlation, to a critical value. If r < the negative critical value or r > the positive critical value, then r is significant. If r is significant, the line may be used for prediction. We know r = -.8846. There is a correlation critical value table in RealizeIT under Testing the Hypothesis in Week 7, but here is a link to a more detailed table: Correlation CV Table.

We know alpha = .05, that this is a two-tailed test (from the hypothesis scenario), and that n = 10. The critical value that corresponds to this in the table is r CV = 0.632. Our correlation is negative, so we use the negative of this value. Since -.8846 < -0.632, r is significant and you can use the line for prediction. This is the same conclusion we got with the p-value above.

Lastly, we can run a t-test to see if the data is significant. From the regression output, the t Stat for the slope is -5.3653. If we didn't have the regression output, we could calculate this value using this equation:
$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}$$
Plugging in our correlation and sample size we get:
$$t = \frac{-.8846\sqrt{10-2}}{\sqrt{1-(-.8846)^{2}}} = \frac{-2.50202}{.4663505} = -5.3651$$
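As a quick check on the arithmetic above, the same formula can be sketched in a few lines of Python. This is only an illustration; the function name t_from_r is made up for this example and is not part of Excel or the course materials.

```python
import math

def t_from_r(r, n):
    """t statistic for testing rho = 0, computed from the correlation r and sample size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Rounded correlation, as used in the hand calculation
t_rounded = t_from_r(-0.8846, 10)

# Full-precision correlation (Multiple R from the output, with the negative sign of the slope)
t_full = t_from_r(-0.884606501, 10)

print(round(t_rounded, 4))
print(round(t_full, 4))
```

Using the rounded r gives roughly the hand-calculated value, while the full-precision r gives roughly the t Stat reported for Years Old in the regression output.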
The t-test statistic we calculated by hand is very close to the t Stat in the output. It is a little off because I rounded some of my values. Then we can use the =T.DIST.2T function to find the p-value. This Excel function should look familiar.

=T.DIST.2T(ABS(-5.3651),8)

Remember, if you have a negative value you will need to use the ABS function to take its absolute value. The second argument, 8, is the degrees of freedom (n - 2). The p-value = 0.000673544 < .05, so yes, this is significant. This is the same conclusion we got above, and it matches the p-value from the regression output (0.000673) when rounded to six decimal places.

It does not matter which way you test the hypothesis in a simple linear regression example; if done correctly, you will get the same conclusion every time.
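To illustrate that the approaches agree, here is a rough Python sketch that applies the p-value rule and the critical value rule to the numbers above. Python's standard library has no equivalent of Excel's T.DIST.2T, so the sketch swaps in a numerical integration (Simpson's rule) of the Student's t density; that substitution, and the helper names t_pdf and two_tailed_p, are assumptions made for this example only.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=20000):
    """Two-tailed p-value: 1 minus the area between -|t| and |t|.

    The area from 0 to |t| is approximated with Simpson's rule
    (steps must be even) and doubled by symmetry.
    """
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    return 1 - 2 * area_zero_to_t

alpha = 0.05
n = 10
r = -0.8846        # correlation from the regression output (rounded)
r_cv = 0.632       # critical value read from the correlation table
t_stat = -5.3651   # t statistic calculated by hand

p_value = two_tailed_p(t_stat, n - 2)

significant_by_p = p_value < alpha   # p-value rule
significant_by_r = abs(r) > r_cv     # same as r < -CV or r > +CV

print("p-value:", p_value)
print("significant by p-value:", significant_by_p)
print("significant by critical value:", significant_by_r)
```

Both rules should report True here, and the p-value should come out close to the one Excel returns for =T.DIST.2T(ABS(-5.3651),8).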